Normal forms
Why do we need to normalize?
• To avoid redundancy (less storage space is needed, and the data stays
consistent)
• To avoid insert, update, and delete anomalies
First normal form
• No multivalued attributes: each attribute must hold a single, atomic
value.
• Table-1 is not in 1st normal form, since course holds "C,Java" for
sid 1.
• Table-2 is in 1st normal form, since course holds only one value in
each row. Its primary key is (sid, course).

Table-1
sid  sname  course
1    ravi   C,Java
2    rani   Python
3    raju   DBMS

Table-2
sid  sname  course
1    ravi   C
1    ravi   Java
2    rani   Python
3    raju   DBMS
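The step from Table-1 to Table-2 can be sketched in a few lines of Python. This is only an illustration: the helper name `to_first_normal_form` and the list-of-dicts representation are assumptions, not part of the slides.

```python
# Table-1 from the slide: the course column holds a comma-separated list.
table_1 = [
    {"sid": 1, "sname": "ravi", "course": "C,Java"},
    {"sid": 2, "sname": "rani", "course": "Python"},
    {"sid": 3, "sname": "raju", "course": "DBMS"},
]


def to_first_normal_form(rows, column):
    """Emit one row per value found in the multivalued column (hypothetical helper)."""
    result = []
    for row in rows:
        for value in row[column].split(","):
            new_row = dict(row)          # copy the other attributes unchanged
            new_row[column] = value.strip()
            result.append(new_row)
    return result


table_2 = to_first_normal_form(table_1, "course")
for row in table_2:
    print(row["sid"], row["sname"], row["course"])
```

After the split, (sid, course) identifies each row, which matches the primary key named for Table-2.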
Second Normal form
• The table or relation should be in 1st normal form.
• Every non-prime attribute should be fully functionally dependent on
every candidate key. No non-prime attribute may depend on only part of
a candidate key (a partial dependency).
• Candidate key: a minimal set of attributes that uniquely identifies
each tuple in the table. No two tuples may share the same candidate key
value.
• Non-prime attributes: attributes of the relation that do not appear
in any candidate key of the relation. They are also called non-key
attributes.
Example for 2nd Normal Form
Customer ID  Store ID  Location
1            1         Delhi
1            3         Mumbai
2            1         Delhi
3            2         Bangalore
4            3         Mumbai

Candidate key: (Customer ID, Store ID)
Non-prime attribute: Location
Location depends only on Store ID, which is just part of the candidate
key. This is a partial dependency, so the table is not in 2nd normal
form.

After decomposition, both tables are in 2nd normal form:
Customer ID  Store ID
1            1
1            3
2            1
3            2
4            3

Store ID  Location
1         Delhi
2         Bangalore
3         Mumbai
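The partial dependency and the decomposition can be checked against the sample rows with a small Python sketch. The helper names `fd_holds` and `project` are assumptions introduced here. Note that testing a dependency on sample data only shows that the data does not contradict it; the dependency itself is a design decision.

```python
rows = [
    {"customer_id": 1, "store_id": 1, "location": "Delhi"},
    {"customer_id": 1, "store_id": 3, "location": "Mumbai"},
    {"customer_id": 2, "store_id": 1, "location": "Delhi"},
    {"customer_id": 3, "store_id": 2, "location": "Bangalore"},
    {"customer_id": 4, "store_id": 3, "location": "Mumbai"},
]


def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def project(rows, attrs):
    """Project onto attrs, dropping duplicates and keeping first-seen order."""
    unique = dict.fromkeys(tuple(row[a] for a in attrs) for row in rows)
    return [dict(zip(attrs, values)) for values in unique]


# Location is determined by store_id alone: a partial dependency on the key.
partial = fd_holds(rows, ["store_id"], ["location"])

# Decompose along that dependency.
customer_store = project(rows, ["customer_id", "store_id"])
store_location = project(rows, ["store_id", "location"])
print(partial, len(customer_store), len(store_location))
```

The projection onto (store_id, location) keeps one row per store, which is where the redundancy of the original table disappears.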
3rd Normal Form
• Definition of 3NF: to fulfill 3NF, a table should fulfill 2NF and, in
addition, no non-key attribute should be functionally dependent on any
other non-key attribute. All non-prime attributes must depend directly
on a candidate key.
The third normal form says that a table
• should be in 2NF,
• should not include any transitive dependencies of non-key attributes
on the key.
Thus, in a dependency diagram, no arrows are allowed between attributes
outside the primary key, only from the primary key to attributes
outside it. If there is a composite primary key, arrows that point to
one of the attributes in the key are still allowed.
• A -> B and B -> C together give A -> C. This transitive dependency
violates 3NF when B is non-prime.
Emp_ID  Emp_name  Emp_ZIP  Emp_state  Emp_city  Emp_Distr
1001    John      282005   UP         Agra      Dayal Bagh
1002    Ajeet     222008   TN         Chennai   M-City
1006    Lora      282007   TN         Chennai   Urrapakkam
1201    Steve     222999   MP         Gwalior   Ratan
• Super keys: {emp_id}, {emp_id, emp_name}, {emp_id, emp_name,
emp_zip}, and so on
• Candidate keys: {emp_id}
• Non-prime attributes: all attributes except emp_id are non-prime, as
they are not part of any candidate key.
• Here, emp_state, emp_city and emp_district depend on emp_zip, and
emp_zip depends on emp_id. This makes the non-prime attributes
(emp_state, emp_city and emp_district) transitively dependent on the
candidate key (emp_id), which violates the rule of 3NF.
To make this table comply with 3NF, we have to break it into two
tables to remove the transitive dependency:
Emp_ID  Emp_name  Emp_ZIP
1001    John      282005
1002    Ajeet     222008
1006    Lora      282007
1201    Steve     222999

Emp_ZIP  Emp_state  Emp_city  Emp_Distr
282005   UP         Agra      Dayal Bagh
222008   TN         Chennai   M-City
282007   TN         Chennai   Urrapakkam
222999   MP         Gwalior   Ratan
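A short Python sketch of this split, with a rejoin on emp_zip to confirm that no information is lost. The variable names are assumptions, and rows are stored as plain tuples purely for illustration.

```python
# Rows in column order:
# emp_id, emp_name, emp_zip, emp_state, emp_city, emp_distr
employees = [
    (1001, "John", 282005, "UP", "Agra", "Dayal Bagh"),
    (1002, "Ajeet", 222008, "TN", "Chennai", "M-City"),
    (1006, "Lora", 282007, "TN", "Chennai", "Urrapakkam"),
    (1201, "Steve", 222999, "MP", "Gwalior", "Ratan"),
]

# Table 1: attributes that depend directly on emp_id.
employee = [(emp_id, name, zip_code) for emp_id, name, zip_code, *_ in employees]

# Table 2: attributes that depend on emp_zip. A dict keyed by emp_zip works
# only because emp_zip determines state, city, and district; otherwise a
# later row would silently overwrite an earlier one.
zip_lookup = {
    zip_code: (state, city, distr)
    for _, _, zip_code, state, city, distr in employees
}

# Natural join on emp_zip rebuilds the original rows.
rejoined = [
    (emp_id, name, zip_code, *zip_lookup[zip_code])
    for emp_id, name, zip_code in employee
]
print(rejoined == employees)
```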
Boyce Codd normal form (BCNF)
• It is an advanced version of 3NF, which is why it is also referred to
as 3.5NF. BCNF is stricter than 3NF.
• A table complies with BCNF if it is in 3NF and, for every non-trivial
functional dependency X->Y, X is a super key of the table.
• Super key: a single attribute or a set of attributes in a table that
uniquely identifies a tuple.
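The BCNF condition can be tested mechanically with the standard attribute-closure computation. The sketch below is a minimal illustration, not a full normalization tool: the function names are assumptions, attributes are single letters, and the schema R = (A, B, C) with F = {A->B, B->C} is borrowed from the dependency preservation example later in these slides.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(relation, fds):
    """List the non-trivial dependencies whose left side is not a super key."""
    violations = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        is_super_key = closure(lhs, fds) >= set(relation)
        if not trivial and not is_super_key:
            violations.append((lhs, rhs))
    return violations


R = "ABC"
F = [("A", "B"), ("B", "C")]
print(bcnf_violations(R, F))  # [('B', 'C')]
```

Here A is a super key because its closure is {A, B, C}, while the closure of B is only {B, C}, so B->C is the dependency that breaks BCNF.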
Example
SID  Subject  Professor ID  Professor
1    DBMS     149           Ravi
2    Java     169           Ramya
3    OS       159           Laxmi
4    Python   149           Ravi
5    C        129           Ramya
• Primary key: (SID, Subject). SID and Subject are prime attributes.
• Subject is a prime attribute; Professor is a non-prime attribute.
4th Normal form
Properties of Relational Decomposition
• When a relation in the relational model is not in an appropriate
normal form, it has to be decomposed. In a database, breaking a table
down into multiple tables is termed decomposition. The properties of a
relational decomposition are listed below:
• Attribute Preservation
• Dependency Preservation
• Non Additive Join Property
• No redundancy
• Lossless Join
Attribute Preservation:
• Using functional dependencies, the algorithms decompose the
universal relation schema R into a set of relation schemas
D = {R1, R2, ..., Rn}, the relational database schema, where D is
called the decomposition of R. Each attribute in R must appear in at
least one relation schema Ri in the decomposition, i.e., no attribute
is lost. This is called the attribute preservation condition of
decomposition.
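Attribute preservation amounts to a set comparison, as the sketch below shows. The function name `preserves_attributes` and the single-letter attribute encoding are assumptions made for illustration.

```python
def preserves_attributes(relation, decomposition):
    """True if the schemas in the decomposition together cover exactly R's attributes."""
    covered = set().union(*(set(schema) for schema in decomposition))
    return covered == set(relation)


print(preserves_attributes("ABC", ["AB", "BC"]))  # True
print(preserves_attributes("ABC", ["AB"]))        # False: attribute C is lost
```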
Dependency Preservation:
• A decomposition D is dependency preserving if each functional
dependency X->Y specified in F either appears directly in one of the
relation schemas Ri in D or can be inferred from the dependencies that
appear in some Ri. If a decomposition is not dependency preserving,
some dependency is lost in the decomposition. To check that a lost
dependency still holds, we have to take the JOIN of two or more
relations in the decomposition.
Dependency Preservation Example
R = (A, B, C), F = {A->B, B->C}, Key = {A}. R is not in BCNF, because
B->C holds and B is not a super key.
Decomposition: R1 = (A, B), R2 = (B, C)
R1 and R2 are in BCNF, and the decomposition is lossless-join and
dependency preserving.
Each functional dependency specified in F appears directly in one of
the relations in the decomposition.
In general, it is not necessary that all dependencies from the relation
R appear in some relation Ri. It is sufficient that the union of the
dependencies on all the relations Ri be equivalent to the dependencies
on R.
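One common textbook way to test this without enumerating all projected dependencies is to grow the closure of each left-hand side using only what can be derived inside individual schemas Ri. The sketch below follows that idea; the function names are assumptions and attributes are single letters.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_preserved(lhs, rhs, decomposition, fds):
    """Check whether lhs -> rhs follows from dependencies local to each schema."""
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for schema in decomposition:
            attrs = set(schema)
            # What this schema alone lets us derive from the attributes so far.
            gained = closure(result & attrs, fds) & attrs
            if not gained <= result:
                result |= gained
                changed = True
    return set(rhs) <= result


def preserves_dependencies(decomposition, fds):
    return all(is_preserved(lhs, rhs, decomposition, fds) for lhs, rhs in fds)


F = [("A", "B"), ("B", "C")]
print(preserves_dependencies(["AB", "BC"], F))  # True
print(preserves_dependencies(["AB", "AC"], F))  # False: B->C is lost
```

The second call shows a decomposition that loses B->C, since B and C never appear together in one schema and the dependency cannot be derived from A->B and A->C.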
Non Additive Join Property
• Another property that a decomposition D should possess is the Non
Additive Join Property, which ensures that no spurious tuples are
generated when a NATURAL JOIN operation is applied to the relations
resulting from the decomposition.
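To see what a spurious tuple looks like, the sketch below borrows three columns (Emp_ID, Emp_state, Emp_Distr) from the employee table in the 3NF example and splits them on Emp_state, which is not a key of either part. The choice of this deliberately bad decomposition is an assumption made for illustration.

```python
# (emp_id, emp_state, emp_distr) taken from the employee example.
rows = {
    (1001, "UP", "Dayal Bagh"),
    (1002, "TN", "M-City"),
    (1006, "TN", "Urrapakkam"),
    (1201, "MP", "Ratan"),
}

# A bad decomposition: the only shared attribute is emp_state.
emp_state = {(emp_id, state) for emp_id, state, _ in rows}
state_distr = {(state, distr) for _, state, distr in rows}

# Natural join on emp_state.
joined = {
    (emp_id, state, distr)
    for emp_id, state in emp_state
    for other_state, distr in state_distr
    if state == other_state
}

spurious = joined - rows
print(sorted(spurious))
# [(1002, 'TN', 'Urrapakkam'), (1006, 'TN', 'M-City')]
```

The join returns every original row plus two rows that were never in the table, so this decomposition does not have the non additive join property.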
No redundancy
• Decomposition is used to eliminate some of the problems of bad
design, such as anomalies, inconsistencies, and redundancy. If a
relation is not decomposed properly, it may lead to problems like loss
of information.
Lossless Join:
• The lossless join property is a feature of decomposition supported
by normalization. It is the ability to ensure that any instance of the
original relation can be recovered from the corresponding instances in
the smaller relations.
Notation: R is a relation, F is a set of functional dependencies on R,
and X, Y are a decomposition of R.
A decomposition {R1, R2, ..., Rn} of a relation R is called a lossless
decomposition for R if the natural join of R1, R2, ..., Rn produces
exactly the relation R.
• A decomposition is lossless if we can recover the original relation:
R(A, B, C) -> Decompose -> R1(A, B), R2(A, C) -> Recover -> R'(A, B, C)
Thus, R' = R
• A decomposition of R into X and Y is lossless if at least one of the
following holds:
(X ∩ Y) -> X, that is: the attributes common to both X and Y
functionally determine all the attributes in X.
(X ∩ Y) -> Y, that is: the attributes common to both X and Y
functionally determine all the attributes in Y.
In other words, if X ∩ Y forms a super key of either X or Y, the
decomposition of R is a lossless decomposition.
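This test for a two-way decomposition translates directly into code using attribute closure. The sketch below is a minimal illustration with assumed function names and single-letter attributes; it reuses F = {A->B, B->C} from the dependency preservation example.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless(x, y, fds):
    """True if the common attributes of X and Y determine all of X or all of Y."""
    common = set(x) & set(y)
    reachable = closure(common, fds)
    return set(x) <= reachable or set(y) <= reachable


F = [("A", "B"), ("B", "C")]
print(is_lossless("AB", "BC", F))  # True: B determines all of (B, C)
print(is_lossless("AB", "AC", F))  # True: A determines everything
print(is_lossless("AC", "BC", F))  # False: C determines nothing else
```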
