DBMS (UNIT 2)

BY:SURBHI SAROHA

SYLLABUS
 Relational Database Design: Basic terminologies
 Integrity constraints
 Functional Dependency
 Different anomalies in designing a database
 Decomposition and its properties
 Normalization using functional dependencies(1NF,2NF,3NF,BCNF)
 Normalization using multi-valued dependencies(4NF,5NF).

Relational Database Design: Basic
terminologies
 Attribute: Attributes are the properties that define an entity.
e.g.; ROLL_NO, NAME, ADDRESS
 Relation Schema: A relation schema defines the structure of the relation and
represents the name of the relation with its attributes. e.g.; STUDENT
(ROLL_NO, NAME, ADDRESS, PHONE, and AGE) is the relation schema for
STUDENT. A collection of relation schemas is called a relational database schema.
 Tuple: Each row in the relation is known as a tuple.
 Relation Instance: The set of tuples of a relation at a particular instance of
time is called a relation instance.
 Degree: The number of attributes in the relation is known as the degree of the
relation.
 Cardinality: The number of tuples in a relation is known as cardinality.
 Column: The column represents the set of values for a particular attribute.
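The terms above can be illustrated with a small Python sketch. The rows are made up for illustration; only the attribute names and the NULL phone of ROLL_NO 4 come from these notes.

```python
# Hypothetical STUDENT relation: the schema is a tuple of attribute names,
# and the relation instance is a list of tuples (rows).
schema = ("ROLL_NO", "NAME", "ADDRESS", "PHONE", "AGE")
student = [
    (1, "RAM", "DELHI", "9455123451", 18),
    (2, "RAMESH", "GURGAON", "9652431543", 18),
    (3, "SUJIT", "ROHTAK", "9156253131", 20),
    (4, "SURESH", "DELHI", None, 18),  # PHONE is NULL (unknown)
]

degree = len(schema)        # number of attributes
cardinality = len(student)  # number of tuples

# A column is the list of values of one attribute across all tuples.
name_column = [row[schema.index("NAME")] for row in student]

print(degree, cardinality, name_column)
```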

Cont….
 NULL Values: The value which is not known or unavailable is called a NULL
value. It is a special marker, distinct from zero or an empty string. e.g.; PHONE of STUDENT having
ROLL_NO 4 is NULL.
 Relation Key: These are basically the keys that are used to identify the rows
uniquely or to establish relationships between tables. Keys are of the following types:
 Primary Key
 Candidate Key
 Super Key
 Foreign Key
 Alternate Key
 Composite Key

Integrity constraints
 Integrity constraints are a set of rules used to maintain the quality of
information.
 They ensure that data insertion, updating, and other operations are
performed in such a way that data integrity is not affected.
 Thus, integrity constraints guard against accidental damage to the
database.

Domain constraints
 A domain constraint defines the valid set of values for
an attribute.
 The data type of a domain can be string, character, integer, time, date,
currency, etc. The value of the attribute must belong to the
corresponding domain.
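A domain constraint can be sketched with Python's built-in sqlite3 module. The table, its columns, and the age range of 5 to 100 are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: age must be an integer between 5 and 100.
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " age INTEGER CHECK (typeof(age) = 'integer' AND age BETWEEN 5 AND 100))"
)

conn.execute("INSERT INTO student VALUES (1, 'Ram', 18)")  # value is in the domain

try:
    # 'A' is not in the domain of age, so the CHECK constraint rejects the row.
    conn.execute("INSERT INTO student VALUES (2, 'Shyam', 'A')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(rejected, row_count)
```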

Entity integrity constraints
 The entity integrity constraint states that a primary key value can't be null.
 This is because the primary key value is used to identify individual rows in
a relation, and if the primary key has a null value, then we can't identify those
rows.
 Attributes other than the primary key may contain null values.

Referential Integrity Constraints
 A referential integrity constraint is specified between two tables.
 In the Referential integrity constraints, if a foreign key in Table 1 refers to
the Primary Key of Table 2, then every value of the Foreign Key in Table 1
must be null or be available in Table 2.
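The rule above can be sketched with sqlite3. The department and employee tables are hypothetical stand-ins for Table 2 and Table 1, and SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Hypothetical tables: employee.dept_no refers to department.dept_no.
conn.execute("CREATE TABLE department (dept_no TEXT PRIMARY KEY NOT NULL)")
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " dept_no TEXT REFERENCES department(dept_no))"
)

conn.execute("INSERT INTO department VALUES ('D1')")
conn.execute("INSERT INTO employee VALUES (1, 'D1')")  # value exists in department
conn.execute("INSERT INTO employee VALUES (2, NULL)")  # a null foreign key is allowed

try:
    conn.execute("INSERT INTO employee VALUES (3, 'D9')")  # 'D9' is not in department
    fk_violation = False
except sqlite3.IntegrityError:
    fk_violation = True

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(fk_violation, employee_count)
```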


Key constraints
 A key is an attribute or set of attributes used to identify an entity within its entity set
uniquely.
 An entity set can have multiple keys, out of which one is chosen as the
primary key. A primary key value must be unique and cannot be null in the
relational table.
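A minimal sqlite3 sketch of the primary key rule follows. The student table and the helper try_insert are hypothetical. In SQLite a non-integer primary key column needs an explicit NOT NULL to reject nulls.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; NOT NULL is spelled out because SQLite otherwise
# tolerates NULL in a non-integer PRIMARY KEY column.
conn.execute("CREATE TABLE student (roll_no TEXT PRIMARY KEY NOT NULL, name TEXT)")
conn.execute("INSERT INTO student VALUES ('S1', 'Ram')")


def try_insert(roll_no, name):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?)", (roll_no, name))
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_accepted = try_insert("S1", "Shyam")  # same key value again
null_accepted = try_insert(None, "Mohan")       # null key value
print(duplicate_accepted, null_accepted)
```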

Functional Dependency
 A functional dependency (FD) is a relationship between two
sets of attributes in which one set determines the other.
 It typically exists between the primary key and a non-key attribute within a
table.
 X → Y
 The left side of an FD is known as the determinant; the right side
is known as the dependent.
 For example:
 Assume we have an employee table with attributes: Emp_Id, Emp_Name,
Emp_Address.

Cont…..
 Here the Emp_Id attribute uniquely identifies the Emp_Name attribute of the
employee table: if we know the Emp_Id, we can tell the employee
name associated with it.
 Functional dependency can be written as:
 Emp_Id → Emp_Name
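Whether an FD holds in a given table instance can be checked mechanically. The sketch below assumes rows are stored as dictionaries; the function name fd_holds and the sample rows are hypothetical.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds: equal lhs values always give equal rhs values."""
    seen = {}
    for row in rows:
        determinant = tuple(row[a] for a in lhs)
        dependent = tuple(row[a] for a in rhs)
        # setdefault stores the first dependent seen for this determinant.
        if seen.setdefault(determinant, dependent) != dependent:
            return False
    return True


# Hypothetical employee rows; two employees share the name "Asha".
employee = [
    {"Emp_Id": 1, "Emp_Name": "Asha", "Emp_Address": "Delhi"},
    {"Emp_Id": 2, "Emp_Name": "Ravi", "Emp_Address": "Pune"},
    {"Emp_Id": 3, "Emp_Name": "Asha", "Emp_Address": "Agra"},
]

print(fd_holds(employee, ["Emp_Id"], ["Emp_Name"]))  # Emp_Id -> Emp_Name
print(fd_holds(employee, ["Emp_Name"], ["Emp_Id"]))  # Emp_Name -> Emp_Id
```

A check like this can only refute an FD on the data at hand; whether an FD holds in general is a design decision about the relation.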

Different anomalies in designing a
database
 An anomaly is an inconsistency. In a
Database Management System (DBMS), an anomaly is an inconsistency
that arises in a relational table during insert, update, or delete operations on
that table.
 Anomalies have several causes. For
example, a lot of redundant data in the database can lead to
anomalies, as can a poorly designed table. Database anomalies damage the
integrity of the database.
 Another common cause is storing all the data in a
single table. Normalization removes such anomalies by
splitting the table into smaller tables that can be recombined with joins
when needed.

There can be three types of
anomaly in the database:
 1. Updation / Update Anomaly
 An update anomaly occurs when updating some rows in the table leaves
the table inconsistent.
 In the above table, if we want to update the address of Ramesh then we will
have to update all the rows where Ramesh is present.
 If during the update we miss any single row, then there will be two addresses
for Ramesh, which leaves the database inconsistent and incorrect.
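The update anomaly can be reproduced in a few lines of Python. The column names and values below are made up for illustration; only the worker name Ramesh comes from the example above.

```python
# Hypothetical worker table in which Ramesh appears in two rows.
workers = [
    {"worker": "Ramesh", "address": "Delhi", "dept": "D1"},
    {"worker": "Ramesh", "address": "Delhi", "dept": "D2"},
    {"worker": "Rajesh", "address": "Agra", "dept": "D3"},
]

# A careless update that changes only the first matching row.
for row in workers:
    if row["worker"] == "Ramesh":
        row["address"] = "Mumbai"
        break

# Ramesh now has two different addresses: the table is inconsistent.
ramesh_addresses = {row["address"] for row in workers if row["worker"] == "Ramesh"}
print(ramesh_addresses)
```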

Cont….
 2. Insertion Anomaly
 An insertion anomaly occurs when a new row cannot be inserted into
the table because some of its required data is missing. For example, if in the above
table we create a new row for a worker who is not yet allocated to any
department, then we cannot insert it in the table, which is an insertion
anomaly.
 3. Deletion Anomaly
 A deletion anomaly occurs when deleting some rows from the table also deletes other
required information from the database. For example, in the above table, if we
delete the department number ECT669 then the details of Rajesh will also be
deleted, since Rajesh's details are stored only in the row of ECT669.

Decomposition and its properties
 Decomposition is the process of breaking down a table
in a database into several smaller tables.
 Thus, decomposition replaces a given relation with a collection of
smaller relations.
 A table is decomposed to remove redundancy and the anomalies that come with it.
 Decomposition must always be lossless.
 This way, the data/information in the
original relation can be reconstructed accurately from the
decomposed relations. If the relation is not decomposed properly,
it may lead to problems such as information loss.

Types of Decomposition
 Decomposition is of two major types in DBMS:
 Lossless
 Lossy
 1. Lossless Decomposition
 A decomposition is said to be lossless when the
original relation R can be reconstructed exactly by joining the decomposed tables. It is the
preferred choice, because no information is lost from the relation
when we decompose it. A lossless join yields exactly the original
relation, with no missing or extra tuples.

Cont….
 2. Lossy Decomposition
 A decomposition is lossy when joining the decomposed relations
does not reproduce the original relation. The join typically produces extra (spurious)
tuples, so the original information cannot be recovered.
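The difference between the two types can be seen by projecting a relation and joining the pieces back. This is a minimal sketch; the relation r(A, B, C) and the helper names project, natural_join, and as_set are hypothetical.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples."""
    unique = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in unique]


def natural_join(left, right):
    """Join two non-empty relations on the attributes they share."""
    common = [a for a in left[0] if a in right[0]]
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[a] == r[a] for a in common)
    ]


def as_set(rows):
    """Order-independent form of a relation, for comparison."""
    return {tuple(sorted(row.items())) for row in rows}


# Hypothetical relation r(A, B, C); A is a key, C is not.
r = [
    {"A": 1, "B": "x", "C": "p"},
    {"A": 2, "B": "y", "C": "p"},
]

# Splitting on the key A: the join gives back exactly r (lossless).
lossless = natural_join(project(r, ["A", "B"]), project(r, ["A", "C"]))

# Splitting on the non-key C: the join adds spurious tuples (lossy).
lossy = natural_join(project(r, ["A", "C"]), project(r, ["B", "C"]))

print(as_set(lossless) == as_set(r), len(as_set(lossy)))
```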

Properties of Decomposition
 Decomposition must have the following properties:
 1. Decomposition Must be Lossless
 2. Dependency Preservation
 3. Lack of Data Redundancy
 1. Decomposition Must be Lossless
 Decomposition must always be lossless, which means the information must
never get lost from a decomposed relation. This guarantees that
joining the decomposed relations yields the same
relation that was originally decomposed.

Cont….
 2. Dependency Preservation
 Dependencies are crucial constraints on a database, and every dependency must be satisfied by at least one
decomposed table, or be derivable from the dependencies that hold in the decomposed tables.
 If {P → Q} holds, then Q is functionally dependent on P.
 Checking this dependency is easiest when both P and Q
are in the very same relation.
 A decomposition has this property only when it maintains the
functional dependencies of the original relation.
 In addition, this property allows us to check updates without having
to compute the natural join of the decomposed relations.
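A simple, partial test for dependency preservation is to check whether all attributes of each FD sit together in one decomposed relation. The sketch below is only a sufficient check: an FD that fails it may still be implied by the FDs that hold in the decomposed relations. The relation R(A, B, C) and the function name are hypothetical.

```python
def covered_directly(fd, decomposition):
    """True if some decomposed schema contains every attribute of the FD."""
    lhs, rhs = fd
    needed = set(lhs) | set(rhs)
    return any(needed <= set(schema) for schema in decomposition)


# Hypothetical relation R(A, B, C) with FDs A -> B and B -> C.
fds = [("A", "B"), ("B", "C")]

good = [("A", "B"), ("B", "C")]   # R1(A, B), R2(B, C)
risky = [("A", "B"), ("A", "C")]  # R1(A, B), R2(A, C)

good_result = [covered_directly(fd, good) for fd in fds]
risky_result = [covered_directly(fd, risky) for fd in fds]
print(good_result, risky_result)
```

In the second decomposition B and C never appear in the same table, so B → C cannot be checked without a join.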

Cont….
 3. Lack of Data Redundancy
 Data redundancy is also commonly termed repetition of data/information.
 According to this property, decomposition must not suffer from data
redundancy.
 When decomposition is careless, it may cause issues with the overall data in
the database.
 Proper normalization achieves the property of lack
of data redundancy.

Normalization using functional
dependencies(1NF,2NF,3NF,BCNF)
 If a table has data redundancy and is not properly normalized, it is
difficult to handle and update the database without facing data loss. It also
takes up extra storage space, and insertion, update, and
deletion anomalies are frequent if the database is not normalized.
 Normalization is the process of minimizing redundancy in a relation or set
of relations.
 Redundancy in a relation may cause insertion, deletion, and update anomalies,
so normalization helps to minimize it.
 Normal forms are used to eliminate or reduce redundancy in database tables.

There are various levels of normalization.
These are some of them:
 1. First Normal Form (1NF)
 2. Second Normal Form (2NF)
 3. Third Normal Form (3NF)
 4. Boyce-Codd Normal Form (BCNF)
 5. Fourth Normal Form (4NF)
 6. Fifth Normal Form (5NF)

First Normal Form (1NF):
 If a relation contains a composite or multi-valued attribute, it violates the
first normal form. A relation is in first normal form if
every attribute in that relation is a single-valued attribute.
 A table is in 1NF if:
 There are only single-valued attributes.
 The attribute domain does not change.
 There is a unique name for every attribute/column.
 The order in which data is stored does not matter.

Example-1:
 Relation STUDENT in table 1 is not in 1NF because of multi-valued attribute
STUD_PHONE. Its decomposition into 1NF has been shown in table 2.
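The conversion to 1NF can be sketched in Python by giving every phone number its own row. The rows and the attributes STUD_NO and STUD_NAME are assumptions; only STUD_PHONE is named in the example above.

```python
# Hypothetical STUDENT rows; STUD_PHONE holds a list, so the table is not in 1NF.
student = [
    {"STUD_NO": 1, "STUD_NAME": "RAM", "STUD_PHONE": ["9716271721", "9871717178"]},
    {"STUD_NO": 2, "STUD_NAME": "SURESH", "STUD_PHONE": ["9898297281"]},
]

# 1NF version: one row per (student, phone) pair, every value atomic.
student_1nf = [
    {"STUD_NO": row["STUD_NO"], "STUD_NAME": row["STUD_NAME"], "STUD_PHONE": phone}
    for row in student
    for phone in row["STUD_PHONE"]
]

for row in student_1nf:
    print(row)
```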

Second Normal Form (2NF)
 To be in 2NF, a relation must be in 1NF.
 In the second normal form, all non-key attributes are fully functionally
dependent on the primary key (more generally, on every candidate key); no non-key attribute depends on only part of it.
 Example: Let's assume, a school can store the data of teachers and the
subjects they teach. In a school, a teacher can teach more than one subject.

Cont….
 In the given table, non-prime attribute TEACHER_AGE is dependent on
TEACHER_ID which is a proper subset of a candidate key. That's why it violates
the rule for 2NF.
 To convert the given table into 2NF, we decompose it into two tables:
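The 2NF decomposition of the teacher example can be sketched as two projections. The rows, the attribute name SUBJECT, the assumed candidate key {TEACHER_ID, SUBJECT}, and the names teacher_detail and teacher_subject are assumptions made for illustration.

```python
def project(rows, attrs):
    """Project rows onto attrs and return the distinct tuples in sorted order."""
    return sorted({tuple(row[a] for a in attrs) for row in rows})


# Hypothetical teacher table with assumed candidate key {TEACHER_ID, SUBJECT}.
teacher = [
    {"TEACHER_ID": 25, "SUBJECT": "Chemistry", "TEACHER_AGE": 30},
    {"TEACHER_ID": 25, "SUBJECT": "Biology", "TEACHER_AGE": 30},
    {"TEACHER_ID": 47, "SUBJECT": "English", "TEACHER_AGE": 35},
]

# TEACHER_AGE depends only on TEACHER_ID, so it moves to its own table.
teacher_detail = project(teacher, ["TEACHER_ID", "TEACHER_AGE"])
teacher_subject = project(teacher, ["TEACHER_ID", "SUBJECT"])

print(teacher_detail)
print(teacher_subject)
```

After the split, each teacher's age is stored once instead of once per subject.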

Third Normal Form(3NF)
 A relation is in 3NF if it is in 2NF and does not contain any transitive
dependency of a non-prime attribute on a candidate key.
 3NF is used to reduce data duplication. It is also used to achieve data
integrity.
 A relation is in third normal form if it holds at least one of the following
conditions for every non-trivial functional dependency X → Y.
 X is a super key.
 Y is a prime attribute, i.e., each element of Y is part of some candidate key.
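The two conditions can be tested with attribute closures. The sketch below uses a hypothetical relation R(A, B, C) with single-letter attributes and the FDs A → B and B → C; the function names are illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """All attributes determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(all_attrs, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            if any(set(key) <= set(combo) for key in keys):
                continue  # a smaller key is contained in combo, so combo is not minimal
            if closure(combo, fds) == set(all_attrs):
                keys.append(combo)
    return keys


def violations_3nf(all_attrs, fds):
    """FDs X -> Y where X is not a super key and Y has a non-prime attribute."""
    prime = {a for key in candidate_keys(all_attrs, fds) for a in key}
    bad = []
    for lhs, rhs in fds:
        extra = set(rhs) - set(lhs)  # the non-trivial part of the FD
        if not extra or closure(lhs, fds) == set(all_attrs) or extra <= prime:
            continue
        bad.append((lhs, rhs))
    return bad


# Hypothetical relation R(A, B, C): A -> B and B -> C (C depends on A transitively).
fds = [("A", "B"), ("B", "C")]
print(candidate_keys("ABC", fds), violations_3nf("ABC", fds))
```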

Cont….
 A relation is in third normal form, if there is no transitive dependency for
non-prime attributes as well as it is in second normal form.
 The normalization of 2NF relations to 3NF involves the removal of transitive
dependencies. If a transitive dependency exists, we remove the transitively
dependent attribute(s) from the relation by placing the attribute(s) in a new
relation along with a copy of the determinant.
 Consider the examples given below.
 Example-1:
In relation STUDENT

BCNF
 BCNF is the advanced version of 3NF. It is stricter than 3NF.
 A table is in BCNF if, for every non-trivial functional dependency X → Y, X is a super key of
the table.
 For BCNF, the table should be in 3NF, and for every FD, the LHS is a super key.

Rules for BCNF
 Rule 1: The table should be in the 3rd Normal Form.
 Rule 2: X should be a superkey for every non-trivial functional dependency (FD) X−>Y in a
given relation.

Let us consider the student database, in
which data of the student are mentioned.

Cont….
 The functional dependencies of the above table are:
 Stu_ID −> Stu_Branch
 Stu_Course −> {Branch_Number, Stu_Course_No}
 The candidate key of the above table is: {Stu_ID, Stu_Course}

Why this Table is Not in BCNF?
 The table above is not in BCNF because neither
Stu_ID nor Stu_Course alone is a super key. For a table to be in BCNF,
the determinant X of every functional dependency X−>Y must be a super key. Both
dependencies listed above fail this property, so this table is not in BCNF.
 How to Satisfy BCNF?
 To bring this table into BCNF, we have to decompose it into further tables.
Let us first divide the main table into two
tables, Stu_Branch and Stu_Course.

Candidate Key for this table: {Stu_ID, Stu_Course_No}.
After decomposing into further tables, the design is in BCNF,
because in every functional dependency X−>Y, X is a super key.
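The BCNF test for the student example can be sketched with attribute closures. The FDs are the ones listed above; the column lists of the two decomposed tables are an assumption, since the tables themselves are not reproduced in these notes.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attrs, fds):
    """Non-trivial FDs whose determinant is not a super key of attrs."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != attrs
    ]


ATTRS = {"Stu_ID", "Stu_Branch", "Stu_Course", "Branch_Number", "Stu_Course_No"}
FDS = [
    ({"Stu_ID"}, {"Stu_Branch"}),
    ({"Stu_Course"}, {"Branch_Number", "Stu_Course_No"}),
]

# Both FDs violate BCNF in the original table.
original_violations = bcnf_violations(ATTRS, FDS)

# Assumed column split of the decomposed tables.
stu_branch_violations = bcnf_violations({"Stu_ID", "Stu_Branch"}, FDS[:1])
stu_course_violations = bcnf_violations(
    {"Stu_Course", "Branch_Number", "Stu_Course_No"}, FDS[1:]
)

print(len(original_violations), stu_branch_violations, stu_course_violations)
```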

Normalization using multi-valued
dependencies(4NF,5NF).
 A multivalued dependency arises when two or more independent relationships are kept in a single relation. In other words,
a multivalued dependency occurs when the presence of one or more rows in
a table implies the presence of one or more other rows in that same table.
Put another way, two attributes (or columns) in a table are independent of
one another, but both depend on a third attribute. A multivalued
dependency always requires at least three attributes because it consists of at
least two attributes that are dependent on a third.
 For a dependency A ->> B, if for a single value of A, multiple values of B exist,
then the table may have a multi-valued dependency. The table should have at
least 3 attributes (A, B, C), and B and C should be independent of each other, for the multivalued
dependency A ->> B to hold.

The dependencies person ->> mobile and person ->> food_likes are read as “person multi determines mobile” and “person multi
determines food_likes.”
Note that a functional dependency is a special case of multivalued
dependency. In a functional dependency X -> Y, every x determines exactly
one y, never more than one.
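A multivalued dependency can be checked on a table instance with the swap rule: whenever two rows agree on X, the row that takes Y from the first and the remaining attributes from the second must also be present. The sketch below uses made-up rows for the person, mobile, and food_likes attributes; the function name mvd_holds is hypothetical.

```python
def mvd_holds(rows, x, y, attrs):
    """Check X ->> Y on rows using the swap rule described above."""
    table = {tuple(row[a] for a in attrs) for row in rows}
    for t1 in rows:
        for t2 in rows:
            if all(t1[a] == t2[a] for a in x):
                # X and Y values from t1, everything else from t2.
                mixed = tuple(t1[a] if a in x or a in y else t2[a] for a in attrs)
                if mixed not in table:
                    return False
    return True


ATTRS = ["person", "mobile", "food_likes"]
# Hypothetical rows: every mobile of Mahesh is paired with every food he likes.
people = [
    {"person": "Mahesh", "mobile": "9893", "food_likes": "Burger"},
    {"person": "Mahesh", "mobile": "9893", "food_likes": "Pizza"},
    {"person": "Mahesh", "mobile": "9424", "food_likes": "Burger"},
    {"person": "Mahesh", "mobile": "9424", "food_likes": "Pizza"},
]

print(mvd_holds(people, ["person"], ["mobile"], ATTRS))
# Dropping one combination breaks the independence of mobile and food_likes.
print(mvd_holds(people[:3], ["person"], ["mobile"], ATTRS))
```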

Fourth Normal Form (4NF)
 The Fourth Normal Form (4NF) is a level of database normalization where
every non-trivial multivalued dependency has a super key as its determinant.
It builds on the first three normal forms (1NF, 2NF, and 3NF) and the
Boyce-Codd Normal Form (BCNF). In addition to meeting
the requirements of BCNF, a relation must not store two or more independent multivalued
facts about the same entity.
 Properties
 A relation R is in 4NF if and only if the following conditions are satisfied:
 1. It should be in the Boyce-Codd Normal Form (BCNF).
 2. The table should not have any non-trivial multi-valued dependency whose determinant is not a super key.

Cont….
 A table with a multivalued dependency violates the normalization standard of
the Fourth Normal Form (4NF) because it creates unnecessary redundancies
and can contribute to inconsistent data.
 To bring this up to 4NF, it is necessary to break this information into two
tables.
 Example: Consider a class database that has two relations: R1
contains student ID (SID) and student name (SNAME), and R2 contains course
ID (CID) and course name (CNAME).

Cont….
Taking their cross-product results in
multivalued dependencies.
The multivalued dependencies (MVD) are:
SID->->CID; SID->->CNAME; SNAME->->CNAME

Join Dependency
 Join dependency is a further generalization of multivalued dependency.
 If the join of R1 and R2 over C is equal to relation R, then we can say that a
join dependency (JD) exists, where R1(A, B,
C) and R2(C, D) are a decomposition of a given relation R(A, B, C, D).
 Alternatively, R1 and R2 are a lossless decomposition of R.
 A JD ⋈ {R1, R2, …, Rn} is said to hold over a relation R if R1, R2, …, Rn is a
lossless-join decomposition of R.
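A join dependency can be tested on a table instance by projecting onto each component and joining the projections back. The sketch below uses a hypothetical relation r(A, B, C) that satisfies a three-way JD but no two-way one; the helper names are illustrative.

```python
from functools import reduce


def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples."""
    unique = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in unique]


def natural_join(left, right):
    """Join two relations on the attributes they share."""
    if not left or not right:
        return []
    common = [a for a in left[0] if a in right[0]]
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[a] == r[a] for a in common)
    ]


def jd_holds(rows, components):
    """True if joining the projections onto components gives back rows exactly."""
    joined = reduce(natural_join, [project(rows, c) for c in components])
    as_set = lambda rel: {tuple(sorted(row.items())) for row in rel}
    return as_set(joined) == as_set(rows)


# Hypothetical relation r(A, B, C).
r = [
    {"A": "a1", "B": "b1", "C": "c2"},
    {"A": "a1", "B": "b2", "C": "c1"},
    {"A": "a2", "B": "b1", "C": "c1"},
    {"A": "a1", "B": "b1", "C": "c1"},
]

three_way = jd_holds(r, [["A", "B"], ["B", "C"], ["C", "A"]])
two_way = jd_holds(r, [["A", "B"], ["B", "C"]])
print(three_way, two_way)
```

Here the three-way decomposition is lossless while the two-way one adds a spurious tuple, which is the situation that 5NF addresses.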

Fifth Normal Form / Project-Join Normal
Form (5NF)
 A relation R is in Fifth Normal Form if and only if every join dependency
in R is implied by the candidate keys of R. A relation decomposed into smaller
relations must have the lossless-join property, which ensures that no spurious or
extra tuples are generated when the relations are reunited through a natural
join.
 Properties
 A relation R is in 5NF if and only if it satisfies the following conditions:
 1. R should already be in 4NF.
 2. It cannot be decomposed any further without loss (it has no join dependency beyond those implied by its candidate keys).
