Lecture_W9_Normalization.ppt

©Silberschatz, Korth and Sudarshan
7.1
Database System Concepts
Relational Database Design

7.2
Relational Database Design
 First Normal Form
 Pitfalls in Relational Database Design
 Functional Dependencies
 Decomposition
 Boyce-Codd Normal Form
 Third Normal Form
 Multivalued Dependencies and Fourth Normal Form
 Overall Database Design Process

7.3
First Normal Form
 Domain is atomic if its elements are considered to be indivisible units
 Examples of non-atomic domains:
 address, composite attributes with component attribute street and city
 dependent_name of an employee in a Banking enterprise is a multivalued
attribute
 A relational schema R is in first normal form (1NF) if the domains of all
attributes of R are atomic
 Non-atomic values complicate storage and encourage redundant
(repeated) storage of data
 E.g. Set of accounts stored with each customer, and set of owners stored
with each account

7.4
First Normal Form (Contd.)
 Atomicity is actually a property of how the elements of the
domain are used.
 E.g. Strings would normally be considered indivisible
 Suppose that students are given roll numbers which are strings of
the form CS0012 or EE1127
 If the first two characters are extracted to find the department, the
domain of roll numbers is not atomic.
 Doing so is a bad idea: leads to encoding of information in
application program rather than in the database.

7.5
Pitfalls in Relational Database Design
 Relational database design requires that we find a
“good” collection of relation schemas. A bad design
may lead to
 Repetition of Information.
 Inability to represent certain information.
 Design Goals:
 Avoid redundant data
 Ensure that relationships among attributes are
represented
 Facilitate the checking of updates for violation of
database integrity constraints.

7.6
Example
 Consider the relation schema:
Lending-schema = (branch-name, branch-city, assets,
customer-name, loan-number, amount)
 Redundancy:
 Data for branch-name, branch-city, assets are repeated for each loan that a
branch makes
 Wastes space
 Complicates updating, introducing possibility of inconsistency of assets value
 Null values
 Cannot store information about a branch if no loans exist
 Can use null values, but they are difficult to handle.

7.7
Decomposition
 Decompose the relation schema Lending-schema into:
Branch-schema = (branch-name, branch-city,assets)
Loan-info-schema = (customer-name, loan-number,
branch-name, amount)
 All attributes of an original schema (R) must appear in
the decomposition (R1, R2):
R = R1  R2
 Lossless-join decomposition.
For all possible relations r on schema R
r = R1 (r) R2 (r)

7.8
Example of Non Lossless-Join Decomposition
 Decomposition of R = (A, B)
R1 = (A) R2 = (B)
A B



1
2
1
A


B
1
2
r
A(r) B(r)
A (r) B (r)
A B




1
2
1
2

7.9
Lossless join decomposition

7.10

7.11

7.12

7.13

7.14
Goal — Devise a Theory for the Following
 Decide whether a particular relation R is in “good” form.
 In the case that a relation R is not in “good” form, decompose it
into a set of relations {R1, R2, ..., Rn} such that
 each relation is in good form
 the decomposition is a lossless-join decomposition
 Our theory is based on:
 functional dependencies
 multivalued dependencies

7.15
Functional Dependencies
 A functional dependency is a constraint that specifies the
relationship between two sets of attributes where one set can
accurately determine the value of other sets.
 It is denoted as X → Y, where X is a set of attributes that is
capable of determining the value of Y.
 A functional dependency is a generalization of the notion of a
key.

7.16
Functional Dependencies

7.17
Functional Dependencies (Cont.)
 Let R be a relation schema
  R and   R
 The functional dependency
  
holds on R if and only if for any legal relations r(R), whenever any
two tuples t1 and t2 of r agree on the attributes , they also agree
on the attributes . That is,
t1[] = t2 []  t1[ ] = t2 [ ]
 Example: Consider r(A,B) with the following instance of r.
 On this instance, A  B does NOT hold, but B  A does hold.
1 4
1 5
3 7

7.20
 A functional dependency is trivial if it is satisfied by all instances
of a relation
 E.g.
 customer-name, loan-number  customer-name
 customer-name  customer-name
 In general,    is trivial if   

7.21
Closure of a Set of Functional
Dependencies
 Given a set F set of functional dependencies, there are certain
other functional dependencies that are logically implied by F.
 E.g. If A  B and B  C, then we can infer that A  C
 The set of all functional dependencies logically implied by F is the
closure of F.
 We denote the closure of F by F+.
 We can find all of F+ by applying Armstrong’s Axioms:
 if   , then    (reflexivity)
 if   , then      (augmentation)
 if   , and   , then    (transitivity)
 These rules are
 sound (generate only functional dependencies that actually hold) and
 complete (generate all functional dependencies that hold).

7.22

7.23
Closure of Functional Dependencies
(Cont.)
 We can further simplify manual computation of F+ by using
the following additional rules.
 If    holds and    holds, then     holds (union)
 If     holds, then    holds and    holds
(decomposition)
 If    holds and     holds, then     holds
(pseudotransitivity)
The above rules can be inferred from Armstrong’s axioms.

7.24
Example
 R = (A, B, C, G, H, I)
F = { A  B
A  C
CG  H
CG  I
B  H}
 some members of F+
 A  H
 by transitivity from A  B and B  H
 AG  I
 by augmenting A  C with G, to get AG  CG
and then transitivity with CG  I
 CG  HI
 from CG  H and CG  I : “union rule” can be inferred from
– definition of functional dependencies, or
– Augmentation of CG  I to infer CG  CGI, augmentation of
CG  H to infer CGI  HI, and then transitivity

7.26
Closure of Attribute Sets
 The closure of a set of attributes X is the set of those attributes
that can be functionally determined from X. The closure of X
is denoted as X+. When given a closure problem, you'll have a
set of functional dependencies over which to compute the
closure and the set X for which to find the closure.
 Finding the closure of a set of attributes is needed for problems
related to normalization. For example, you need to know how
to compute the closure of a set of attributes to check if a set of
attributes is a candidate key or a superkey. You also need this
algorithm to decompose non-normal tables into normal forms.

7.27
Closure of Attribute Sets
 Algorithm to compute +, the closure of  under F
result := ;
while (changes to result) do
for each    in F do
begin
if   result then result := result  
end

7.28
Example of Attribute Set Closure
 R = (A, B, C, G, H, I)
 F = {A  B
A  C
CG  H
CG  I
B  H}
 (AG)+
1. result = AG
2. result = ABCG (A  C and A  B)
3. result = ABCGH (CG  H and CG  AGBC)
4. result = ABCGHI (CG  I and CG  AGBCH)

7.29

7.30

7.31

7.32

7.33

7.34
Uses of Attribute Closure
There are several uses of the attribute closure algorithm:
 Testing for superkey:
 To test if  is a superkey, we compute +, and check if + contains
all attributes of R.
 Testing functional dependencies
 To check if a functional dependency    holds (or, in other words,
is in F+), just check if   +.
 That is, we compute + by using attribute closure, and then check if
it contains .
 Is a simple and cheap test, and very useful
 Computing closure of F
 For each   R, we find the closure +, and for each S  +, we
output a functional dependency   S.

Lecture_W9_Normalization.ppt

Recommended

Recommended

More Related Content

Similar to Lecture_W9_Normalization.ppt

Similar to Lecture_W9_Normalization.ppt (20)

Recently uploaded

Recently uploaded (20)

Lecture_W9_Normalization.ppt