Introduction to Database, Lecture 9: This lecture is all about database design processes and principles.

Published in: Education, Technology

  1. 1. Functional Dependencies (Fall 2001, Database Systems)

     Design Process - Where are we?
     • Conceptual Design → Conceptual Schema (ER Model)
     • Logical Design → Logical Schema (Relational Model)
       – Step 1: ER-to-Relational Mapping
       – Step 2: Normalization: “improving” the design

     Relational Design Principles
     • Relations should have semantic unity
     • Information repetition should be avoided
       – Anomalies: insertion, deletion, modification
     • Avoid null values as much as possible
       – Difficulties with interpretation: don’t know, don’t care, known but unavailable, does not apply
       – Specification of joins
     • Avoid spurious joins
  2. 2. Normalization
     • A bad database design may suffer from anomalies that make the database difficult to use:
         COMPANIES(company_name, company_address, date_founded, owner_name, owner_title, #shares)
     • Anomalies:
       – An update anomaly occurs if changing the value of an attribute leads to an inconsistent database state.
       – An insertion anomaly occurs if we cannot insert a tuple due to some design flaw.
       – A deletion anomaly occurs if deleting a tuple results in unexpected loss of information.
     • Normalization is the systematic process for removing all such anomalies in database design.

     Update Anomaly
     • If a company has three owners, there are three tuples in the COMPANIES relation for this company.
     • If this company moves to a new location, the company’s address must be updated consistently in all three tuples.
       – Updating the company address in just one or two of the tuples creates an inconsistent database state.
     • It would be better if the company name and address were in a separate relation so that the address of each company appears in only one tuple.
         COMPANIES(company_name, company_address, date_founded, owner_name, owner_title, #shares)
  3. 3. Insert Anomaly
     • Suppose that three people have just created a new company:
       – the three founders have no titles yet
       – stock distributions have yet to be defined
     • The new company cannot be added to the COMPANIES relation because there is not enough information to fill in all the attributes of a tuple.
       – At best, null values can be used to complete a tuple.
     • It would be better if owner and stock information were stored in a different relation.
         COMPANIES(company_name, company_address, date_founded, owner_name, owner_title, #shares)

     Delete Anomaly
     • Suppose that an owner of a company retires and so is no longer an owner, but retains stock in the company.
     • If this person’s tuple is deleted from the COMPANIES relation, then we lose the information about how much stock the person still owns.
     • If the stock information were stored in a different relation, then we could retain this information after the person is deleted as an owner of the company.
         COMPANIES(company_name, company_address, date_founded, owner_name, owner_title, #shares)
  4. 4. What to do?
     • Take each relation individually and “improve” it in terms of the desired characteristics
       – Normal forms
         • Atomic values (1NF)
         • Can be defined according to keys and dependencies
         • Functional dependencies (2NF, 3NF, BCNF)
         • Multivalued dependencies (4NF)
       – Normalization
         • Normalization is a process of concept separation that applies a top-down methodology for producing a schema by successive refinements and decompositions.
         • Do not combine unrelated sets of facts in one table; each relation should contain an independent set of facts.
         • Universal relation assumption
         • 1NF to 3NF; 1NF to BCNF

     Normalization Issues
     • How do we decompose a schema into a desirable normal form?
     • What criteria should the decomposed schemas follow in order to preserve the semantics of the original schema?
       – Reconstructability: recover the original relation, with no spurious joins
       – Lossless decomposition: no information loss
       – Dependency preservation: the constraints (i.e., dependencies) that hold on the original relation should be enforceable by means of the constraints (i.e., dependencies) defined on the decomposed relations.
     • What happens to queries?
       – Processing time may increase due to joins
       – Denormalization
  5. 5. Keys in the relational model
     • Superkey
       – A set of one or more attributes which, taken collectively, allow us to identify uniquely a tuple in a relation.
       – Let R be a relation scheme. A subset K of R is a superkey of R if, in any legal relation [instance] r of R, for all pairs t1 and t2 of tuples in r, t1[K] = t2[K] ⇒ t1 = t2.
     • Candidate key
       – A superkey for which no proper subset is a superkey.
     • Primary key
       – The candidate key that is chosen by the database designer as the principal key.

     Functional Dependencies
     • A functional dependency is a constraint between two sets of attributes in a relational database.
     • If X and Y are two sets of attributes in the same relation T, then X → Y means that X functionally determines Y, so that
       – the values of the attributes in X uniquely determine the values of the attributes in Y
       – for any two tuples t1 and t2 in T, t1[X] = t2[X] implies that t1[Y] = t2[Y]
       – if two tuples in T agree in their X column(s), then their Y column(s) must also be the same.
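
The tuple-based definitions above translate almost directly into a check on one concrete instance. The sketch below is only an illustration: the function names and the three sample rows are made up for this example and do not come from the slides. Note that an instance can refute a dependency or a superkey, but it cannot prove one, since both are defined over all legal instances.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in this instance.

    rows: list of dicts mapping attribute name to value.
    lhs, rhs: lists of attribute names.
    """
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Two tuples that agree on lhs must also agree on rhs.
        if seen.setdefault(x, y) != y:
            return False
    return True


def is_superkey(rows, key):
    """In this instance, key determines every attribute (t1[K] = t2[K] => t1 = t2)."""
    if not rows:
        return True
    return fd_holds(rows, key, sorted(rows[0].keys()))


# Hypothetical sample data, not taken from the lecture.
companies = [
    {"company_name": "Acme", "company_address": "1 Main St", "owner_name": "Ann"},
    {"company_name": "Acme", "company_address": "1 Main St", "owner_name": "Bob"},
    {"company_name": "Zeta", "company_address": "9 Elm Rd", "owner_name": "Ann"},
]

print(fd_holds(companies, ["company_name"], ["company_address"]))  # True
print(fd_holds(companies, ["owner_name"], ["company_name"]))       # False
print(is_superkey(companies, ["company_name", "owner_name"]))      # True
```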
  6. 6. Functional Dependencies
     • Dependencies for this relation:
       – A → B
       – A → D
       – B,C → E,F
     • Do they all hold in this instance of the relation R?
       [Table: an instance of relation R; its contents did not survive extraction]
     • Functional dependencies are specified by the database programmer based on the intended meaning of the attributes.

     Functional Dependencies
     • What are the functional dependencies in:
         COMPANIES(company_name, company_address, date_founded, owner_id, owner_name, owner_title, #shares)
       – company_name → company_address
       – company_name → date_founded
       – company_name, owner_id → owner_title
       – company_name, owner_id → #shares
       – company_name, owner_title → owner_id
       – owner_id → owner_name
  7. 7. Armstrong’s Axioms
     • Armstrong’s Axioms: Let X, Y, Z be sets of attributes from a relation T.
       [1] Inclusion rule: If Y ⊆ X, then X → Y.
       [2] Transitivity rule: If X → Y and Y → Z, then X → Z.
       [3] Augmentation rule: If X → Y, then XZ → YZ.
     • Other derived rules:
       [1] Union rule: If X → Y and X → Z, then X → YZ
       [2] Decomposition rule: If X → YZ, then X → Y and X → Z
       [3] Pseudotransitivity: If X → Y and WY → Z, then XW → Z
       [4] Accumulation rule: If X → YZ and Z → BW, then X → YZB

     Closure
     • Let F be a set of functional dependencies.
     • We say F implies a functional dependency g if g can be derived from the FDs in F using the axioms.
     • The closure of F, written as F+, is the set of all FDs that are implied by F.
     • Example: Given FDs { A → BC, C → D, AD → B }, what is the closure?
       Omitting trivial dependencies and those obtained from the listed ones by augmentation, union, or decomposition, the closure is:
       { A → BC, C → D, AD → B, A → D }
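
As a worked example of deriving a dependency from the axioms, here is one possible derivation of A → D from the set { A → BC, C → D, AD → B } used in the closure example. It uses only the inclusion and transitivity rules; other derivations exist.

```latex
\begin{align*}
1.\;& A \to BC && \text{given} \\
2.\;& BC \to C && \text{inclusion rule, since } C \subseteq BC \\
3.\;& A \to C  && \text{transitivity rule on 1 and 2} \\
4.\;& C \to D  && \text{given} \\
5.\;& A \to D  && \text{transitivity rule on 3 and 4}
\end{align*}
```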
  8. 8. Cover and Minimal Cover
     • A set F of functional dependencies is said to cover another set G of functional dependencies iff G ⊆ F+.
       – In other words, all dependencies in G can be derived from the dependencies in F.
       – If F covers G and G covers F, then F and G are said to be equivalent, written as F ≡ G.
       – Given F = { A → BC, C → D, AD → B } and G = { A → B, C → D }:
         Does F cover G? Does G cover F?
     • A minimal cover for a set F of functional dependencies is a minimal set M that covers F.

     Closure of a set of attributes
     • Given a set F of functional dependencies, the closure of a set of attributes X, denoted by X+, is the largest set of attributes Y such that X → Y is implied by F.
       Algorithm Closure(X, F)
         X[0] = X; I = 0;
         repeat
           I = I + 1;
           X[I] = X[I-1];
           FOR ALL Z → W in F
             IF Z ⊆ X[I] THEN X[I] = X[I] ∪ W
           END FOR
         until X[I] = X[I-1];
         RETURN X+ = X[I]
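
A minimal Python sketch of the Closure(X, F) algorithm, together with a cover test built on it. The representation is an assumption made for brevity: every attribute is a single letter and a dependency is a pair of strings such as ("AD", "B"). The function names are illustrative only.

```python
def closure(attrs, fds):
    """Attribute closure X+ of attrs under fds.

    fds is a list of (lhs, rhs) pairs; each side is a string of
    single-letter attribute names (an assumption of this sketch).
    """
    result = set(attrs)
    changed = True
    while changed:  # repeat ... until nothing new is added
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def covers(f, g):
    """F covers G iff every X -> Y in G satisfies Y ⊆ X+ computed under F."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)


F = [("A", "BC"), ("C", "D"), ("AD", "B")]
G = [("A", "B"), ("C", "D")]

print(sorted(closure("A", F)))  # ['A', 'B', 'C', 'D']
print(covers(F, G))             # True
print(covers(G, F))             # False
```

With the F and G from the cover slide, this suggests that F covers G but G does not cover F, because C is not in the closure of A under G.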
  9. 9. Minimal Cover Algorithm
     • Given a set F of functional dependencies, find a minimal cover M for F.
       – Step 1. Decompose all FDs of the form X → a1…an into a set of functional dependencies with a single attribute on the right side, i.e. X → a1, …, X → an
       – Step 2. Remove all inessential FDs from F
           for all X → A in F
             find H = F - { X → A }
             if A ∈ X+ under H, then remove X → A
           end for
       – Step 3. For all FDs, try to remove attributes from the left hand side as long as the result does not change F+.
           for all X → A in F
             let Y = X - {b} for some attribute b in X
             let G = (F - {X → A}) ∪ {Y → A}
             if Y+ under G is equal to Y+ under F then
               set F = G (i.e., remove b from the given FD)
           end for
       – Step 4. Gather all FDs with the same left hand side using the union rule
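
The four steps can be sketched in Python as follows. This is one possible reading of the slides, not a reference implementation: attributes are assumed to be single letters, the function names are made up, and Step 3 tests "A is in Y+ under F", which is equivalent to the slide's comparison of Y+ under G and under F. A set of dependencies can have several minimal covers, so the result may depend on processing order.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_cover(fds):
    # Step 1: one attribute on each right side.
    work = []
    for lhs, rhs in fds:
        for a in rhs:
            fd = (frozenset(lhs), a)
            if fd not in work:
                work.append(fd)
    # Step 2: drop every FD that the remaining FDs already imply.
    for fd in list(work):
        rest = [g for g in work if g != fd]
        if fd[1] in closure(fd[0], rest):
            work = rest
    # Step 3: drop left-side attributes that are not needed.
    for i in range(len(work)):
        lhs, a = work[i]
        for b in sorted(lhs):
            smaller = work[i][0] - {b}
            if smaller and a in closure(smaller, work):
                work[i] = (smaller, a)
    # Step 4: gather FDs with the same left side (union rule).
    merged = {}
    for lhs, a in work:
        merged.setdefault(lhs, set()).add(a)
    return {"".join(sorted(l)): "".join(sorted(r)) for l, r in merged.items()}


# Input set from the Minimal Cover Exercise slide.
exercise = [("ABC", "DE"), ("BD", "DE"), ("E", "CF"), ("EG", "F")]
print(minimal_cover(exercise))  # {'ABC': 'D', 'BD': 'E', 'E': 'CF'}
```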
  10. 10. Minimal Cover Exercise
     • Compute the minimal cover of the following set of functional dependencies:
       { ABC → DE, BD → DE, E → CF, EG → F }
       The minimal cover is:
       { ABC → D, BD → E, E → CF }

     Lossless Decomposition
     • Given a table T, a lossless decomposition of T into k tables is a set of tables {T1, T2, …, Tk} such that
       – head(T) = head(T1) ∪ head(T2) ∪ … ∪ head(Tk)
         • i.e., the decomposed tables contain all attributes from the original table
       – each Ti contains the table T projected onto columns head(Ti)
       – lossless condition: T = T1 ⋈ T2 ⋈ … ⋈ Tk
         • i.e., the decomposed tables will contain the same information as T, no more and no less
  11. 11. [Figure: an instance of relation R, its projections R1 and R2, and the join R1 ⋈ R2; the table contents did not survive extraction]
     {R1, R2} is not a lossless decomposition of relation R. The first and second parts of the definition are satisfied, but not the third part.

     Lossless Decompositions
     • Given a table T and a set of attributes X ⊆ head(T), the following statements are equivalent:
       – X is a superkey of T
       – X → head(T)
       – X functionally determines all attributes in T
       – X+ = head(T)
     • Given a table T with a set F of functional dependencies valid on T, a decomposition of T into tables {T1, T2} is a lossless decomposition if one of the following dependencies is implied by F:
       (1) head(T1) ∩ head(T2) → head(T1)
       (2) head(T1) ∩ head(T2) → head(T2)
       i.e., the attributes in common between the two tables determine all attributes in one of the new tables
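
To see the third part of the definition fail on actual data, the sketch below projects a tiny relation and joins the projections back together. The two-row instance is hypothetical (it is not the instance from the slide's figure), and the helper names are made up for illustration.

```python
def project(rows, attrs):
    """Projection onto attrs; a set is used so duplicates disappear."""
    return {tuple(sorted((a, row[a]) for a in attrs)) for row in rows}


def natural_join(r1, r2):
    """Natural join of two relations produced by project()."""
    joined = set()
    for t1 in r1:
        for t2 in r2:
            d1, d2 = dict(t1), dict(t2)
            shared = d1.keys() & d2.keys()
            # Tuples combine only if they agree on every shared attribute.
            if all(d1[a] == d2[a] for a in shared):
                joined.add(tuple(sorted({**d1, **d2}.items())))
    return joined


# Hypothetical instance: B has the same value in both rows.
R = [
    {"A": "a1", "B": "b1", "C": "c1"},
    {"A": "a2", "B": "b1", "C": "c2"},
]
original = project(R, ["A", "B", "C"])

# Splitting on B is lossy here: the join invents two spurious tuples.
lossy = natural_join(project(R, ["A", "B"]), project(R, ["B", "C"]))
# Splitting on A reproduces the instance, since A is unique in it.
lossless = natural_join(project(R, ["A", "B"]), project(R, ["A", "C"]))

print(len(original), len(lossy), len(lossless))  # 2 4 2
print(lossy == original, lossless == original)   # False True
```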
  12. 12. Lossless Decompositions
     • Let head(T) = {A, B, C, D, E, F} with functional dependencies:
       – AB → CD
       – AE → D
       – C → F
     • Which of the following are lossless decompositions?
       – head(T1) = {A, B, C}, head(T2) = {C, D, E, F}
         {A, B, C} ∩ {C, D, E, F} = {C}
         But {C} does not determine {A, B, C}, and {C} does not determine {C, D, E, F}, so this is not lossless
       – head(T1) = {A, B, C, F}, head(T2) = {A, B, D, E}
         {A, B, C, F} ∩ {A, B, D, E} = {A, B}
         {A, B} → {A, B, C, F} = head(T1), so this is lossless

     Preserving Functional Dependencies
     • Given a table T and a set of functional dependencies F, let {T1, T2, …, Tk} be a decomposition of T
       – A functional dependency X → Y in F is said to be preserved in this decomposition if there exists a table Ti such that X ∪ Y ⊆ head(Ti).
       – In this case, we say that X → Y lies in Ti.
     • A decomposition is functional dependency preserving if all dependencies in (the minimal cover of) F are preserved.
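
The two-table lossless test reduces to one attribute closure. The following sketch assumes single-letter attributes and uses made-up function names; it checks whether the common attributes determine all of head(T1) or all of head(T2), using the dependencies and decompositions from the slide.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless(head1, head2, fds):
    """True if head1 ∩ head2 determines all of head1 or all of head2."""
    common = set(head1) & set(head2)
    reach = closure(common, fds)
    return set(head1) <= reach or set(head2) <= reach


F = [("AB", "CD"), ("AE", "D"), ("C", "F")]

print(is_lossless("ABC", "CDEF", F))   # False: {C}+ is only {C, F}
print(is_lossless("ABCF", "ABDE", F))  # True: {A, B}+ contains {A, B, C, F}
```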
  13. 13. Preserving Functional Dependencies
     • Given {AB → CD, AE → D, C → F}, which dependencies are preserved by the following decompositions?
       – head(T1) = {A, B, C}, head(T2) = {C, D, E, F}
         In T1: AB → C; in T2: C → F
         Not functional dependency preserving
       – head(T1) = {A, B, C, F}, head(T2) = {A, B, D, E}
         In T1: AB → C, C → F; in T2: AB → D, AE → D
         Functional dependency preserving

     Boyce-Codd Normal Form
     • A table T is said to be in Boyce-Codd Normal Form (BCNF) with respect to a given set of functional dependencies F if for all functional dependencies of the form X → A implied by F the following is true:
         If A is a single attribute that is not in X, then X is a superkey.
       Alternately,
         If A is a single attribute that is not in X, then X contains all the attributes in a key.
     • Given {AB → C, AB → D, AE → D, C → F} with key {A, B, E}:
       – not in BCNF, since in AB → C, C is a single attribute not in AB, but AB is not a superkey.
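
The preservation test defined on the slides is purely syntactic: a dependency is preserved when all of its attributes fit inside one table. A small sketch of that test, assuming single-letter attributes and a made-up function name, applied to the minimal cover {AB → C, AB → D, AE → D, C → F}:

```python
def preserved(fds, heads):
    """Split fds into those that lie in some table and those that do not."""
    kept, lost = [], []
    for lhs, rhs in fds:
        attrs = set(lhs) | set(rhs)
        # X -> Y lies in Ti when X ∪ Y ⊆ head(Ti).
        if any(attrs <= set(h) for h in heads):
            kept.append((lhs, rhs))
        else:
            lost.append((lhs, rhs))
    return kept, lost


cover = [("AB", "C"), ("AB", "D"), ("AE", "D"), ("C", "F")]

kept1, lost1 = preserved(cover, ["ABC", "CDEF"])
kept2, lost2 = preserved(cover, ["ABCF", "ABDE"])

print(kept1, lost1)  # [('AB', 'C'), ('C', 'F')] [('AB', 'D'), ('AE', 'D')]
print(kept2, lost2)  # all four dependencies kept, none lost
```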
  14. 14. Boyce-Codd Normal Form
     Given head(T) = {A, B, C, D, E, F} with functional dependencies
       {AC → D, AC → E, AF → B, AD → F, BC → A, ABC → F}
     and keys {A, C} and {B, C}, is this relation T in BCNF?
     No. It is sufficient to find one violation!
       – AF → B violates BCNF since B is not in AF and AF is not a superkey.
       – AD → F violates BCNF since F is not in AD and AD is not a superkey.
     Note: ABC → F does not violate BCNF since ABC is a superkey.

     Normalization
     • Given a table T that is not in BCNF, we want to find a lossless and dependency preserving decomposition for T such that all sub-tables are in BCNF.
         Company(company_name, company_address, date_founded, owner_id, owner_name, owner_title, #shares)
       Functional dependencies:
         company_name → company_address, date_founded
         company_name, owner_id → owner_title, #shares
         company_name, owner_title → owner_id
         owner_id → owner_name
       Keys: {company_name, owner_id}, {company_name, owner_title}
       Is it in BCNF?
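
For the lettered example (head(T) = {A, B, C, D, E, F}), the violations can be found mechanically. The sketch below is an illustration with assumed single-letter attributes and made-up names. It inspects only the dependencies listed in F rather than all of F+; for testing the original table this is enough, because any violation in F+ implies one among the listed dependencies.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(head, fds):
    """Return the nontrivial FDs in fds whose left side is not a superkey."""
    bad = []
    for lhs, rhs in fds:
        nontrivial = set(rhs) - set(lhs)
        is_superkey = set(head) <= closure(lhs, fds)
        if nontrivial and not is_superkey:
            bad.append((lhs, rhs))
    return bad


F = [("AC", "D"), ("AC", "E"), ("AF", "B"),
     ("AD", "F"), ("BC", "A"), ("ABC", "F")]

print(bcnf_violations("ABCDEF", F))  # [('AF', 'B'), ('AD', 'F')]
```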
  15. 15. Normalization
     Company1(company_name, company_address, date_founded, owner_id, owner_title, #shares)
       company_name → company_address, date_founded
       company_name, owner_id → owner_title, #shares
       company_name, owner_title → owner_id
     Company2(owner_id, owner_name)
       owner_id → owner_name
     Further decompose Company1:
       Company11(company_name, company_address, date_founded)
       Company12(company_name, owner_id, owner_title, #shares)
     Are all the resulting relations in BCNF?

     BCNF
     • It is not always possible to find a lossless and dependency preserving decomposition of a relation into relations that are in BCNF.
     • Given: Address(Street, City, State, Zip)
         Street, City, State → Zip
         Zip → City, State
       Keys: {Street, City, State}, {Street, Zip}
       It is not in BCNF since Zip → City, but {Zip} is not a superkey.
       But we cannot decompose it without losing a dependency, since if we split on Zip → City, State into
         Address1(Street, Zip)
         Address2(Zip, City, State)
       we lose the functional dependency Street, City, State → Zip.
     • It is not always desirable to normalize relations too far!
  16. 16. Third Normal Form
     • A table T is said to be in third normal form (3NF) with respect to a given set of functional dependencies F if for all functional dependencies of the form X → A implied by F, if A is a single attribute that is not in X, then one of the following holds:
       – X is a superkey for T
       – the attribute A is in one of the keys for T
     • If a table is in BCNF, then it is in third normal form. But there are relations that are in 3NF that are not in BCNF.
       – Example: The Address relation is in 3NF, but not in BCNF.
           Address(Street, City, State, Zip)
           Street, City, State → Zip
           Zip → City, State

     Algorithm for 3NF Decomposition
     Given a table T and a set of functional dependencies F:
       replace F with a minimal cover of F
       S = { }
       for all X → Y in F
         if no table Z in S contains all attributes in X ∪ Y then
           S = S ∪ head(X ∪ Y)   /* create a table containing the attributes in X ∪ Y */
       end for
       for all candidate keys K for T
         if no table Z in S contains all attributes in K then
           S = S ∪ head(K)   /* create a new table with the attributes in K */
       end for
  17. 17. Exercise
     • Given relation T, with head(T) = {A, B, C, D, E, F} and functional dependencies {AE → DF, BE → F, B → A, AD → C}, compute a decomposition that is in 3NF.
       A 3NF decomposition is:
         head(T1) = {A, E, D}
         head(T2) = {A, E, F}
         head(T3) = {B, A}
         head(T4) = {A, D, C}
         head(T5) = {B, E}
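
The 3NF decomposition algorithm can be sketched and applied to this exercise as follows. Several things here are assumptions of the sketch rather than part of the slides: attributes are single letters, the function names are made up, candidate keys are found by brute force, and the minimal cover is passed in by hand (BE → F is redundant because B → A and AE → F already give it).

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(head, fds):
    """Brute-force search, smallest sets first; fine for a few attributes."""
    keys = []
    attrs = sorted(head)
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            k = set(combo)
            if any(key <= k for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if set(head) <= closure(k, fds):
                keys.append(k)
    return keys


def decompose_3nf(head, cover):
    """Follow the slide's algorithm; cover must already be a minimal cover."""
    tables = []
    for lhs, rhs in cover:
        attrs = set(lhs) | set(rhs)
        if not any(attrs <= t for t in tables):
            tables.append(attrs)
    for key in candidate_keys(head, cover):
        if not any(key <= t for t in tables):
            tables.append(key)
    return ["".join(sorted(t)) for t in tables]


# Cover before gathering left sides: matches the tables in the exercise answer.
split_cover = [("AE", "D"), ("AE", "F"), ("B", "A"), ("AD", "C")]
print(decompose_3nf("ABCDEF", split_cover))   # ['ADE', 'AEF', 'AB', 'ACD', 'BE']

# Cover after the union rule: AE -> D and AE -> F share one table.
merged_cover = [("AE", "DF"), ("B", "A"), ("AD", "C")]
print(decompose_3nf("ABCDEF", merged_cover))  # ['ADEF', 'AB', 'ACD', 'BE']
```

Both results are 3NF decompositions; the second simply stores the two dependencies with left side AE in a single table.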