Design of databases


Published in: Spiritual, Technology

  1. Design of Databases • What is Good Design • Normalization
  2. Pitfalls in Relational-Database Design • Repetition of information • Inability to represent certain information
  3. Repetition of Information • Lending-schema = (branch-name, branch-city, assets, customer-name, loan-number, amount) • For a tuple t on this schema, t[assets] is the asset figure for the branch named t[branch-name]. • t[branch-city] is the city in which the branch named t[branch-name] is located.
  4. Cont… • t[loan-number] is the number assigned to a loan made by the branch named t[branch-name] to the customer named t[customer-name]. • t[amount] is the amount of the loan whose number is t[loan-number].
  5. Example of Repetition • In this example the schema also carries customer-city: (branch-name, branch-city, assets, customer-name, customer-city, loan-number, amount) • A sample tuple: (Perryridge, Horseneck, 1700000, Adams, Brooklyn, L-31, 1500)
  6. • Suppose that we wish to add a new loan to our database. Say that the loan is made by the Perryridge branch to Adams in the amount of $1500, and let the loan-number be L-31. In our design, we need a tuple with values for all the attributes of Lending-schema. • Thus, we must repeat the asset and city data for the Perryridge branch, as well as the customer-city for Adams.
  7. Branch-Name  Branch-City  Assets   Customer-Name  Customer-City  Loan-Number  Amount
     Perryridge   Horseneck    1700000  Adams          Brooklyn       L-31         1500
     Perryridge   Horseneck    1700000  Adams          Brooklyn       L-32         30000
     Perryridge   Horseneck    1700000  Adams          Brooklyn       L-33         2500
     Perryridge   Horseneck    1700000  Bob            Horseneck      L-39         4500
     Redwood      Palo Alto    2100000  Smith          Rye            L-23         2000
     Redwood      Palo Alto    2100000  Smith          Rye            L-52         3000
  8. • Repeating information wastes space. • Furthermore, it complicates updating the database. • Suppose, for example, that the assets of the Perryridge branch change from 1700000 to 1900000. • Then every tuple with branch-name Perryridge must be updated, or the relation will show two different asset values for the same branch.
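The update problem on this slide can be made concrete with a small sketch. It keeps the six sample tuples in a Python list (the names `lending` and `update_assets` are illustrative assumptions, not part of the slides) and counts how many tuples one change of assets has to touch.

```python
# Sample tuples from the slides, in the order:
# (branch-name, branch-city, assets, customer-name, customer-city, loan-number, amount)
lending = [
    ("Perryridge", "Horseneck", 1700000, "Adams", "Brooklyn", "L-31", 1500),
    ("Perryridge", "Horseneck", 1700000, "Adams", "Brooklyn", "L-32", 30000),
    ("Perryridge", "Horseneck", 1700000, "Adams", "Brooklyn", "L-33", 2500),
    ("Perryridge", "Horseneck", 1700000, "Bob", "Horseneck", "L-39", 4500),
    ("Redwood", "Palo Alto", 2100000, "Smith", "Rye", "L-23", 2000),
    ("Redwood", "Palo Alto", 2100000, "Smith", "Rye", "L-52", 3000),
]


def update_assets(rows, branch, new_assets):
    """Return (updated rows, number of tuples that had to be rewritten)."""
    updated, touched = [], 0
    for row in rows:
        if row[0] == branch:
            # The asset figure is stored once per loan, so each copy changes.
            row = row[:2] + (new_assets,) + row[3:]
            touched += 1
        updated.append(row)
    return updated, touched


new_rows, touched = update_assets(lending, "Perryridge", 1900000)
print(touched)  # 4: one logical change, four physical updates
```

Missing any one of the four tuples would leave the relation inconsistent, which is the complication the slide refers to.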
  9. Inability to Represent Information • Another problem with the Lending-schema design is that we cannot directly represent the information concerning a branch (branch-name, branch-city, assets) unless at least one loan exists at the branch. • One solution to this problem is to introduce null values.
  10. Functional Dependency • We know that a bank branch has a unique value of assets, so given a branch name we can uniquely identify the assets value. • In other words, we say that the functional dependency branch-name → assets holds.
  11. • The fact that a branch has a particular value of assets and the fact that a branch makes a loan are independent; these facts are best represented in separate relations (tables).
  12. Superkey • Let R be a relation schema. A subset K of R is a superkey of R if, in any legal relation r(R), for all pairs of tuples t1 and t2 in r, t1[K] = t2[K] implies t1 = t2. • That is, no two distinct tuples in any legal relation r(R) may have the same value on attribute set K.
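As a rough illustration of the superkey definition, the sketch below tests the condition on one relation instance. The function name `is_superkey_in` and the dictionary layout are assumptions made for this example. Note that a single instance can only refute a superkey: the definition quantifies over every legal relation, so a passing check is evidence, not proof.

```python
def is_superkey_in(rows, key):
    """True if no two distinct rows of this instance agree on all attributes in `key`."""
    seen = {}
    for row in rows:
        k = tuple(row[a] for a in key)
        if k in seen and seen[k] != row:
            return False  # two different tuples share the same K-value
        seen[k] = row
    return True


# Loan tuples from the slides (attribute names are illustrative).
loan = [
    {"loan_number": "L-31", "customer_name": "Adams", "branch_name": "Perryridge", "amount": 1500},
    {"loan_number": "L-32", "customer_name": "Adams", "branch_name": "Perryridge", "amount": 30000},
    {"loan_number": "L-39", "customer_name": "Bob", "branch_name": "Perryridge", "amount": 4500},
    {"loan_number": "L-23", "customer_name": "Smith", "branch_name": "Redwood", "amount": 2000},
]

print(is_superkey_in(loan, ["loan_number"]))                   # True
print(is_superkey_in(loan, ["customer_name", "branch_name"]))  # False
```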
  13. Back to Functional Dependencies • The notion of functional dependency generalizes the notion of superkey. • Consider a relation schema R, and let α ⊆ R and β ⊆ R. The functional dependency α → β holds on schema R if, in any legal relation r(R), for all pairs of tuples t1 and t2 in r such that t1[α] = t2[α], it is also the case that t1[β] = t2[β]. • K is a superkey of R exactly when K → R holds.
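The definition of α → β translates almost word for word into a check over one relation instance. This is a minimal sketch; `fd_holds_in` and the attribute names are assumptions for illustration, and, as with superkeys, an instance can show that a dependency fails but cannot prove that it holds on the schema.

```python
def fd_holds_in(rows, alpha, beta):
    """True if every pair of rows that agrees on `alpha` also agrees on `beta`."""
    seen = {}
    for row in rows:
        a = tuple(row[x] for x in alpha)
        b = tuple(row[x] for x in beta)
        # Remember the first beta-value seen for this alpha-value and compare.
        if seen.setdefault(a, b) != b:
            return False
    return True


COLUMNS = ("branch_name", "branch_city", "assets",
           "customer_name", "customer_city", "loan_number", "amount")
lending = [dict(zip(COLUMNS, t)) for t in [
    ("Perryridge", "Horseneck", 1700000, "Adams", "Brooklyn", "L-31", 1500),
    ("Perryridge", "Horseneck", 1700000, "Adams", "Brooklyn", "L-32", 30000),
    ("Perryridge", "Horseneck", 1700000, "Bob", "Horseneck", "L-39", 4500),
    ("Redwood", "Palo Alto", 2100000, "Smith", "Rye", "L-23", 2000),
]]

print(fd_holds_in(lending, ["branch_name"], ["assets"]))  # True
print(fd_holds_in(lending, ["branch_name"], ["amount"]))  # False
```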
  14. • Consider our original Lending-schema. The functional dependencies on it, grouped by the schema each group leads to, are:
      Branch Schema:    branch-name → branch-city
                        branch-name → assets
      Loan Schema:      loan-number → amount
                        loan-number → branch-name
                        loan-number → customer-name
      Customer Schema:  customer-name → customer-city
  15. Branch Schema
      Branch-Name  Branch-City  Assets
      Perryridge   Horseneck    1700000
      Redwood      Palo Alto    2100000
  16. Loan Schema
      Loan-Number  Customer-Name  Branch-Name  Amount
      L-31         Adams          Perryridge   1500
      L-32         Adams          Perryridge   30000
      L-33         Adams          Perryridge   2500
      L-39         Bob            Perryridge   4500
      L-23         Smith          Redwood      2000
      L-52         Smith          Redwood      3000
  17. Customer Schema
      Customer-Name  Customer-City
      Adams          Brooklyn
      Bob            Horseneck
      Smith          Rye
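A decomposition like the one in slides 15 to 17 is only useful if joining the pieces gives back exactly the original relation. The sketch below projects the sample lending tuples onto the three schemas and joins them again; `project`, `natural_join`, and `as_set` are helper names assumed for this example rather than anything defined in the slides.

```python
COLUMNS = ("branch_name", "branch_city", "assets",
           "customer_name", "customer_city", "loan_number", "amount")
lending = [dict(zip(COLUMNS, t)) for t in [
    ("Perryridge", "Horseneck", 1700000, "Adams", "Brooklyn", "L-31", 1500),
    ("Perryridge", "Horseneck", 1700000, "Adams", "Brooklyn", "L-32", 30000),
    ("Perryridge", "Horseneck", 1700000, "Adams", "Brooklyn", "L-33", 2500),
    ("Perryridge", "Horseneck", 1700000, "Bob", "Horseneck", "L-39", 4500),
    ("Redwood", "Palo Alto", 2100000, "Smith", "Rye", "L-23", 2000),
    ("Redwood", "Palo Alto", 2100000, "Smith", "Rye", "L-52", 3000),
]]


def project(rows, attrs):
    """Projection onto `attrs`, with duplicate tuples removed."""
    seen, out = set(), []
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out


def natural_join(r, s):
    """Natural join of two lists of dicts on their shared attribute names."""
    out = []
    for x in r:
        for y in s:
            if all(x[a] == y[a] for a in x.keys() & y.keys()):
                out.append({**x, **y})
    return out


def as_set(rows):
    """Order-independent form of a relation, for comparison."""
    return {tuple(sorted(row.items())) for row in rows}


branch = project(lending, ["branch_name", "branch_city", "assets"])
loan = project(lending, ["loan_number", "customer_name", "branch_name", "amount"])
customer = project(lending, ["customer_name", "customer_city"])

rejoined = natural_join(natural_join(loan, branch), customer)
print(len(branch), len(loan), len(customer))  # 2 6 3
print(as_set(rejoined) == as_set(lending))    # True
```

The branch data is now stored twice instead of six times, and the join recovers the original six tuples with nothing lost or added.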
  18. Closure of a Set of Functional Dependencies • Armstrong's rules: • Reflexivity rule - If α is a set of attributes and β ⊆ α, then α → β holds. • Augmentation rule - If α → β holds and γ is a set of attributes, then γα → γβ holds. • Transitivity rule - If α → β holds and β → γ holds, then α → γ holds.
  19. Rules Derived from Armstrong's Rules • Union rule. If α → β holds and α → γ holds, then α → βγ holds. • Decomposition rule. If α → βγ holds, then α → β holds and α → γ holds. • Pseudotransitivity rule. If α → β holds and γβ → δ holds, then αγ → δ holds.
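To see why these derived rules need no new axioms, here is one possible derivation of the union rule using only augmentation and transitivity. The step layout is an illustration, not taken from the slides.

```latex
\begin{align*}
&\alpha \to \beta         && \text{given} \\
&\alpha \to \alpha\beta   && \text{augmentation with } \alpha \text{ (since } \alpha\alpha = \alpha\text{)} \\
&\alpha \to \gamma        && \text{given} \\
&\alpha\beta \to \beta\gamma && \text{augmentation with } \beta \\
&\alpha \to \beta\gamma   && \text{transitivity of the second and fourth lines}
\end{align*}
```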
  20. Algorithm to compute F+ (the closure of F)
        F+ = F
        repeat
          for each functional dependency f in F+
            apply reflexivity and augmentation rules on f
            add the resulting functional dependencies to F+
          for each pair of functional dependencies f1 and f2 in F+
            if f1 and f2 can be combined using transitivity
              add the resulting functional dependency to F+
        until F+ does not change any further
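Materializing all of F+ this way grows exponentially with the number of attributes, so in practice a different technique is commonly used: computing the closure α+ of a single attribute set, relying on the fact that α → β is in F+ exactly when β ⊆ α+. The sketch below implements that attribute-closure shortcut, not the F+ algorithm on the slide; the function name and the representation of dependencies as `(lhs, rhs)` pairs of sets are assumptions.

```python
def attribute_closure(attrs, fds):
    """Return the set of attributes functionally determined by `attrs` under `fds`."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already determined, so is the right side.
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure


# Functional dependencies listed for Lending-schema, as (lhs, rhs) pairs.
F = [
    ({"branch_name"}, {"branch_city"}),
    ({"branch_name"}, {"assets"}),
    ({"loan_number"}, {"amount"}),
    ({"loan_number"}, {"branch_name"}),
    ({"loan_number"}, {"customer_name"}),
    ({"customer_name"}, {"customer_city"}),
]

print(sorted(attribute_closure({"branch_name"}, F)))
# ['assets', 'branch_city', 'branch_name']
print(len(attribute_closure({"loan_number"}, F)))  # 7: loan_number determines every attribute
```

Because the closure of loan-number covers all seven attributes, loan-number is a superkey of the extended Lending-schema.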
  21. Properties of Decomposition • Lossless-join decomposition • Dependency preservation • Decrease in repetition of information
  22. Boyce–Codd Normal Form • A relation schema R is in BCNF with respect to a set F of functional dependencies if, for all functional dependencies in F+ of the form α → β, where α ⊆ R and β ⊆ R, at least one of the following holds: • α → β is a trivial functional dependency (that is, β ⊆ α). • α is a superkey for schema R.
  23. • A database design is in BCNF if each member of the set of relation schemas that constitutes the design is in BCNF. • Branch Schema, Loan Schema and Customer Schema together form a BCNF decomposition of Lending-schema.
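The two BCNF conditions can be tested mechanically. A minimal sketch follows, under two assumptions: dependencies are `(lhs, rhs)` pairs of sets, and only the dependencies in the given set are examined, which is enough when testing a schema against its own F but not, in general, for a schema produced by decomposition. The helper names are illustrative.

```python
def attribute_closure(attrs, fds):
    """Attributes functionally determined by `attrs` under `fds`."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure


def bcnf_violations(schema, fds):
    """Return the FDs in `fds` that are nontrivial and whose left side is not a superkey."""
    schema = set(schema)
    bad = []
    for lhs, rhs in fds:
        trivial = rhs <= lhs
        superkey = schema <= attribute_closure(lhs, fds)
        if not trivial and not superkey:
            bad.append((lhs, rhs))
    return bad


F = [
    ({"branch_name"}, {"branch_city"}),
    ({"branch_name"}, {"assets"}),
    ({"loan_number"}, {"amount"}),
    ({"loan_number"}, {"branch_name"}),
    ({"loan_number"}, {"customer_name"}),
    ({"customer_name"}, {"customer_city"}),
]
LENDING = {"branch_name", "branch_city", "assets", "customer_name",
           "customer_city", "loan_number", "amount"}
BRANCH = {"branch_name", "branch_city", "assets"}

lending_bad = bcnf_violations(LENDING, F)
branch_bad = bcnf_violations(BRANCH, F[:2])  # only the Branch Schema dependencies
print(len(lending_bad))  # 3: both branch-name FDs and customer-name -> customer-city
print(branch_bad)        # []
```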
  24. BCNF Decomposition Algorithm
        result := {R};
        done := false;
        compute F+;
        while (not done) do
          if (there is a schema Ri in result that is not in BCNF)
            then begin
              let α → β be a nontrivial functional dependency that holds on Ri
                such that α → Ri is not in F+, and α ∩ β = ∅;
              result := (result − Ri) ∪ (Ri − β) ∪ (α, β);
            end
            else done := true;
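The loop above can be sketched in Python as follows. Instead of computing F+, this version looks for a violating α by taking attribute closures of subsets of each Ri, which is exponential and therefore only suitable for small schemas like the example. All function names are assumptions, and which valid decomposition comes out depends on the order in which violations are found.

```python
from itertools import combinations


def attribute_closure(attrs, fds):
    """Attributes functionally determined by `attrs` under `fds`."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure


def find_violation(schema, fds):
    """Return (alpha, beta) with alpha -> beta violating BCNF on `schema`, or None."""
    attrs = sorted(schema)
    for size in range(1, len(attrs)):
        for combo in combinations(attrs, size):
            alpha = frozenset(combo)
            plus = attribute_closure(alpha, fds)
            beta = (plus & schema) - alpha  # nontrivial part, disjoint from alpha
            if beta and not schema <= plus:  # alpha is not a superkey of schema
                return alpha, beta
    return None


def bcnf_decompose(schema, fds):
    """Split `schema` until no sub-schema has a BCNF violation."""
    result = [frozenset(schema)]
    done = False
    while not done:
        done = True
        for ri in result:
            violation = find_violation(ri, fds)
            if violation:
                alpha, beta = violation
                result.remove(ri)
                result.append(ri - beta)      # Ri - beta
                result.append(alpha | beta)   # (alpha, beta)
                done = False
                break
    return result


F = [
    ({"branch_name"}, {"branch_city"}),
    ({"branch_name"}, {"assets"}),
    ({"loan_number"}, {"amount"}),
    ({"loan_number"}, {"branch_name"}),
    ({"loan_number"}, {"customer_name"}),
    ({"customer_name"}, {"customer_city"}),
]
LENDING = {"branch_name", "branch_city", "assets", "customer_name",
           "customer_city", "loan_number", "amount"}

schemas = bcnf_decompose(LENDING, F)
for s in schemas:
    print(sorted(s))
```

On the example dependencies this yields three schemas that correspond to Branch Schema, Loan Schema and Customer Schema.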