Crack Smoking Data Models
A quick introduction to the theory and practice of database design in the real world.


Published in: Technology
  • 1. Crack-Smoking Data Models: Theory Meets Reality
  • 2. Rules of Normalization (Simplified)
    • Make sure every table has a primary key (best to have a single value)
    • 1NF - Put any repeating groups into their own table
      • Don’t have a column with a comma-separated list of values or multiple columns to represent the same value multiple times
    • 2NF/3NF - Put any information that is not dependent on the primary key in its own table
      • If you have a table of employees (id, position, salary, name) and their salary is fully determined by their position, move that into a separate table of position/salary.
    • 4NF/5NF - Make sure every table that joins three or more things really is a valid three-way (or higher) relationship, and doesn’t need to be separated further into smaller tables.
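The employee example from the 2NF/3NF bullet can be sketched with Python's built-in `sqlite3` module. This is only an illustration under assumptions: the table names, column types, and sample rows are invented here and are not from the slides.

```python
import sqlite3

def build_normalized_db():
    """Split employees(id, position, salary, name) so salary lives with position."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE positions (
            position TEXT PRIMARY KEY,
            salary   INTEGER NOT NULL      -- determined by position alone
        );
        CREATE TABLE employees (
            id       INTEGER PRIMARY KEY,
            name     TEXT NOT NULL,
            position TEXT NOT NULL REFERENCES positions(position)
        );
    """)
    # Hypothetical sample data.
    conn.executemany("INSERT INTO positions VALUES (?, ?)",
                     [("engineer", 100), ("manager", 120)])
    conn.executemany("INSERT INTO employees VALUES (?, ?, ?)",
                     [(1, "Ann", "engineer"), (2, "Bob", "engineer"), (3, "Cy", "manager")])
    return conn

def salaries(conn):
    """Return (name, salary) pairs by joining each employee to their position."""
    rows = conn.execute("""
        SELECT e.name, p.salary
        FROM employees e JOIN positions p ON p.position = e.position
        ORDER BY e.id
    """)
    return rows.fetchall()

conn = build_normalized_db()
# One UPDATE changes the salary for every engineer; no row can be forgotten.
conn.execute("UPDATE positions SET salary = 110 WHERE position = 'engineer'")
print(salaries(conn))
```

Because the salary is stored once per position, there is no way for two engineers to end up with different salaries by accident.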
  • 3. Rules of Normalization (Simplified)
    • ORNF - The data model contains only elemental facts
    • DKNF - The data model fully defines all constraints and is free from “update anomalies” (the data model prevents any logical inconsistencies)
      • Key - uniquely defines a tuple
      • Constraint - rule governing values of attributes
      • In DKNF keys fully define tuples, and constraints fully define logical relationships / allowed values, including multi-table constraints
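As a rough illustration of keys and constraints doing the work, the sketch below lets the database itself reject inconsistent rows. It uses Python's `sqlite3`; the schema and the helper `try_sql` are hypothetical, and real DKNF is a stronger property than these few constraints demonstrate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
    CREATE TABLE positions (
        position TEXT PRIMARY KEY,                         -- key: identifies the tuple
        salary   INTEGER NOT NULL CHECK (salary > 0)       -- constraint: allowed values
    );
    CREATE TABLE employees (
        id       INTEGER PRIMARY KEY,
        name     TEXT NOT NULL,
        position TEXT NOT NULL REFERENCES positions(position)  -- multi-table constraint
    );
""")

def try_sql(sql, params=()):
    """Run one statement in its own transaction; report whether the database accepted it."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

ok_position  = try_sql("INSERT INTO positions VALUES (?, ?)", ("engineer", 100))
bad_salary   = try_sql("INSERT INTO positions VALUES (?, ?)", ("intern", -5))
dup_key      = try_sql("INSERT INTO positions VALUES (?, ?)", ("engineer", 90))
bad_employee = try_sql("INSERT INTO employees VALUES (?, ?, ?)", (1, "Ann", "wizard"))
ok_employee  = try_sql("INSERT INTO employees VALUES (?, ?, ?)", (1, "Ann", "engineer"))
```

Only the first and last statements are accepted; the application never has to check these rules itself.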
  • 4. Rules of Normalization (Simplified)
    • JBNF - Don’t Smoke Crack While Doing Data Models!
    • JBNF2 - Don’t Do Data Models For Customers Who Smoke Crack!
      • Okay, I break this one a lot
      • But seriously, most business rules/decisions are not based on whether or not they contribute to a logical data model
      • And why should they be?
      • On the other hand, when making a data model, we need to allow for the flexibility that businesses require, so they don’t have to re-make the data model after every business decision
      • Nor should they have to think too much about the data model when making business decisions
  • 5. The Goal
    • Normalized Data Models
      • Everything can be managed via standard operations
    • No Redundant Data / No Calculated Columns
      • Strong locality -- don’t have to worry if you forgot to set something
    • Strict Constraints Enforced
      • Pushes the business logic back to the data model so it can be easily managed outside of the application logic - no “update anomalies”
    • Data is modeled as data - fewer text fields
      • All data can be managed and verified
  • 6. The Reality
    • No system can contain all data points
      • This by itself leads to conflicts with theory
      • Not all information is available
      • Summary columns that summarize data inside and outside the database have to be managed
      • Some business rules are too complicated/flexible to be modeled, and must be abbreviated with flags and/or text fields
  • 7. The Reality
    • System performance demands redundant data
      • Many summaries are too complicated to be recalculated each time
        • Some of this can be mitigated with functional indices
      • Some summaries may need to be altered based on data exterior to the database
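The functional index mentioned above can be sketched as follows. This assumes SQLite (3.9 or later, which supports indexes on expressions) through Python's `sqlite3`; the `order_lines` table and its data are made up for illustration, and other databases use different syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_lines (
        id         INTEGER PRIMARY KEY,
        quantity   INTEGER NOT NULL,
        unit_price INTEGER NOT NULL
    );
    -- Index on the computed value itself, so no redundant "total" column is stored.
    CREATE INDEX idx_line_total ON order_lines (quantity * unit_price);
""")
# Hypothetical sample rows; ids are assigned 1, 2, 3 in insertion order.
conn.executemany("INSERT INTO order_lines (quantity, unit_price) VALUES (?, ?)",
                 [(2, 10), (5, 30), (1, 200)])

def big_lines(minimum):
    """Return (id, total) for lines whose computed total is at least `minimum`.

    The WHERE clause repeats the indexed expression exactly, which is what
    allows the query planner to consider idx_line_total.
    """
    rows = conn.execute("""
        SELECT id, quantity * unit_price
        FROM order_lines
        WHERE quantity * unit_price >= ?
        ORDER BY id
    """, (minimum,))
    return rows.fetchall()

print(big_lines(100))
```

This only helps when the summary is a deterministic expression over a single row; summaries across tables, or ones that depend on data outside the database, still need stored columns.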
  • 8. The Reality
    • Some data is best stored de-normalized
      • Management Issues - Do we really need a table for that?
      • Performance Issues - Do we really want to query for that?
      • Time Issues - Do we really want to build the interface to manage that?
      • In some cases, maybe we can de-normalize our database to save some sanity.
  • 9. The Reality
    • Some data models look whacked, because the data they are modeling is whacked.
    • In an ideal world, we would encourage the customer to come up with a more consistent way to manage themselves.
    • But usually we just model what they have because it’s easier than changing 20 years of tradition and infrastructure
  • 10. Case Study - Homebuilder
    • Builder has several divisions; each division is responsible for an area (kind of - some are on top of each other)
    • Builder has several brands
    • Builder categorizes houses and communities by lifestyle
    • Builder also needs to track home plans and inventory
  • 11. JB’s Rules of Practicalization
    • 1PF - Design a well-normalized database (the level is up to you) which describes their data as you understand it.
    • 2PF - If the way that the customer talks about their data is inconsistent, develop a vocabulary to use when talking to them about their project which matches the data model. Be sure to clarify any unclear statements they make using the new vocabulary.
  • 12. JB’s Rules of Practicalization
    • 3PF - Determine which business rules are too fungible to be implemented well by the database, and instead make manual processes for dealing with exceptions using flags and text fields
    • 4PF - Determine if some values may have exceptions based on incomplete information in the database, and create user-maintainable columns or tables for them
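One way 3PF and 4PF might look in the homebuilder case study is sketched below with Python's `sqlite3`. Every table, column, and value here is an assumption made for illustration: a flag plus a free-text note handles the rules that are too fungible to model, and a user-maintainable override column handles values the database cannot derive correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE plans (
        id         INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        base_price INTEGER NOT NULL
    );
    CREATE TABLE homes (
        id             INTEGER PRIMARY KEY,
        plan_id        INTEGER NOT NULL REFERENCES plans(id),
        price_override INTEGER,                       -- 4PF: set by a person when the rule is wrong
        needs_review   INTEGER NOT NULL DEFAULT 0
                       CHECK (needs_review IN (0, 1)),  -- 3PF: flag for a manual process
        review_note    TEXT                           -- 3PF: free text explaining the exception
    );
""")
# Hypothetical sample data.
conn.execute("INSERT INTO plans VALUES (1, 'Aspen', 300000)")
conn.execute("INSERT INTO homes (id, plan_id) VALUES (1, 1)")
conn.execute("""
    INSERT INTO homes (id, plan_id, price_override, needs_review, review_note)
    VALUES (2, 1, 285000, 1, 'Model home, discounted by the division')
""")

def effective_prices():
    """Use the override when a user supplied one, otherwise the plan's base price."""
    rows = conn.execute("""
        SELECT h.id, COALESCE(h.price_override, p.base_price)
        FROM homes h JOIN plans p ON p.id = h.plan_id
        ORDER BY h.id
    """)
    return rows.fetchall()

def flagged_homes():
    """List the exceptions that someone has to handle by hand."""
    rows = conn.execute(
        "SELECT id, review_note FROM homes WHERE needs_review = 1 ORDER BY id")
    return rows.fetchall()
```

The normal rule stays in the data model, and the exception is visible and queryable instead of being hidden in application code.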
  • 13. JB’s Rules of Practicalization
    • 5PF - Rejigger your data model so that it matches your development platform nicely.
    • 6PF - Create calculated columns based on real or anticipated performance problems. Be sure there are application-level measures taken to keep these mostly consistent.
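A minimal sketch of 6PF, assuming a hypothetical `communities.home_count` column in the homebuilder example: the application updates the calculated column in the same transaction as the underlying data, and a separate repair pass catches whatever slipped past it. The function names are invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE communities (
        id         INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        home_count INTEGER NOT NULL DEFAULT 0   -- calculated column (redundant on purpose)
    );
    CREATE TABLE homes (
        id           INTEGER PRIMARY KEY,
        community_id INTEGER NOT NULL REFERENCES communities(id),
        plan         TEXT NOT NULL
    );
""")

def add_home(community_id, plan):
    """Application-level measure: insert the home and bump the count atomically."""
    with conn:  # commits both statements together, or rolls both back on error
        conn.execute("INSERT INTO homes (community_id, plan) VALUES (?, ?)",
                     (community_id, plan))
        conn.execute("UPDATE communities SET home_count = home_count + 1 WHERE id = ?",
                     (community_id,))

def repair_counts():
    """Recompute any count that has drifted; return how many rows were fixed."""
    with conn:
        cur = conn.execute("""
            UPDATE communities
            SET home_count = (SELECT COUNT(*) FROM homes
                              WHERE homes.community_id = communities.id)
            WHERE home_count != (SELECT COUNT(*) FROM homes
                                 WHERE homes.community_id = communities.id)
        """)
        return cur.rowcount

def home_count(community_id):
    """Read the cached count without touching the homes table."""
    row = conn.execute("SELECT home_count FROM communities WHERE id = ?",
                       (community_id,)).fetchone()
    return row[0]

with conn:
    conn.execute("INSERT INTO communities (id, name) VALUES (1, 'Lakeside')")
add_home(1, "Aspen")
add_home(1, "Birch")
```

The repair pass is what "mostly consistent" means in practice: any code path that bypasses `add_home` leaves the count stale until the next repair.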
    • 7PF - ?