Crack Smoking Data Models

A quick introduction to the theory and practice of database design in the real world.

Published in: Technology

  1. Crack-Smoking Data Models - Theory Meets Reality
  2. Rules of Normalization (Simplified)
      • Make sure every table has a primary key (best to have a single value)
      • 1NF - Put any repeating groups into their own table
          • Don’t have a column with a comma-separated list of values, or multiple columns that represent the same kind of value multiple times
      • 2NF/3NF - Put any information that does not depend directly on the primary key in its own table
          • If you have a table of employees (id, position, salary, name) and their salary is fully determined by their position, move that into a separate table of position/salary.
      • 4NF/5NF - Make sure every table that relates three or more things really represents a single three-way (or higher) relationship, and doesn’t need further separation into smaller tables.
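The employee/position example on this slide can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names (`positions`, `employees`), the column names, and the sample rows are assumptions made up for the sketch, not something from the deck.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

# Salary is determined by position, so it lives with the position,
# not with each employee (hypothetical schema for illustration).
conn.executescript("""
    CREATE TABLE positions (
        id     INTEGER PRIMARY KEY,
        title  TEXT NOT NULL UNIQUE,
        salary INTEGER NOT NULL
    );
    CREATE TABLE employees (
        id          INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        position_id INTEGER NOT NULL REFERENCES positions(id)
    );
""")
conn.executemany(
    "INSERT INTO positions (id, title, salary) VALUES (?, ?, ?)",
    [(1, "Analyst", 50000), (2, "Manager", 80000)],
)
conn.executemany(
    "INSERT INTO employees (name, position_id) VALUES (?, ?)",
    [("Ann", 1), ("Bob", 1), ("Cy", 2)],
)


def salaries(conn):
    """Return {employee name: salary}, looked up through the position."""
    rows = conn.execute("""
        SELECT e.name, p.salary
        FROM employees e
        JOIN positions p ON p.id = e.position_id
    """).fetchall()
    return dict(rows)


# A raise for every analyst is a single-row update; no employee row changes.
conn.execute("UPDATE positions SET salary = 55000 WHERE title = 'Analyst'")
```

With the salary stored once per position, there is no way for two analysts to end up with different salaries by accident, which is the kind of update anomaly the rule is meant to prevent.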
  3. Rules of Normalization (Simplified)
      • ORNF - The data model contains only elemental facts
      • DKNF - The data model fully defines all constraints and is free from “update anomalies” (the data model prevents any logical inconsistencies)
          • Key - uniquely defines a tuple
          • Constraint - rule governing values of attributes
          • In DKNF keys fully define tuples, and constraints fully define logical relationships / allowed values, including multi-table constraints
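As a rough illustration of keys and constraints doing the work, here is a minimal sqlite3 sketch in which the database itself rejects inconsistent rows. It is not a full DKNF design; the `divisions` and `homes` tables, their columns, and the allowed ranges are assumptions invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for the multi-table rule below

conn.executescript("""
    CREATE TABLE divisions (
        id   INTEGER PRIMARY KEY,          -- key: identifies the tuple
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE homes (
        id          INTEGER PRIMARY KEY,
        -- multi-table constraint: a home must belong to a real division
        division_id INTEGER NOT NULL REFERENCES divisions(id),
        -- single-column constraints: allowed values (hypothetical ranges)
        bedrooms    INTEGER NOT NULL CHECK (bedrooms BETWEEN 1 AND 10),
        price       INTEGER NOT NULL CHECK (price > 0)
    );
    INSERT INTO divisions (id, name) VALUES (1, 'North');
""")


def try_insert(conn, division_id, bedrooms, price):
    """Insert a home; return False if the database rejects it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "INSERT INTO homes (division_id, bedrooms, price) VALUES (?, ?, ?)",
                (division_id, bedrooms, price),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

For instance, `try_insert(conn, 1, 3, 250000)` is accepted, while a home pointing at a nonexistent division or with zero bedrooms is refused, no matter which application tried to write it.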
  4. Rules of Normalization (Simplified)
      • JBNF - Don’t Smoke Crack While Doing Data Models!
      • JBNF2 - Don’t Do Data Models For Customers Who Smoke Crack!
          • Okay, I break this one a lot
          • But seriously, most business rules/decisions are not based on whether or not they contribute to a logical data model
          • And why should they be?
          • On the other end, when making a data model, we need to allow for the flexibility that businesses require, so they don’t have to re-make the data model after every business decision
          • Nor should they have to think too much about the data model when making business decisions
  5. The Goal
      • Normalized Data Models
          • Everything can be managed via standard operations
      • No Redundant Data / No Calculated Columns
          • Strong locality - don’t have to worry if you forgot to set something
      • Strict Constraints Enforced
          • Pushes the business logic back to the data model so it can be easily managed outside of the application logic - no “update anomalies”
      • Data is modeled as data - fewer text fields
          • All data can be managed and verified
  6. The Reality
      • No system can contain all data points
          • This by itself leads to conflicts with theory
          • Not all information is available
          • Summary columns that summarize data both inside and outside the database have to be managed
          • Some business rules are too complicated/flexible to be modeled, and must be abbreviated with flags and/or text fields
  7. The Reality
      • System performance demands redundant data
          • Many summaries are too complicated to be recalculated each time
              • Some of this can be mitigated with functional indices
          • Some summaries may need to be altered based on data exterior to the database
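One way to read the functional-index bullet: instead of storing a derived value in its own column, index the expression and let the database maintain it. Below is a minimal sketch using SQLite's indexes on expressions through Python's sqlite3 module. The `homes` table, the price-per-square-foot expression, and the index name are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE homes (
        id          INTEGER PRIMARY KEY,
        price       INTEGER NOT NULL,
        square_feet INTEGER NOT NULL CHECK (square_feet > 0)
    );
    -- Index the derived value rather than storing it as a redundant column.
    CREATE INDEX idx_price_per_sqft ON homes (price / square_feet);
""")
conn.executemany(
    "INSERT INTO homes (id, price, square_feet) VALUES (?, ?, ?)",
    [(1, 200000, 2000), (2, 450000, 2500), (3, 300000, 1500)],
)

# The WHERE clause repeats the indexed expression exactly as written in
# CREATE INDEX, which is what lets the planner consider the index.
QUERY = "SELECT id FROM homes WHERE price / square_feet < ? ORDER BY id"


def homes_cheaper_than(conn, max_price_per_sqft):
    """Ids of homes whose (integer) price per square foot is below the limit."""
    return [row[0] for row in conn.execute(QUERY, (max_price_per_sqft,))]


def query_plan(conn, max_price_per_sqft):
    """Human-readable plan steps, to check whether the index is picked up."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (max_price_per_sqft,))
    return [row[3] for row in rows]
```

Whether the planner actually uses `idx_price_per_sqft` depends on the SQLite version and its cost estimates, so it is worth printing `query_plan(conn, 190)` rather than assuming. This only covers summaries that can be computed from a single row; summaries across rows or from outside the database still need the redundant columns the slide describes.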
  8. The Reality
      • Some data is best stored de-normalized
          • Management Issues - Do we really need a table for that?
          • Performance Issues - Do we really want to query for that?
          • Time Issues - Do we really want to build the interface to manage that?
          • In some cases, maybe we can de-normalize our database to save some sanity.
  9. The Reality
      • Some data models look whacked, because the data they are modeling is whacked.
      • In an ideal world, we would encourage the customer to come up with a more consistent way to manage themselves.
      • But usually we just model what they have, because it’s easier than changing 20 years of tradition and infrastructure.
  10. Case Study - Homebuilder
      • Builder has several divisions, each division is responsible for an area (kind of - some are on top of each other)
      • Builder has several brands
      • Builder categorizes houses and communities by lifestyle
      • Builder also needs to track home plans and inventory
  11. JB’s Rules of Practicalization
      • 1PF - Design a well-normalized database (the level is up to you) which describes their data as you understand it.
      • 2PF - If the way that the customer talks about their data is inconsistent, develop a vocabulary that matches the data model and use it when talking to them about their project. Be sure to clarify any unclear statements they make using the new vocabulary.
  12. JB’s Rules of Practicalization
      • 3PF - Determine which business rules are too fungible to be implemented well by the database, and instead make manual processes for dealing with exceptions using flags and text fields
      • 4PF - Determine if some values may have exceptions based on incomplete information in the database, and create user-maintainable columns or tables for them
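A small sketch of what 3PF and 4PF might look like in a schema, again with sqlite3. Everything here is hypothetical: the `plans` and `homes` tables, the `price_override` column, the review flag, and the note text are invented to illustrate the idea of a user-maintainable exception next to a normally derived value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE plans (
        id         INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        base_price INTEGER NOT NULL
    );
    CREATE TABLE homes (
        id                  INTEGER PRIMARY KEY,
        plan_id             INTEGER NOT NULL REFERENCES plans(id),
        -- 4PF: user-maintainable exception to the normally derived price
        price_override      INTEGER,
        -- 3PF: flag plus free text for rules handled by a manual process
        needs_manual_review INTEGER NOT NULL DEFAULT 0
                            CHECK (needs_manual_review IN (0, 1)),
        exception_note      TEXT
    );
    INSERT INTO plans (id, name, base_price) VALUES (1, 'Plan A', 300000);
    INSERT INTO homes (id, plan_id) VALUES (1, 1);
    INSERT INTO homes (id, plan_id, price_override, needs_manual_review, exception_note)
    VALUES (2, 1, 285000, 1, 'Discount agreed outside the system');
""")


def effective_price(conn, home_id):
    """Use the override when a user has set one, else the plan's base price."""
    row = conn.execute("""
        SELECT COALESCE(h.price_override, p.base_price)
        FROM homes h
        JOIN plans p ON p.id = h.plan_id
        WHERE h.id = ?
    """, (home_id,)).fetchone()
    return None if row is None else row[0]
```

The design choice is that the normal rule stays in the model (price comes from the plan), while the exception is an explicit, nullable column that a person maintains, with the flag and note recording why.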
  13. JB’s Rules of Practicalization
      • 5PF - Rejigger your data model so that it matches your development platform nicely.
      • 6PF - Create calculated columns based on real or anticipated performance problems. Be sure there are application-level measures taken to keep these mostly consistent.
      • 7PF - ?
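For 6PF, one possible set of application-level measures is: always write through a helper that updates the calculated column in the same transaction, and run a periodic check that finds and repairs drift. The sketch below assumes a hypothetical `communities.home_count` column caching the number of rows in `homes`; the names and the repair strategy are illustrative, not taken from the deck.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE communities (
        id         INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        home_count INTEGER NOT NULL DEFAULT 0   -- calculated (redundant) column
    );
    CREATE TABLE homes (
        id           INTEGER PRIMARY KEY,
        community_id INTEGER NOT NULL REFERENCES communities(id)
    );
    INSERT INTO communities (id, name) VALUES (1, 'Lakeside');
""")


def add_home(conn, community_id):
    """Insert a home and bump the cached count in one transaction."""
    with conn:
        conn.execute("INSERT INTO homes (community_id) VALUES (?)", (community_id,))
        conn.execute(
            "UPDATE communities SET home_count = home_count + 1 WHERE id = ?",
            (community_id,),
        )


def find_drift(conn):
    """Return (community id, cached count, real count) where they disagree."""
    return conn.execute("""
        SELECT c.id, c.home_count, COUNT(h.id)
        FROM communities c
        LEFT JOIN homes h ON h.community_id = c.id
        GROUP BY c.id, c.home_count
        HAVING c.home_count <> COUNT(h.id)
        ORDER BY c.id
    """).fetchall()


def repair(conn):
    """Recompute every cached count from the real rows."""
    with conn:
        conn.execute("""
            UPDATE communities
            SET home_count = (SELECT COUNT(*) FROM homes
                              WHERE homes.community_id = communities.id)
        """)


add_home(conn, 1)
add_home(conn, 1)
```

If some other code path inserts into `homes` directly, `find_drift(conn)` reports the mismatch and `repair(conn)` fixes it, which is roughly what "mostly consistent" means in practice.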
