How to Fake a Database Design

In evaluating developers, I routinely come across very talented developers with a decade or more of experience with databases who nonetheless can't design even the simplest of schemas. This presentation is based on my popular blog post of the same name: http://blogs.perl.org/users/ovid/2013/07/how-to-fake-database-design.html

Notes
  • Duplicate data means identity, not equality!
  • Any guesses as to what was in ingredient8?
  • Note that ‘address’ and ‘directions’ aren’t separate tables. Great point for discussion. (Suprêmes de volaille aux champignons === chicken parisienne)
  • FKs prevent crap data.
    How many of you have worked on databases with crap data?
    Well-designed databases can make it hard to add crap data.
  • Even if you *knew* you would never need more than 8 ingredients,
    what do you do when you find out that macaroni, barbecue, or fettuccine
    are routinely misspelled?
  • Transcript

    • 1. How to Fake a Database Design
        How do I spell “normalization”?
        OSCON 2014
        Curtis "Ovid" Poe, http://allaroundtheworld.fr/
        June 5, 2014. Copyright 2014, http://www.allaroundtheworld.fr/
    • 2. Good Database Schemas
        • Generally normalized
        • Denormalized only as necessary
        • No duplicate data
    • 3. Typical Developer Schemas
        • A steaming pile of ones and zeros
        • … with a “family friendly” background
        Source: http://commons.wikimedia.org/wiki/File:Spaghetti-prepared.jpg
    • 4. Database Normalization
        • Remove redundancy
        • Create logical relations
        • Decompose data into atomic elements
    • 5. Only Covering 3NF
        1. Remove repeating groups of data
        2. Remove partial key dependencies
        3. Remove data unrelated to key
    • 6. How to Feel Stupid
        “It is shown that if a relation schema is in third normal form and every key is simple, then it is in projection-join normal form (sometimes called fifth normal form), the ultimate normal form with respect to projections and joins.”
        Simple Conditions for Guaranteeing Higher Normal Forms in Relational Databases, C. J. Date and Ronald Fagin
        http://commons.wikimedia.org/wiki/File:%22I_should_have_gone_to_the_pro_station%22_-_NARA_-_514564.tif
    • 7. ‘Nuff of that – Let’s Get Started
        I’m going to discuss “how”, not “why”, because I only have 50 minutes.
    • 8. Faking a Database Design
        • Forget everything you know about Excel
        • Focus on nouns (sort of)
        • Duplicate data is a design flaw
    • 9. Real-World Problem
        • Client wanted a rewrite of recipes site
        • They sent us their Access (!) database
        • Main objects:
            – customers
            – recipes
            – orders
    • 10. Our “DBA” Said This Was OK
    • 11. Our “DBA” also lost his job shortly thereafter
    • 12. Back to the plot …
        • Customers
        • Orders
        • Recipes
    • 13. Nouns == Tables(*)
    • 14. Nouns == Tables(*)
    • 15. Rule #1
        1. Nouns == tables
    • 16. What’s with the customer_id?
    • 17. It’s a foreign key
        One-to-many relationship
    • 18. Our DDL (Data Definition Language)
        CREATE TABLE orders (
            order_id    SERIAL PRIMARY KEY,
            customer_id INTEGER NOT NULL,
            order_date  TIMESTAMP WITH TIME ZONE NOT NULL,
            FOREIGN KEY (customer_id) REFERENCES customer(customer_id)
        );
    • 19. Rule #2
        1. Nouns == tables
        2. Another table’s ID must have a FK constraint
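
A minimal sketch of what Rule #2 buys you, assuming SQLite through Python's standard `sqlite3` module rather than the PostgreSQL-style DDL on the slide (so `SERIAL` becomes `INTEGER PRIMARY KEY` and the timestamp is stored as text). The sample customer and the `insert_orphan_order` helper are made up for illustration. With the constraint in place, an order that points at a nonexistent customer is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: SQLite. It leaves foreign key enforcement off unless asked.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        order_date  TEXT NOT NULL,
        FOREIGN KEY (customer_id) REFERENCES customer(customer_id)
    );
""")

# One valid customer and one valid order (illustrative data).
conn.execute("INSERT INTO customer (customer_id, name) VALUES (1, 'Alice')")
conn.execute("INSERT INTO orders (customer_id, order_date) VALUES (1, '2014-06-05')")


def insert_orphan_order(connection):
    """Try to insert an order for a customer that does not exist."""
    try:
        connection.execute(
            "INSERT INTO orders (customer_id, order_date) VALUES (999, '2014-06-05')"
        )
    except sqlite3.IntegrityError:
        return False  # the FK constraint refused the row
    return True


orphan_accepted = insert_orphan_order(conn)
order_count = conn.execute("SELECT count(*) FROM orders").fetchone()[0]
```

Without the `FOREIGN KEY` clause the second insert would succeed quietly, which is how "crap data" gets in.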
    • 20. Oh dog, no!
    • 21. But “What if”?
        1. fettuccinne
        2. fettuchini
        3. fettucini
        4. fettucinne
        5. fetuchine
        6. fetuchinney
        7. fetuchinni
        8. fetucine
        9. fetucini
        10. fetucinni
        https://www.flickr.com/photos/ykjc9/3485366680/sizes/l
    • 22. Searching
        SELECT recipe_id, name
          FROM recipes
         WHERE ingredient1 IN ( 'fettuccinne', 'fettuchini', 'fettucini', 'fettucinne', 'fetuchine',
                                'fetuchinney', 'fetuchinni', 'fetucine', 'fetucini', 'fetucinni' )
            OR ingredient2 IN ( 'fettuccinne', 'fettuchini', 'fettucini', 'fettucinne', 'fetuchine',
                                'fetuchinney', 'fetuchinni', 'fetucine', 'fetucini', 'fetucinni' )
            OR ingredient3 IN ( 'fettuccinne', 'fettuchini', 'fettucini', 'fettucinne', 'fetuchine',
                                'fetuchinney', 'fetuchinni', 'fetucine', 'fetucini', 'fetucinni' )
            OR ingredient4 IN ( 'fettuccinne', 'fettuchini', 'fettucini', 'fettucinne', 'fetuchine',
                                'fetuchinney', 'fetuchinni', 'fetucine', 'fetucini', 'fetucinni' )
            OR ingredient5 IN ( 'fettuccinne', 'fettuchini', 'fettucini', 'fettucinne', 'fetuchine',
                                'fetuchinney', 'fetuchinni', 'fetucine', 'fetucini', 'fetucinni' )
            OR ingredient6 IN ( 'fettuccinne', 'fettuchini', 'fettucini', 'fettucinne', 'fetuchine',
                                'fetuchinney', 'fetuchinni', 'fetucine', 'fetucini', 'fetucinni' )
            OR ingredient7 IN ( 'fettuccinne', 'fettuchini', 'fettucini', 'fettucinne', 'fetuchine',
                                'fetuchinney', 'fetuchinni', 'fetucine', 'fetucini', 'fetucinni' )
            OR ingredient8 IN ( 'fettuccinne', 'fettuchini', 'fettucini', 'fettucinne', 'fetuchine',
                                'fetuchinney', 'fetuchinni', 'fetucine', 'fetucini', 'fetucinni' );
    • 23. It’s “fettuccine”, in case you were wondering
    • 24. Searching
        SELECT recipe_id, name
          FROM recipes
         WHERE ingredient1 = 'fettuccine'
            OR ingredient2 = 'fettuccine'
            OR ingredient3 = 'fettuccine'
            OR ingredient4 = 'fettuccine'
            OR ingredient5 = 'fettuccine'
            OR ingredient6 = 'fettuccine'
            OR ingredient7 = 'fettuccine'
            OR ingredient8 = 'fettuccine';
    • 25. Ingredients Table
    • 26. Rule #3
        1. Nouns == tables
        2. Another table’s ID must have a FK constraint
        3. Lists of things get their own table
    • 27. Lookup Table
        Many-to-many relationship
    • 28. Searching
        SELECT r.recipe_id, r.name
          FROM recipes r
          JOIN recipes_ingredients ri ON ri.recipe_id    = r.recipe_id
          JOIN ingredients i          ON i.ingredient_id = ri.ingredient_id
         WHERE i.name = 'fettuccine';
    • 29. Our DDL (Data Definition Language)
        CREATE TABLE recipes_ingredients (
            recipe_ingredient_id SERIAL PRIMARY KEY,
            recipe_id            INTEGER NOT NULL,
            ingredient_id        INTEGER NOT NULL,
            UNIQUE(recipe_id, ingredient_id),
            FOREIGN KEY (recipe_id)     REFERENCES recipes(recipe_id),
            FOREIGN KEY (ingredient_id) REFERENCES ingredients(ingredient_id)
        );
    • 30. Our DDL (Data Definition Language)
        CREATE TABLE recipes_ingredients (
            recipe_id     INTEGER NOT NULL,
            ingredient_id INTEGER NOT NULL,
            PRIMARY KEY (recipe_id, ingredient_id),
            FOREIGN KEY (recipe_id)     REFERENCES recipes(recipe_id),
            FOREIGN KEY (ingredient_id) REFERENCES ingredients(ingredient_id)
        );
    • 31. Rule #4
        1. Nouns == tables
        2. Another table’s ID must have a FK constraint
        3. Lists of things get their own table
        4. Many-to-many == lookup table (with FKs)
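
Rules #3 and #4 together can be sketched end to end. This assumes SQLite via Python's `sqlite3` module, and the recipe and ingredient rows are invented for illustration. The `UNIQUE` constraint on `ingredients.name` is also an assumption, not something shown on the slides. The point is that each ingredient is spelled once, in one row, and the search becomes a single equality test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # assumption: SQLite needs this to enforce FKs
conn.executescript("""
    CREATE TABLE recipes (
        recipe_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    CREATE TABLE ingredients (
        ingredient_id INTEGER PRIMARY KEY,
        name          TEXT NOT NULL UNIQUE   -- assumption: one row per spelling
    );
    -- Lookup table for the many-to-many relationship (composite-key variant).
    CREATE TABLE recipes_ingredients (
        recipe_id     INTEGER NOT NULL,
        ingredient_id INTEGER NOT NULL,
        PRIMARY KEY (recipe_id, ingredient_id),
        FOREIGN KEY (recipe_id)     REFERENCES recipes(recipe_id),
        FOREIGN KEY (ingredient_id) REFERENCES ingredients(ingredient_id)
    );
    INSERT INTO recipes VALUES (1, 'Fettuccine Alfredo'), (2, 'Chicken Parisienne');
    INSERT INTO ingredients VALUES (1, 'fettuccine'), (2, 'chicken'), (3, 'cream');
    INSERT INTO recipes_ingredients VALUES (1, 1), (1, 3), (2, 2), (2, 3);
""")

# Same shape as the slide's search query, with the ingredient as a parameter.
recipes_with_fettuccine = conn.execute("""
    SELECT r.recipe_id, r.name
      FROM recipes r
      JOIN recipes_ingredients ri ON ri.recipe_id    = r.recipe_id
      JOIN ingredients i          ON i.ingredient_id = ri.ingredient_id
     WHERE i.name = ?
""", ("fettuccine",)).fetchall()
```

Fixing a misspelled ingredient is then one `UPDATE` on one row of `ingredients`, instead of a hunt through eight columns.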
    • 32. So How Do We Order Recipes?
    • 33. Orders With Recipes
    • 34. How Many of Which Ingredient?
    • 35. Our simple “customers”, “orders”, and “recipes” database has grown to seven tables. And it will keep growing.
    • 36. So Far
        • Every noun has its own table (*)
        • Lookup tables join related tables
        • And generally have some kind of unique constraint
        • Other tables’ IDs have foreign key constraints
    • 37. Database Tips
        • We’ve covered the main rules
        • They only cover structure
        • Now to dive deeper
    • 38. Equality ≠ Identity
        • No duplication == not duplicating identity
        • Are identical twins the same person?
        • Are two guys named “John” the same guy?
        • This is important and easy to get wrong
        • For example …
    • 39. How do you get the total of an order?
        • Assume each recipe has a price
        • Store total in the order? (hint: no)
        • Store price on the recipe? (hint: yes)
        • Is that enough?
    • 40. Orders Total
    • 41. Calculating the Order Total?
        SELECT o.order_id, sum(r.price)
          FROM orders o
          JOIN orders_recipes orr ON orr.order_id = o.order_id
          JOIN recipes r          ON r.recipe_id  = orr.recipe_id
         GROUP BY o.order_id
    • 42. What if the price changes?
    • 43. Orders Total
    • 44. Calculating the Order Total
        SELECT o.order_id, sum(orr.price)
          FROM orders o
          JOIN orders_recipes orr ON orr.order_id = o.order_id
         GROUP BY o.order_id
    • 45. Equality is not Identity
        • Order item price isn’t item price
        • What if the item price changes?
        • What if you give a discount on the order item?
        • A subtle, but common bug
    • 46. Rule #5
        1. Nouns == tables
        2. Another table’s ID must have a FK constraint
        3. Lists of things get their own table
        4. Many-to-many == lookup table (with FKs)
        5. Watch for equal values that aren’t identical
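
To make Rule #5 concrete, here is a small sketch, again assuming SQLite via Python's `sqlite3`. Prices are stored as integer cents, and the sample rows and the two helper functions are hypothetical. The order row copies the price at purchase time, so a later price change on the recipe does not rewrite history.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE recipes (
        recipe_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        price     INTEGER NOT NULL           -- current price, in cents
    );
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY);
    CREATE TABLE orders_recipes (
        order_id  INTEGER NOT NULL REFERENCES orders(order_id),
        recipe_id INTEGER NOT NULL REFERENCES recipes(recipe_id),
        price     INTEGER NOT NULL,          -- price charged when ordered
        PRIMARY KEY (order_id, recipe_id)
    );
    INSERT INTO recipes VALUES (1, 'Fettuccine Alfredo', 1000);
    INSERT INTO orders VALUES (1);
    -- Copy the price into the order line at purchase time.
    INSERT INTO orders_recipes (order_id, recipe_id, price)
        SELECT 1, recipe_id, price FROM recipes WHERE recipe_id = 1;
    -- Later, the recipe gets more expensive.
    UPDATE recipes SET price = 1500 WHERE recipe_id = 1;
""")


def total_from_recipe_price(connection, order_id):
    """Order total computed from the recipe's current price (drifts over time)."""
    return connection.execute("""
        SELECT sum(r.price)
          FROM orders_recipes orr
          JOIN recipes r ON r.recipe_id = orr.recipe_id
         WHERE orr.order_id = ?
    """, (order_id,)).fetchone()[0]


def total_from_order_price(connection, order_id):
    """Order total computed from the price stored on the order line."""
    return connection.execute(
        "SELECT sum(price) FROM orders_recipes WHERE order_id = ?", (order_id,)
    ).fetchone()[0]


drifted_total = total_from_recipe_price(conn, 1)
charged_total = total_from_order_price(conn, 1)
```

The two prices were equal on the day of the order, but they were never the same fact: one is what the recipe costs now, the other is what the customer paid.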
    • 47. Naming
        • Names are important
        • Identical columns should have identical names
        • Names should hint at use
    • 48. Bad Naming
        SELECT name, 'too cold'
          FROM areas
         WHERE temperature < 32;
    • 49. ID Names
        orders.order_id versus orders.id
    • 50. ID Names
        SELECT o.id, sum(r.price)
          FROM orders o
          JOIN orders_recipes orr ON orr.order_id = o.id
          JOIN recipes r          ON r.id = o.id
         GROUP BY o.id
    • 51. ID Names
        SELECT o.id, sum(r.price)
          FROM orders o
          JOIN orders_recipes orr ON orr.order_id = o.id
          JOIN recipes r          ON r.id = o.id   -- bug: compares a recipe ID to an order ID
         GROUP BY o.id
    • 52. Conceptually Similar to …
        SELECT name
          FROM customer
         WHERE id > weight;
    • 53. ID Names
        SELECT thread.*
          FROM email thread
          JOIN email selected      ON selected.id    = thread.id
          JOIN character recipient ON recipient.id   = thread.recipient_id
          JOIN station_area sa     ON sa.id          = recipient.id
          JOIN station st          ON st.id          = sa.id
          JOIN star origin         ON origin.id      = thread.id
          JOIN star destination    ON destination.id = st.id
          LEFT JOIN route ON ( route.from_id = origin.id
                               AND route.to_id = destination.id )
         WHERE selected.id = ?
           AND ( thread.sender_id = ?
                 OR ( thread.recipient_id = ?
                      AND ( origin.id = destination.id
                            OR ( route.distance IS NOT NULL
                                 AND now() >= thread.datesent
                                     + ( route.distance * interval '30 seconds' ) ))))
         ORDER BY datesent ASC, thread.parent_id ASC NULLS FIRST
    • 54. Rule #6
        1. Nouns == tables
        2. Another table’s ID must have a FK constraint
        3. Lists of things get their own table
        4. Many-to-many == lookup table (with FKs)
        5. Watch for equal values that aren’t identical
        6. Name columns as descriptively as possible
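
One practical payoff of giving identical columns identical names, sketched here under the assumption of SQLite via Python's `sqlite3` (the `USING` clause is standard SQL, but the slides do not show it, and the sample rows are invented): a join can name the shared column once, so there is no way to pair an order ID with a recipe ID by accident.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY);
    CREATE TABLE recipes (
        recipe_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    CREATE TABLE orders_recipes (
        order_id  INTEGER NOT NULL REFERENCES orders(order_id),
        recipe_id INTEGER NOT NULL REFERENCES recipes(recipe_id),
        price     INTEGER NOT NULL,          -- cents charged for this line
        PRIMARY KEY (order_id, recipe_id)
    );
    INSERT INTO orders VALUES (1), (2);
    INSERT INTO recipes VALUES (1, 'Fettuccine Alfredo'), (2, 'Chicken Parisienne');
    INSERT INTO orders_recipes VALUES (1, 1, 1000), (1, 2, 500), (2, 2, 500);
""")

# USING joins on a column that has the same name on both sides,
# which only works because the ID columns are named descriptively.
order_totals = conn.execute("""
    SELECT order_id, sum(orr.price)
      FROM orders
      JOIN orders_recipes orr USING (order_id)
      JOIN recipes r          USING (recipe_id)
     GROUP BY order_id
     ORDER BY order_id
""").fetchall()
```

With bare `id` columns this form is not available, and every join condition has to be checked by eye.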
    • 55. Summary
        • Nouns == tables (*)
        • FK constraints
        • Proper naming is important
        • Your DBAs will thank you
        • Your apps will be more robust
    • 56. ? http://www.slideshare.net/ovid/
    • 57. Bonus Slides!
        Super-duper important stuff I wasn’t sure I had time to cover because it’s going to make your head hurt.
    • 58. Avoid NULL Values
        • Every column should have a type
        • NULLs, by definition, are unknown values
        • Thus, their type is unknown
        • But … every column should have a type?
    • 59. Our employees Table
        CREATE TABLE employees (
            employee_id SERIAL PRIMARY KEY,
            name        CHARACTER VARYING(255) NOT NULL,
            salary      MONEY NULL
        );
    • 60. Giving Bonuses
        • $1,000 bonus to all employees
        • … if they make less than $40,000/year
    • 61. Get Employees For Bonus
        SELECT employee_id, name
          FROM employees
         WHERE salary < 40000;
    • 62. Bad SQL
        • Won’t return anyone with a NULL salary
        • Why is the salary NULL?
            – What if it’s confidential?
            – What if they’re a contractor and in that table?
            – What if they’re an unpaid slave intern?
            – What if it’s unknown when the data was entered?
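
The bonus query's blind spot is easy to reproduce. This sketch assumes SQLite via Python's `sqlite3`, with `MONEY` replaced by a plain integer and three made-up employees. The employee with a NULL salary shows up neither in the "under $40,000" list nor in its negation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        salary      INTEGER NULL             -- assumption: whole dollars
    );
    INSERT INTO employees (name, salary) VALUES
        ('Alice', 35000),
        ('Bob',   60000),
        ('Carol', NULL);
""")

# Who gets the bonus?
gets_bonus = [name for (name,) in conn.execute(
    "SELECT name FROM employees WHERE salary < 40000 ORDER BY name")]

# Who does not? Negating the condition still skips the NULL row,
# because NOT (NULL < 40000) is NULL, not true.
no_bonus = [name for (name,) in conn.execute(
    "SELECT name FROM employees WHERE NOT (salary < 40000) ORDER BY name")]
```

Carol falls through both queries, and nothing in the result tells you she was ever in the table.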
    • 63. NULLs tell you nothing
        suppliers table
            supplier_id | city
            s1          | ‘London’
        parts table
            part_id | city
            p1      | NULL
        Example via “Database In Depth” by C.J. Date
    • 64. NULLs tell you nothing
        parts table
            part_id | city
            p1      | NULL
        Example via “Database In Depth” by C.J. Date
        SELECT part_id FROM parts;
        SELECT part_id FROM parts WHERE city = city;
    • 65. NULLs tell you nothing
            supplier_id | city
            s1          | ‘London’
            part_id | city
            p1      | NULL
        Example via “Database In Depth” by C.J. Date
        SELECT s.supplier_id, p.part_id
          FROM suppliers s, parts p
         WHERE p.city <> s.city     -- can’t compare NULL
            OR p.city <> 'Paris';   -- can’t compare NULL
    • 66. NULLs tell you lies
        Example via “Database In Depth” by C.J. Date
        SELECT s.supplier_id, p.part_id
          FROM suppliers s, parts p
         WHERE p.city <> s.city     -- can’t compare NULL
            OR p.city <> 'Paris';   -- can’t compare NULL
        • We get no rows because we can’t compare a NULL city
        • The unknown city is Paris or it isn't.
        • If it’s Paris, the first condition is true
        • If it’s not Paris, the second condition is true
        • Thus, the WHERE clause must be true, but it’s not
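
The suppliers and parts example can be run as written. This sketch assumes SQLite via Python's `sqlite3`, with column types that are guesses since the slides only show the data. All three queries come from the slides above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (supplier_id TEXT PRIMARY KEY, city TEXT);
    CREATE TABLE parts     (part_id     TEXT PRIMARY KEY, city TEXT);
    INSERT INTO suppliers VALUES ('s1', 'London');
    INSERT INTO parts     VALUES ('p1', NULL);
""")

# The part exists ...
all_parts = conn.execute("SELECT part_id FROM parts").fetchall()

# ... but its city is not even equal to itself.
self_equal_parts = conn.execute(
    "SELECT part_id FROM parts WHERE city = city").fetchall()

# Logically one of the two conditions must hold for any real city,
# yet both comparisons evaluate to NULL and the row is dropped.
supplier_part_pairs = conn.execute("""
    SELECT s.supplier_id, p.part_id
      FROM suppliers s, parts p
     WHERE p.city <> s.city
        OR p.city <> 'Paris'
""").fetchall()
```

Both filtered queries come back empty even though the unfiltered one returns `p1`, which is the "lie" the slide title refers to.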
    • 67. Rule #7
        1. Nouns == tables
        2. Another table’s ID must have an FK constraint
        3. Lists of things get their own table
        4. Many-to-many == lookup table (with FKs)
        5. Watch for equal values that aren’t identical
        6. Name columns as descriptively as possible
        7. Avoid NULL columns like the plague
