Elegant and Efficient Database Design

A presentation that my colleague Tim Allen and I gave to Wharton Computing. We cover RDBMS fundamentals such as design and indexing.

Published in: Technology

Transcript of "Elegant and Efficient Database Design"

  1. Elegant & Efficient Database Design
     Tim Allen, Becky Sweger
  2.
  3. Naming Conventions
     Joshua, Jana, John-David, Jill, Jessa, Jinger, Josiah, Joy-Anna, Jedidiah, Jeremiah, Jason, James, Justin, Jackson, Johannah, Jennifer
  18. Naming Conventions
     • Pick one and enforce it: camelCase, PascalCase, under_scores
     • Why?
         • Cleaner code
         • Logical joins
         • Sanity of future developers and future you
     • Explicitly name constraints
     • Avoid keywords as column names
     • WRDS Fail: group, client, school, subscriber, and institution are all used for the same data entity
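The naming points above can be sketched in T-SQL. This is a minimal illustration, not the presenters' schema: the tables `dbo.subscriber` and `dbo.subscriber_user` and all of their columns are hypothetical. It picks one convention (under_scores), uses one name (`subscriber`) for the entity everywhere, names every constraint explicitly, and avoids the reserved word `group` as a column name.

```sql
-- Hypothetical tables: one convention (under_scores), one name per entity.
CREATE TABLE dbo.subscriber (
    subscriber_id   int          NOT NULL
        CONSTRAINT pk_subscriber PRIMARY KEY,
    subscriber_name varchar(100) NOT NULL
);

CREATE TABLE dbo.subscriber_user (
    subscriber_user_id int NOT NULL
        CONSTRAINT pk_subscriber_user PRIMARY KEY,
    -- Same column name as in the parent table, so the join reads naturally.
    subscriber_id      int NOT NULL
        CONSTRAINT fk_subscriber_user_subscriber
        FOREIGN KEY REFERENCES dbo.subscriber (subscriber_id),
    -- "group" is a reserved keyword; a descriptive name avoids bracket quoting.
    user_group         varchar(50) NOT NULL
);

-- The shared key name makes the join condition obvious.
SELECT s.subscriber_name, u.user_group
FROM dbo.subscriber AS s
JOIN dbo.subscriber_user AS u
    ON u.subscriber_id = s.subscriber_id;
```

Explicit constraint names such as `fk_subscriber_user_subscriber` show up in error messages and migration scripts, which is easier to work with than the system-generated names SQL Server assigns otherwise.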
  19. DB Normalization
  20. DB Normalization
     The key, the whole key, and nothing but the key.
     So help me Codd.
  21. DB Normalization
     http://en.wikipedia.org/wiki/File:Insertion_anomaly.svg
  22. Why normalize?
     • Avoid data duplication
     • Let end users make their own changes
     • Avoid data anomalies
     • Third-party tools rely on normalized data
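As a rough illustration of the insertion anomaly referenced above, here is a sketch with hypothetical tables (`dbo.faculty`, `dbo.faculty_course`) that are not from the deck. If faculty and course data lived in a single table keyed on the course, a new faculty member with no course yet could not be recorded; splitting the data so that each fact depends only on its own key removes that problem.

```sql
-- Hypothetical normalized design: faculty facts depend only on faculty_id.
CREATE TABLE dbo.faculty (
    faculty_id   int          NOT NULL
        CONSTRAINT pk_faculty PRIMARY KEY,
    faculty_name varchar(100) NOT NULL,
    hire_date    date         NOT NULL
);

-- Course assignments live in their own table, one row per (faculty, course).
CREATE TABLE dbo.faculty_course (
    faculty_id  int         NOT NULL
        CONSTRAINT fk_faculty_course_faculty
        FOREIGN KEY REFERENCES dbo.faculty (faculty_id),
    course_code varchar(20) NOT NULL,
    CONSTRAINT pk_faculty_course PRIMARY KEY (faculty_id, course_code)
);

-- A newly hired faculty member can be stored before any course is assigned.
INSERT INTO dbo.faculty (faculty_id, faculty_name, hire_date)
VALUES (1, 'New Hire', '2010-01-15');
```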
  23. Indexing
     Find a book in Van Pelt without a card catalog…
  24. Indexing: Data Pages
  25. Indexing: Clustered B-Tree
     Indexes in SQL Server are organized as B-trees
     member name clustered index (image from Clustered Indexes vs. Nonclustered Indexes in SQL Server: http://tr.im/AeU5)
  26. Indexing: Non-Clustered B-Tree
     member id non-clustered index (image from Clustered Indexes vs. Nonclustered Indexes in SQL Server: http://tr.im/AeU5)
  27. Indexing: Other Types
     • Unique
     • Full-text
     • Included columns
     • Indexed views
     • XML
     • Filtered (new for 2008)
     • Spatial (new for 2008)
     http://msdn.microsoft.com/en-us/library/ms175049.aspx
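Three of the index types listed above (unique, included columns, filtered) can be combined in a short sketch. The table `dbo.book_loan` and its columns are hypothetical, chosen to fit the library analogy; filtered indexes require SQL Server 2008 or later.

```sql
-- Hypothetical loan table for the library analogy.
CREATE TABLE dbo.book_loan (
    loan_id     int IDENTITY(1,1) NOT NULL
        CONSTRAINT pk_book_loan PRIMARY KEY,
    book_id     int  NOT NULL,
    member_id   int  NOT NULL,
    due_date    date NOT NULL,
    returned_on date NULL          -- NULL while the book is still out
);

-- Filtered index with included columns: only open loans are indexed, and
-- book_id and due_date are stored at the leaf level so a query for a
-- member's open loans can be answered from the index alone.
CREATE NONCLUSTERED INDEX ix_book_loan_open_by_member
    ON dbo.book_loan (member_id)
    INCLUDE (book_id, due_date)
    WHERE returned_on IS NULL;

-- Unique filtered index: a book can have at most one open loan at a time.
CREATE UNIQUE NONCLUSTERED INDEX ux_book_loan_open_book
    ON dbo.book_loan (book_id)
    WHERE returned_on IS NULL;
```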
  28. Indexing: Primary Keys
     Primary key = unique index (clustered or non-clustered)
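A sketch of the point above, mirroring the slides' "member id non-clustered index" and "member name clustered index". The table `dbo.library_member` is a hypothetical name. SQL Server makes a primary key clustered by default when the table has no clustered index yet, so the `NONCLUSTERED` keyword is what leaves room for a clustered index on another column.

```sql
CREATE TABLE dbo.library_member (
    member_id   int IDENTITY(1,1) NOT NULL,
    member_name varchar(100)      NOT NULL,
    -- The primary key is enforced by a unique index; here it is non-clustered.
    CONSTRAINT pk_library_member PRIMARY KEY NONCLUSTERED (member_id)
);

-- The table's single clustered index orders the data pages by member name.
CREATE CLUSTERED INDEX cx_library_member_name
    ON dbo.library_member (member_name);
```

Whether the clustered index belongs on the key or on another column depends on how the table is queried; this layout is only one option.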
  29. Indexing considerations: tables
     • Clustered index on every table
     member name non-clustered index (image from Clustered Indexes vs. Nonclustered Indexes in SQL Server: http://tr.im/AeU5)
  30. Indexing considerations: tables
     Integer primary key on every table
     /* take checkpoint, clear buffers & cache */
     SELECT s.term, s.section_id, COUNT(penn_id)
     FROM flat_section s JOIN flat_enrollment e
       ON s.section_id = e.section_id
       AND s.term = e.term
     GROUP BY s.term, s.section_id
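The slide's comment does not say which commands were used to take the checkpoint and clear the buffers and cache. One common way to do it in SQL Server, shown here as an assumption rather than the presenters' exact script, is the following sequence run immediately before the query being timed:

```sql
-- Write dirty pages to disk so that every page in the buffer pool is clean.
CHECKPOINT;

-- Drop all clean pages from the buffer pool, so the next query reads from disk.
DBCC DROPCLEANBUFFERS;

-- Discard all cached execution plans, so the next query is compiled fresh.
DBCC FREEPROCCACHE;
```

Both DBCC commands affect the whole server instance, so this belongs on a test server, not on a production system.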
  31. Indexing considerations: tables
     Results
  32. Indexing considerations: columns
  33. Indexing considerations: columns
     • Columns you join on: indexed integers are your friend!
     • How are the columns used in queries?
     • Cardinality: 1:1, 1:many, many:many
     • Data type
     • Indexing multiple columns: moderation
     • Goal 1: performance!
     • Goal 2: smallest index file possible.
  34. Indexing Considerations: Yes! No!
  35. Indexing Considerations: Yes! No!
     • TINY: 8 bits (0 – 255): 01010101
     • SMALL INTEGER: 16 bits (0 – 65535): 0101010101010101
     • INTEGER: 32 bits (0 – 4294967295): 01010101010101010101010101010101
     • VARCHAR ‘philadelphia’: 96 bits, at least (12 single-byte characters in UTF-8, before any length overhead): 010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101
     • Consider joining 4 tables on ‘philadelphia’: 4 initial lookups on indexes, each key at least 6 times as bulky as a small integer and less cacheable.
     • Index on VARCHARs only when needed
     • Keeps index files smaller, with less chance of fragmentation; fragmented index files make Matt Frew’s life hellish (no, that is NOT a positive!)
     • Consider purpose: don’t index for a one-time script or report
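The size comparison above can be checked directly with `DATALENGTH`, which returns the number of bytes used to store a value. This is a small sketch; the literal values are arbitrary. Note that in SQL Server `tinyint` is unsigned (0 to 255), while `smallint` (-32,768 to 32,767) and `int` (-2,147,483,648 to 2,147,483,647) are signed.

```sql
SELECT
    DATALENGTH(CAST(200 AS tinyint))                AS tinyint_bytes,   -- 1 byte  =  8 bits
    DATALENGTH(CAST(200 AS smallint))               AS smallint_bytes,  -- 2 bytes = 16 bits
    DATALENGTH(CAST(200 AS int))                    AS int_bytes,       -- 4 bytes = 32 bits
    DATALENGTH(CAST('philadelphia' AS varchar(50))) AS varchar_bytes;   -- 12 bytes = 96 bits
```

The `varchar` figure counts only the character data; the row and index formats add per-value length overhead on top of it, which widens the gap further.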
  36. Indexing Example: WRDS Queries
  37. Indexing Example: WRDS Queries
     • Large table: millions of rows
     • A record for each WRDS query
     • VARCHAR columns that should be INTEGERs
     • Report requests: subscribers ask for the number of queries in a date range, grouped by library and file
     • Before indexing, full table scan: 30 secs per query
     • Index added: subscriber, query date, library, file
     • After indexing, without table scan: 0.02 secs per query
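A sketch of what such an index and report query might look like. The slide gives only the indexed columns, so the table name `dbo.wrds_query_log`, the column names and types, and the variable values are all hypothetical. The file column is called `data_file` because `FILE` is a reserved word in T-SQL.

```sql
-- Hypothetical log table: one row per WRDS query.
CREATE TABLE dbo.wrds_query_log (
    query_id   int IDENTITY(1,1) NOT NULL
        CONSTRAINT pk_wrds_query_log PRIMARY KEY,
    subscriber varchar(50)  NOT NULL,
    query_date date         NOT NULL,
    library    varchar(50)  NOT NULL,
    data_file  varchar(100) NOT NULL
);

-- Composite index in the column order given on the slide:
-- the equality column first, then the range column, then the grouped columns.
CREATE NONCLUSTERED INDEX ix_wrds_query_log_report
    ON dbo.wrds_query_log (subscriber, query_date, library, data_file);

-- Report: number of queries for one subscriber in a date range,
-- grouped by library and file. Every column it needs is in the index.
DECLARE @subscriber varchar(50) = 'example_subscriber';
DECLARE @start_date date = '2009-01-01';
DECLARE @end_date   date = '2010-01-01';

SELECT library, data_file, COUNT(*) AS query_count
FROM dbo.wrds_query_log
WHERE subscriber = @subscriber
  AND query_date >= @start_date
  AND query_date <  @end_date
GROUP BY library, data_file;
```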
  38. Indexing Tools: SSMS
     From the Query menu:
     • SET STATISTICS TIME, SET STATISTICS IO
     • Include Actual Execution Plan
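The two statistics options can also be switched on in the query window itself. A minimal sketch, assuming the `flat_section` and `flat_enrollment` tables from the earlier slide exist in the current database:

```sql
-- Report parse/compile and execution times in the Messages tab.
SET STATISTICS TIME ON;
-- Report logical and physical reads per table in the Messages tab.
SET STATISTICS IO ON;

SELECT s.term, s.section_id, COUNT(penn_id)
FROM flat_section s
JOIN flat_enrollment e
    ON s.section_id = e.section_id
   AND s.term = e.term
GROUP BY s.term, s.section_id;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Comparing the logical reads before and after adding an index is often a steadier measure than elapsed time, which varies with server load.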
  39. Indexing Tools: DMV
     Dynamic Management Views & Functions: http://msdn.microsoft.com/en-us/magazine/cc135978.aspx
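As one example of what the dynamic management views expose, the query below reads `sys.dm_db_index_usage_stats` to show how often each index in the current database has been used for reads versus maintained by writes. It is a sketch of a typical DMV query, not one taken from the presentation.

```sql
SELECT
    OBJECT_NAME(s.object_id) AS table_name,
    i.name                   AS index_name,
    s.user_seeks,            -- lookups that navigated the B-tree
    s.user_scans,            -- full or range scans of the index
    s.user_lookups,          -- bookmark lookups into the base table
    s.user_updates           -- writes that had to maintain the index
FROM sys.dm_db_index_usage_stats AS s
JOIN sys.indexes AS i
    ON i.object_id = s.object_id
   AND i.index_id  = s.index_id
WHERE s.database_id = DB_ID()   -- current database only
ORDER BY s.user_updates DESC;
```

An index with many updates but few seeks, scans, or lookups may cost more to maintain than it saves. The counters reset when the SQL Server instance restarts, so they only reflect activity since then.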
  40. Indexing Tools: Tuning Advisor
     Analyzes workloads
  41. Best Practices: Blame!
     Blame the SQL Admins!
  42. Best Practices: Data Types
     • Don’t skimp on column length: Yes/No? Maybe. Open/Closed? Under construction. Black/White? Grey.
     • Know required level of precision, and leave yourself room to grow.
     • Accuracy to the day, minute, millisecond?
  43. Best Practices: Deletion
     Logical deletes vs. physical deletes
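The slide only names the two approaches, so here is a short sketch of the difference. The table `dbo.course_section` and the `is_deleted` flag are hypothetical; a deleted-at timestamp column is a common alternative to a flag.

```sql
CREATE TABLE dbo.course_section (
    section_id   int NOT NULL
        CONSTRAINT pk_course_section PRIMARY KEY,
    section_name varchar(100) NOT NULL,
    -- Logical-delete flag: rows are hidden, not removed.
    is_deleted   bit NOT NULL
        CONSTRAINT df_course_section_is_deleted DEFAULT (0)
);

INSERT INTO dbo.course_section (section_id, section_name)
VALUES (1, 'Section A'), (2, 'Section B');

-- Logical delete: the row stays in the table and can be restored or audited.
UPDATE dbo.course_section SET is_deleted = 1 WHERE section_id = 1;

-- Queries must then filter out logically deleted rows.
SELECT section_id, section_name
FROM dbo.course_section
WHERE is_deleted = 0;

-- Physical delete: the row is gone for good.
DELETE FROM dbo.course_section WHERE section_id = 2;
```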
  44. Best Practices: work the DB
     • Foreign Keys
     • Unique indexes
     • Check constraints
     • Default constraints
     • Triggers
     image courtesy of My New Filing Technique is Unstoppable by David Rees: http://www.mnftiu.cc/2002/11/26/filing-009/ (nsfw)
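Four of the features listed above can be declared in the table definition, so the database enforces the rules instead of the application code. This is a sketch with hypothetical tables (`dbo.department`, `dbo.employee`); triggers are left out to keep it short.

```sql
CREATE TABLE dbo.department (
    department_id   int NOT NULL
        CONSTRAINT pk_department PRIMARY KEY,
    department_name varchar(100) NOT NULL
        CONSTRAINT uq_department_name UNIQUE           -- unique index
);

CREATE TABLE dbo.employee (
    employee_id   int IDENTITY(1,1) NOT NULL
        CONSTRAINT pk_employee PRIMARY KEY,
    department_id int NOT NULL
        CONSTRAINT fk_employee_department              -- foreign key
        FOREIGN KEY REFERENCES dbo.department (department_id),
    email         varchar(255) NOT NULL
        CONSTRAINT uq_employee_email UNIQUE,           -- unique index
    salary        decimal(10,2) NOT NULL
        CONSTRAINT ck_employee_salary CHECK (salary >= 0),  -- check constraint
    hired_on      date NOT NULL
        CONSTRAINT df_employee_hired_on                -- default constraint
        DEFAULT (CAST(GETDATE() AS date))
);
```

With these in place, an insert that references a missing department, repeats an email, or supplies a negative salary fails at the database, whichever application sent it.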
  45. Best Practices: master tables
  46. Best Practices: Crunch Time!
     • “Temporary” projects
     • Balance between today’s pragmatism and tomorrow’s pain
     • Code review sooner
  47. Best Practices: Experiment
     Experiment in SQL Server Management Studio to improve your times and execution plans
  48. Be Opinionated!
     • Solicit feedback on database design before coding starts, not after.
     • Ask for opinions, and share your opinions!
     • More eyes = better database design
     • More ideas = better database design
     • Anyone have any tips… or questions?
  49. Resources
     • SQL Server Books Online: http://msdn.microsoft.com/en-us/library/ms130214.aspx
     • SQL Server 2008 Query Performance Tuning Distilled by Grant Fritchey and Sajal Dam
     • Comparing Tables Organized with Clustered Indexes versus Heaps: http://technet.microsoft.com/en-us/library/cc917672.aspx
     • MS Index Design Guidelines: http://msdn.microsoft.com/en-us/library/ms191195.aspx
     • Clustered Indexes vs. Nonclustered Indexes in SQL Server: http://digcode.com/default.aspx?page=ed51cde3-d979-4daf-afae-fa6192562ea9&article=443e9774-7d26-422d-a2f1-dbcafbb1e1fc&pc=5
     • SQL Server Execution Plans (free e-book, registration required): http://www.sqlservercentral.com/articles/books/65831/
     • Uncover Hidden Data to Optimize Application Performance: http://msdn.microsoft.com/en-us/magazine/cc135978.aspx
     • My New Filing Technique is Unstoppable (NSFW): http://www.mnftiu.cc/2002/11/26/filing-001/
     • SQL in the Wild: http://sqlinthewild.co.za/
     • Journey to SQL Authority With Pinal Dave: http://blog.sqlauthority.com/
