
Indy pass writing efficient queries – part 1 - indexing

Slide deck of the April 20, 2010 presentation to IndyPASS on SQL Server IO internals for performance

Published in: Technology

  1. 1. Writing Efficient Queries – Part 1<br />Using SQL Server Internals to Improve Data Access<br />Eddie Wuerch<br />MCT, MCITP<br />Principal, Data Management<br />ExactTarget<br />eddie@indydba.com<br />
  2. 2. Disk I/O – Key Points<br />Disk I/O (reads and writes) is usually the slowest component of a system<br />Compare: Memory and CPU speeds are reported in GHz: billions of actions per second<br />Disk I/O rates are simply measured in IOPS: Input/Output Operations Per Second<br />Fast disks on mostly sequential workloads can get 100-150 IOPS<br />Because of disk access rates, our first tuning goal is to reduce overall I/O<br />…so let’s look at those data pages, then examine how to process fewer of them<br />
  3. 3. Disk I/O – Key Points - II<br />SQL Server is not a black box<br />Most data is in structured storage<br />There are many ways to access the data; some are significantly faster, and have less impact on other processes, than others<br />Understanding SQL Server data storage internals will guide you to the faster ways<br />
  4. 4. Data Storage in SQL Server<br />The base unit of SQL Server data storage is the page<br />All data – system and user – is stored in pages<br />Each page is 8KB (8192 bytes)<br />Pages are allocated from files in 64KB (8-page) extents<br />
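
The page and extent sizes above make it easy to turn page counts into storage sizes. A minimal sketch, assuming it is run in a user database with permission to read the DMV; the `TOP (10)` cutoff and the column aliases are arbitrary choices, and the extent figure is only an approximation:

```sql
-- Sketch: convert page counts into KB and extents
-- (8 KB per page, 8 pages per 64 KB extent).
SELECT TOP (10)
       OBJECT_NAME(ps.object_id)             AS table_name,
       SUM(ps.used_page_count)               AS used_pages,
       SUM(ps.used_page_count) * 8           AS used_kb,         -- 8 KB per page
       (SUM(ps.reserved_page_count) + 7) / 8 AS approx_extents   -- rounded up; ignores mixed extents
FROM sys.dm_db_partition_stats AS ps
WHERE OBJECTPROPERTY(ps.object_id, 'IsUserTable') = 1
GROUP BY ps.object_id
ORDER BY used_pages DESC;
```

The tables at the top of this list are the ones where a scan costs the most page reads.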
  5. 5. Page Processing<br />Pages are read from disk and processed in memory as an entire 8KB unit<br />Extents are often read in from disk as a single block to reduce I/O<br />All data is processed in memory; if a page is not already in memory, processing is put on hold while it is read from disk<br />
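
Because every page is processed in memory, it can help to see how much of each database is cached at the moment. A rough sketch, assuming VIEW SERVER STATE permission; the query can be slow on servers with a lot of memory, so treat it as an occasional diagnostic:

```sql
-- Sketch: count the 8 KB pages of each database currently held in the buffer pool.
SELECT DB_NAME(bd.database_id) AS database_name,  -- NULL for the hidden resource database
       COUNT(*)                AS cached_pages,
       COUNT(*) * 8 / 1024     AS cached_mb       -- pages * 8 KB, shown in MB
FROM sys.dm_os_buffer_descriptors AS bd
GROUP BY bd.database_id
ORDER BY cached_pages DESC;
```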
  6. 6. Page Types<br />System Page Types<br />Space Management: File Header, PFS, GAM, SGAM<br />Change Management: DCM, BCM<br />Data Page Types<br />In-row data <br />Index <br />LOB data and Row-overflow data<br />All pages are 8KB<br />
  7. 7. In-Row Data Pages<br />96-byte Page Header<br />Row Data<br /><ul><li>Rows written serially</li><li>Starts at 97th byte</li></ul>Row-offset table<br /><ul><li>Starts at end of page, moves backwards</li><li>Records first-byte offset of each row</li></ul>Page layout (diagram): Page Header, Row 1…, Row 2…, Row 3…, Row 4…, row-offset table entries 4… 3… 2… 1…<br />
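
The page layout described above can be inspected directly on a test system. This is a sketch only: `DBCC IND` and `DBCC PAGE` are undocumented, unsupported commands, and `MyDatabase`, `dbo.PhoneBook`, and the file and page numbers are hypothetical placeholders:

```sql
-- Sketch only: undocumented commands, intended for a test system.
-- 'MyDatabase' and 'dbo.PhoneBook' are hypothetical names.
DBCC TRACEON (3604);                          -- send DBCC output to the client session
DBCC IND ('MyDatabase', 'dbo.PhoneBook', 1);  -- list the pages of index_id 1 (the clustered index)

-- Pick a PageFID / PagePID pair with PageType = 1 (data page) from the output above.
-- Print option 1 shows the 96-byte header, each row, and the row-offset table.
DBCC PAGE ('MyDatabase', 1, 276, 1);          -- file 1 and page 276 are placeholder values
```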
  12. 12. Disk Access Methods<br />Think of a phone book, with each entry as a record<br />Ordered by Last Name, First Name, MI<br />Two ways to find a record:<br />Use Last Name, First Name to find a number (Index Seek)<br />Look through the entire phone book, one page at a time, scanning each row for data (Table Scan)<br />
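
The two access methods can be sketched against a phone book table. The table, column, and index names below are hypothetical, and the plan the optimizer actually picks depends on data and statistics:

```sql
-- Hypothetical phone book table, ordered by (LastName, FirstName, MI).
CREATE TABLE #PhoneBook (
    LastName  varchar(50) NOT NULL,
    FirstName varchar(50) NOT NULL,
    MI        char(1)     NOT NULL,
    Phone     varchar(20) NOT NULL
);
CREATE CLUSTERED INDEX CIX_PhoneBook ON #PhoneBook (LastName, FirstName, MI);

-- Predicate on the leading index columns: the optimizer can seek to the rows.
SELECT Phone
FROM #PhoneBook
WHERE LastName = 'Smith' AND FirstName = 'Pat';

-- Predicate on a column that is not indexed: every page must be read (a scan).
SELECT LastName, FirstName
FROM #PhoneBook
WHERE Phone = '555-0100';

DROP TABLE #PhoneBook;
```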
  13. 13. Index Types<br />Clustered index<br />Represents the table itself<br />Index specifies the physical ordering of that data<br />Only 1 allowed per table<br />May be unique, does not have to be the primary key<br />Nonclustered index<br />Additional index of data<br />Over 200 allowed (249 in SQL Server 2005, 999 in SQL Server 2008)<br />May be unique<br />
  14. 14. The phone book example<br />If a table has a clustered index, the pointer to each row in the table is the clustered index key<br />The leaf level of the nonclustered index contains the nonclustered keys and the clustered index keys<br />Nonclustered indexes may also include additional non-indexed columns, which are stored at the leaf level of the index<br />
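
A small sketch of how these pieces fit together in index definitions. The `#Customer` table and the index names are hypothetical, chosen only for illustration:

```sql
-- Hypothetical table; the clustered index key is CustomerID.
CREATE TABLE #Customer (
    CustomerID int          NOT NULL,
    Email      varchar(100) NOT NULL,
    LastName   varchar(50)  NOT NULL,
    City       varchar(50)  NOT NULL
);
CREATE UNIQUE CLUSTERED INDEX CIX_Customer ON #Customer (CustomerID);

-- Leaf rows of this index hold: LastName (nonclustered key),
-- CustomerID (clustered key, used as the row pointer), and City (included column).
CREATE NONCLUSTERED INDEX IX_Customer_LastName
    ON #Customer (LastName)
    INCLUDE (City);

-- Every column referenced here is present at the leaf level of the nonclustered index.
SELECT LastName, City, CustomerID
FROM #Customer
WHERE LastName = 'Smith';

DROP TABLE #Customer;
```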
  15. 15. Index Pages<br />Root page: A-K : Page 4, L-U : Page 5, V-Z : Page 6<br />Page 4: A : Page 22, B : Page 23, C : Page 24, D…<br />Page 23: Baa : Page 276, Baba : Page 277, Base : Page 278, Ba…<br />Pages shown in the diagram: 4 to 6, 22 to 27, 274 to 283<br />
  16. 16. Index Lookups<br />
  17. 17. Index Lookups - revisited<br />Nonclustered index<br />Separate trip through the clustered index for each nonclustered entry!<br />Clustered index (table)<br />
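
The extra trip per nonclustered entry shows up in an execution plan as a Key Lookup. A hedged sketch, using a hypothetical `#Orders` table; the index hint is there only to force the nonclustered index on an empty demo table:

```sql
-- Hypothetical table and index names.
CREATE TABLE #Orders (
    OrderID    int      NOT NULL,
    CustomerID int      NOT NULL,
    OrderDate  datetime NOT NULL,
    Total      money    NOT NULL
);
CREATE UNIQUE CLUSTERED INDEX CIX_Orders ON #Orders (OrderID);
CREATE NONCLUSTERED INDEX IX_Orders_Customer ON #Orders (CustomerID);

-- Total is not in the nonclustered index, so each matching entry needs
-- a separate trip through the clustered index to fetch it.
SELECT OrderID, Total
FROM #Orders WITH (INDEX (IX_Orders_Customer))
WHERE CustomerID = 42;

-- Redefine the index to carry Total at its leaf level; the lookup is no longer needed.
CREATE NONCLUSTERED INDEX IX_Orders_Customer
    ON #Orders (CustomerID)
    INCLUDE (Total)
    WITH (DROP_EXISTING = ON);

SELECT OrderID, Total
FROM #Orders
WHERE CustomerID = 42;

DROP TABLE #Orders;
```

Comparing the two actual execution plans should show the lookup operator in the first query only.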
  18. 18. Operational Join Types<br />Merge Joins<br />Hash Joins<br />Loop Joins<br />
  19. 19. Join Type Comparison<br />Loop<br /><ul><li>Used when the other two can’t be used</li><li>One trip through one table</li><li>One trip through the other table for each entry in the first table</li><li>Generally the slowest of the three types</li></ul>Merge<br /><ul><li>One trip through each table</li><li>Requires indexes on both sides, at least one of them must be unique</li><li>Usually the fastest join type</li></ul>Hash<br /><ul><li>Works well for very large joins</li><li>Builds join data in tempdb</li></ul>
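
One way to compare the three join types on the same query is to force each one with a query hint and look at the plans and I/O. This is an experiment sketch, not a tuning recommendation; the tables are hypothetical, and hints like these are normally best left out of production code:

```sql
-- Hypothetical tables, indexed on the join key on both sides.
CREATE TABLE #Customer (CustomerID int NOT NULL PRIMARY KEY, Name varchar(50) NOT NULL);
CREATE TABLE #Orders   (OrderID    int NOT NULL PRIMARY KEY, CustomerID int NOT NULL);
CREATE NONCLUSTERED INDEX IX_Orders_Customer ON #Orders (CustomerID);

SET STATISTICS IO ON;

-- Loop: one pass over one input, one trip into the other per row.
SELECT c.Name, o.OrderID
FROM #Customer AS c
JOIN #Orders   AS o ON o.CustomerID = c.CustomerID
OPTION (LOOP JOIN);

-- Merge: one ordered pass through each input.
SELECT c.Name, o.OrderID
FROM #Customer AS c
JOIN #Orders   AS o ON o.CustomerID = c.CustomerID
OPTION (MERGE JOIN);

-- Hash: builds a hash table from one input, then probes it with the other.
SELECT c.Name, o.OrderID
FROM #Customer AS c
JOIN #Orders   AS o ON o.CustomerID = c.CustomerID
OPTION (HASH JOIN);

SET STATISTICS IO OFF;
DROP TABLE #Orders;
DROP TABLE #Customer;
```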
  23. 23. Join and Indexing Tips<br />When defining an index, if the data is unique, then declare the index as unique<br />Join on keys<br />Provide arguments in WHERE clauses to match available indexes<br />Cluster tables on range scans<br />Look for covering indexes<br />
  24. 24. So How Do I Know?<br />SET STATISTICS IO ON<br />
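
A minimal sketch of using this setting to compare a scan with a seek. The `#Demo` table and its contents are invented for the demonstration, and the row count assumes `sys.all_columns` has at least 1,000 rows, which is normally the case:

```sql
-- Hypothetical demo table: wide rows so the table spans many pages.
CREATE TABLE #Demo (ID int NOT NULL PRIMARY KEY, Filler char(1000) NOT NULL);

INSERT INTO #Demo (ID, Filler)
SELECT TOP (1000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), 'x'
FROM sys.all_columns;

SET STATISTICS IO ON;

SELECT COUNT(*) FROM #Demo WHERE Filler = 'y';  -- no usable index: reads every page
SELECT Filler   FROM #Demo WHERE ID = 500;      -- seek on the primary key: reads a few pages

SET STATISTICS IO OFF;
DROP TABLE #Demo;
```

The Messages output reports, per table, the scan count plus logical, physical, and read-ahead reads. Logical reads is the number of 8KB pages touched, which is the figure to drive down.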
  25. 25. So How Do I Know?<br />
  26. 26. So How Do I Know?<br />sys.dm_db_index_usage_stats<br />user_seeks<br />user_scans<br />user_lookups<br />user_updates<br />sys.dm_db_missing_index_*<br />Not magic, has limitations<br />Many similar missing-index entries that differ only in their INCLUDE columns may indicate a need to revisit the clustered index design<br />
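
The DMVs named above can be queried along these lines. A sketch, assuming VIEW SERVER STATE permission and that it is run in the database of interest; the counters reset when the instance restarts, and the ordering expressions are just one reasonable choice:

```sql
-- How each index in the current database has been used since the last restart.
SELECT OBJECT_NAME(i.object_id) AS table_name,
       i.name                   AS index_name,
       us.user_seeks,
       us.user_scans,
       us.user_lookups,
       us.user_updates
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS us
       ON us.object_id   = i.object_id
      AND us.index_id    = i.index_id
      AND us.database_id = DB_ID()
WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
ORDER BY us.user_scans DESC;

-- Missing-index suggestions for the current database (hints to evaluate, not orders).
SELECT d.[statement]      AS table_name,
       d.equality_columns,
       d.inequality_columns,
       d.included_columns,
       s.user_seeks,
       s.avg_user_impact
FROM sys.dm_db_missing_index_details     AS d
JOIN sys.dm_db_missing_index_groups      AS g ON g.index_handle = d.index_handle
JOIN sys.dm_db_missing_index_group_stats AS s ON s.group_handle = g.index_group_handle
WHERE d.database_id = DB_ID()
ORDER BY s.user_seeks * s.avg_user_impact DESC;
```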
  27. 27. So How Do I Know?<br />Scan-indicating waits<br />Lots of PAGEIOLATCH_SH and PAGEIOLATCH_EX waits are generated by table scans that read from disk<br />CXPACKET waits are related to parallelism and are often caused by scanning large tables (don’t reduce MAXDOP: fix the scan!)<br />Other processes with SOS_SCHEDULER_YIELD or high signal wait times may be helped by reducing the CPU load of scans<br />
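
These waits can be checked with a query along the following lines. A sketch, assuming VIEW SERVER STATE permission; the figures are cumulative since the last restart (or since the stats were cleared), and the parallelism wait is spelled `CXPACKET` in the DMV:

```sql
-- Cumulative totals for the scan-related wait types.
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms,
       signal_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type IN ('PAGEIOLATCH_SH', 'PAGEIOLATCH_EX', 'CXPACKET', 'SOS_SCHEDULER_YIELD')
ORDER BY wait_time_ms DESC;

-- Share of all wait time spent waiting for a CPU after the resource became available.
SELECT CAST(100.0 * SUM(signal_wait_time_ms) / NULLIF(SUM(wait_time_ms), 0)
            AS decimal(5, 2)) AS signal_wait_pct
FROM sys.dm_os_wait_stats;
```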
  28. 28. So How Do I Know?<br />TempDB activity in instances without much use of temp tables or table variables<br />SELECT * FROM sys.dm_io_virtual_file_stats(DB_ID('tempdb'), NULL)<br />Must track over time, perform time-slice analysis<br />May indicate additional worktable sort and hash-match activity<br />Tracking this for all of your databases shows the amount of I/O your systems are performing, and if the disk systems are keeping up<br />
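
Because the file stats are cumulative, a time-slice analysis means taking two snapshots and subtracting. A minimal sketch; the one-minute interval and the `#before` table name are arbitrary choices for illustration:

```sql
-- First snapshot of tempdb file I/O.
SELECT file_id, num_of_reads, num_of_writes,
       num_of_bytes_read, num_of_bytes_written, io_stall
INTO #before
FROM sys.dm_io_virtual_file_stats(DB_ID('tempdb'), NULL);

WAITFOR DELAY '00:01:00';  -- arbitrary sampling interval

-- Second snapshot minus the first: I/O performed during the interval.
SELECT a.file_id,
       a.num_of_reads         - b.num_of_reads         AS reads,
       a.num_of_writes        - b.num_of_writes        AS writes,
       a.num_of_bytes_read    - b.num_of_bytes_read    AS bytes_read,
       a.num_of_bytes_written - b.num_of_bytes_written AS bytes_written,
       a.io_stall             - b.io_stall             AS io_stall_ms
FROM sys.dm_io_virtual_file_stats(DB_ID('tempdb'), NULL) AS a
JOIN #before AS b ON b.file_id = a.file_id;

DROP TABLE #before;
```

Passing `NULL` for the database ID as well returns every file on the instance, which extends the same comparison to all databases.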
  29. 29. Resources<br />Microsoft White Papers<br />SQL Server 2000 I/O Basics (http://technet.microsoft.com/en-us/library/cc966500.aspx)<br />SQL Server I/O Basics, Chapter 2 (http://technet.microsoft.com/en-us/library/cc917726.aspx)<br />SQL Server Waits and Queues (download) (http://technet.microsoft.com/en-us/library/cc966413.aspx)<br />The Waits and Queues document is highly recommended when tuning or analyzing workloads<br />
  30. 30. Resources<br />Inside SQL Server Book Series<br />SQL 2005<br />The Storage Engine (Kalen Delaney)<br />Query Tuning and Optimization (Delaney et al.)<br />T-SQL Querying (Ben-Gan, Kollar, Sarka)<br />SQL 2008<br />Microsoft SQL Server 2008 Internals (Delaney, Randal, Tripp, Cunningham)<br />T-SQL Querying (Ben-Gan, Kollar, Sarka)<br />
  31. 31. Questions?<br />Email: eddie@indydba.com<br />
