IndyPASS: Writing Efficient Queries – Part 1 – Indexing
Slide deck of the April 20, 2010 presentation to IndyPASS on SQL Server IO internals for performance

Presentation Transcript

  • 1. Writing Efficient Queries – Part 1
    Using SQL Server Internals to Improve Data Access
    Eddie Wuerch
    MCT, MCITP
    Principal, Data Management
    ExactTarget
    eddie@indydba.com
  • 2. Disk I/O – Key Points
    Disk I/O (reads and writes) is usually the slowest component of a system
    Compare: Memory and CPU speeds are reported in GHz: billions of actions per second
    Disk I/O rates are simply measured in IOPS: Input/output Operations Per Second
    Fast disks on mostly sequential workloads can get 100-150 IOPS
    Because of disk access rates, our first tuning goal is to reduce overall I/O
    …so let’s look at those data pages, then examine how to process fewer of them
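
To put those rates in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the worst case of one I/O operation per 8KB page and a hypothetical 1 GB table; the function name and the table size are illustrative and do not come from the slides.

```python
PAGE_BYTES = 8192  # one SQL Server page (8KB)

def seconds_to_read(table_bytes, iops):
    """Estimate seconds to read a table from disk.

    Assumption: every page costs one I/O operation (worst case).
    """
    pages = -(-table_bytes // PAGE_BYTES)  # ceiling division
    return pages / iops

ONE_GB = 1024 ** 3  # hypothetical table size

# 131,072 pages at 150 IOPS: about 874 seconds
print(seconds_to_read(ONE_GB, iops=150))
```

Under these assumptions, touching a tenth of the pages takes a tenth of the time, which is why the first tuning goal is to reduce overall I/O.
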
  • 3. Disk I/O – Key Points - II
    SQL Server is not a black box
    Most data is in structured storage
    There are many ways to access the data; some are significantly faster, and have less impact on other processes, than others
    Understanding SQL Server data storage internals will guide you to the faster ways
  • 4. Data Storage in SQL Server
    The base unit of SQL Server data storage is the page
    All data – system and user – is stored in pages
    Each page is 8KB (8192 bytes)
    Pages are allocated from files in 64KB (8-page) extents
  • 5. Page Processing
    Pages are read from disk and processed in memory as an entire 8KB unit
    Extents are often read in from disk as a single block to reduce I/O
    All data is processed in memory; if a page is not already in memory, processing is put on hold while it is pulled from disk
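
The page and extent sizes above translate directly into I/O estimates. The following is a minimal sketch of that arithmetic; the row count and the rows-per-page figure are hypothetical values chosen only for illustration.

```python
PAGE_BYTES = 8192      # 8KB page
PAGES_PER_EXTENT = 8   # 64KB extent

def pages_needed(row_count, rows_per_page):
    """Pages required to hold the rows (ceiling division)."""
    return -(-row_count // rows_per_page)

def extents_needed(pages):
    """Extents required to hold the pages (ceiling division)."""
    return -(-pages // PAGES_PER_EXTENT)

# Hypothetical table: 1,000,000 rows at 80 rows per page
pages = pages_needed(1_000_000, rows_per_page=80)
extents = extents_needed(pages)
print(pages, extents)
```

Reading by extent rather than by page means a full scan of this hypothetical table needs roughly one eighth as many read operations.
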
  • 6. Page Types
    System Page Types
    Space Management: File Header, PFS, GAM, SGAM
    Change Management: DCM, BCM
    Data Page Types
    In-row data
    Index
    LOB data and Row-overflow data
    All pages are 8KB
  • 7. In-Row Data Pages
    Page Header
      96-byte page header
    Row Data
      Rows written serially
      Starts at 97th byte
      Row 1… Row 2… Row 3… Row 4…
    Row-offset table
      Starts at end of page, moves backwards
      Records first-byte offset of each row
      4… 3… 2… 1…
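
The layout described on this slide can be modeled in a few lines. This is a simplified sketch, not SQL Server's actual on-disk format: the `Page` class and its methods are hypothetical, the header is left empty, and rows are treated as opaque bytes. It only illustrates rows growing forward from the 97th byte while 2-byte offset entries grow backwards from the end of the page.

```python
import struct

PAGE_SIZE = 8192
HEADER_SIZE = 96
SLOT_SIZE = 2  # each row-offset entry is stored as a 2-byte integer

class Page:
    def __init__(self):
        self.buf = bytearray(PAGE_SIZE)
        self.free_start = HEADER_SIZE  # offset 96 is the 97th byte
        self.row_count = 0

    def free_bytes(self):
        """Bytes left between the last row and the row-offset table."""
        slots_start = PAGE_SIZE - SLOT_SIZE * self.row_count
        return slots_start - self.free_start

    def add_row(self, row):
        """Append a row and record its first-byte offset; returns the slot number."""
        if len(row) + SLOT_SIZE > self.free_bytes():
            raise ValueError("page full")
        offset = self.free_start
        self.buf[offset:offset + len(row)] = row
        self.free_start += len(row)
        self.row_count += 1
        slot_pos = PAGE_SIZE - SLOT_SIZE * self.row_count
        struct.pack_into("<H", self.buf, slot_pos, offset)
        return self.row_count - 1

    def row_offset(self, slot):
        """Read a row's first-byte offset back from the end of the page."""
        slot_pos = PAGE_SIZE - SLOT_SIZE * (slot + 1)
        return struct.unpack_from("<H", self.buf, slot_pos)[0]

page = Page()
for row in (b"Row 1", b"Row 2 is longer", b"Row 3"):
    page.add_row(row)
offsets = [page.row_offset(i) for i in range(page.row_count)]
print(offsets)  # [96, 101, 116]
```
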
  • 12. Disk Access Methods
    Think of a phone book, with each entry as a record
    Ordered by Last Name, First Name, MI
    Two ways to find a record:
    Use Last Name, First Name to find a number (Index Seek)
    Look through the entire phone book, one page at a time, scanning each row for data (Table Scan)
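
The difference between the two access methods can be sketched by counting pages touched. This is an illustrative model only: the phone book entries are generated placeholder data, and the `first_keys` list stands in for the upper levels of an index, whose own read cost is ignored here.

```python
from bisect import bisect_right

def build_pages(entries, rows_per_page):
    """Sort the entries and split them into fixed-size pages."""
    ordered = sorted(entries)
    return [ordered[i:i + rows_per_page]
            for i in range(0, len(ordered), rows_per_page)]

def table_scan(pages, name):
    """Read every page and check every row."""
    found, pages_read = None, 0
    for page in pages:
        pages_read += 1
        for key, number in page:
            if key == name:
                found = number
    return found, pages_read

def index_seek(pages, name):
    """Use the ordering to go straight to the one page that can hold `name`."""
    first_keys = [page[0][0] for page in pages]
    i = bisect_right(first_keys, name) - 1
    if i < 0:
        return None, 0
    for key, number in pages[i]:
        if key == name:
            return number, 1
    return None, 1

# Placeholder phone book: 1,000 entries keyed by (last name, first name)
book = [(("Last%04d" % i, "First"), "555-%04d" % i) for i in range(1000)]
pages = build_pages(book, rows_per_page=50)
target = ("Last0777", "First")
print(table_scan(pages, target))  # ('555-0777', 20)
print(index_seek(pages, target))  # ('555-0777', 1)
```
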
  • 13. Index Types
    Clustered index
      Represents the table itself
      Index specifies the physical ordering of that data
      Only 1 allowed per table
      May be unique, does not have to be the primary key
    Nonclustered index
      Additional index of data
      Over 200 allowed
      May be unique
  • 14. The phone book example
    If a table has a clustered index, the pointer to each row in the table is the clustered index key
    The leaf level of the nonclustered index contains the nonclustered keys and the clustered index keys
    Nonclustered indexes may also include additional non-key columns, which are stored at the leaf level of the index
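  • 15. Index Pages
    [Diagram: index pages arranged as a tree, from the root page down to the leaf pages]
    Root page entries: A-K : Page 4 | L-U : Page 5 | V-Z : Page 6
    Pages 4, 5, 6; entries shown: A : Page 22 | B : Page 23 | C : Page 24 | D…
    Pages 22 through 27; entries shown: Baa : Page 276 | Baba : Page 277 | Base : Page 278 | Ba…
    Pages 274 through 283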
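
A seek through a tree like this one can be sketched as follows, using the page numbers shown on the slide. The structure is an assumption read off the diagram: that page 4 holds the A, B, C entries and page 23 holds the Baa, Baba, Base entries. The `"root"` label and the search name are hypothetical, and only the entries visible on the slide are modeled.

```python
from bisect import bisect_right

# Non-leaf pages: page id -> sorted list of (lowest key covered, child page).
INDEX_PAGES = {
    "root": [("A", 4), ("L", 5), ("V", 6)],
    4: [("A", 22), ("B", 23), ("C", 24)],
    23: [("Baa", 276), ("Baba", 277), ("Base", 278)],
}

def seek(key, page_id="root"):
    """Follow child pointers from the root; return the list of pages read."""
    path = [page_id]
    while page_id in INDEX_PAGES:
        entries = INDEX_PAGES[page_id]
        low_keys = [low for low, _ in entries]
        # Pick the last entry whose lowest key is <= the search key.
        i = max(bisect_right(low_keys, key) - 1, 0)
        page_id = entries[i][1]
        path.append(page_id)
    return path

print(seek("Baker"))  # ['root', 4, 23, 277]
```

Four page reads locate the row, regardless of how many pages sit at the bottom level of the tree.
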
  • 16. Index Lookups
  • 17. Index Lookups - revisited
    Nonclustered index
    Separate trip through the clustered index for each nonclustered entry!
    Clustered index (table)
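
The cost of those separate trips adds up quickly. Below is a simplified cost sketch, assuming each lookup reads one page per level of the clustered index and ignoring caching and leaf-level range reads. The index depths and table size are hypothetical.

```python
def lookup_plan_reads(matching_rows, nonclustered_depth, clustered_depth):
    """Page reads for one nonclustered seek plus one clustered index
    trip per matching entry (simplified model)."""
    return nonclustered_depth + matching_rows * clustered_depth

def scan_plan_reads(table_pages):
    """Page reads for scanning the whole table."""
    return table_pages

TABLE_PAGES = 12_500  # hypothetical table size in pages

few = lookup_plan_reads(100, nonclustered_depth=3, clustered_depth=3)
many = lookup_plan_reads(10_000, nonclustered_depth=3, clustered_depth=3)
print(few, many, scan_plan_reads(TABLE_PAGES))  # 303 30003 12500
```

In this model the lookups win easily for 100 matching rows but cost more than a full scan for 10,000. An index that already contains every column the query needs avoids the second trip entirely.
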
  • 18. Operational Join Types
    Merge Joins
    Hash Joins
    Loop Joins
  • 19. Join Type Comparison
    Loop
      When the other two can’t be used
      One trip through one table
      One trip through the other table for each entry in the first table
      Generally the slowest of the three types
    Merge
      One trip through each table
      Requires indexes on both sides, at least one of them must be unique
      Usually the fastest join type
    Hash
      Works well for very large joins
      Builds join data in tempdb
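
The three join types can be sketched as small in-memory algorithms. These are illustrative models, not SQL Server's implementations: the function names and sample rows are made up, the merge join assumes both inputs are already sorted with unique keys on the left side, and a Python dict stands in for the hash structure that the slide says is built in tempdb.

```python
def loop_join(outer, inner):
    """One pass over `outer`; one full pass over `inner` per outer row."""
    return [(k, a, b) for k, a in outer for k2, b in inner if k == k2]

def merge_join(left, right):
    """One pass over each input. Assumes both inputs are sorted by key
    and that keys in `left` are unique."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            out.append((lk, left[i][1], right[j][1]))
            j += 1  # left key is unique, so only the right side advances
    return out

def hash_join(build, probe):
    """Build a hash table on one input, then probe it with the other."""
    table = {}
    for k, a in build:
        table.setdefault(k, []).append(a)
    return [(k, a, b) for k, b in probe for a in table.get(k, [])]

customers = [(1, "Ann"), (2, "Bob"), (3, "Cy")]          # unique keys, sorted
orders = [(1, "o-100"), (1, "o-101"), (3, "o-102")]      # sorted by key
print(merge_join(customers, orders))
```

All three return the same rows for this input; what differs is how many times each input is read and what has to be built first.
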
  • 23. Join and Indexing Tips
    When defining an index, if the data is unique, then declare the index as unique
    Join on keys
    Provide arguments in WHERE clauses to match available indexes
    Cluster tables on the columns used for range scans
    Look for covering indexes
  • 24. So How Do I Know?
    SET STATISTICS IO ON
  • 25. So How Do I Know?
  • 26. So How Do I Know?
    sys.dm_db_index_usage_stats
    user_seeks
    user_scans
    user_lookups
    user_updates
    sys.dm_db_missing_index_*
    Not magic, has limitations
    Many similar index entries with different INCLUDE statements may indicate a need to revisit the clustered index design
  • 27. So How Do I Know?
    Scan-indicating waits
    Lots of PAGEIOLATCH_SH and PAGEIOLATCH_EX waits are generated by table scans that read from disk
    CXPACKET waits – related to parallelism, often caused by scanning large tables (don’t reduce MAXDOP: fix the scan!)
    Other processes with SOS_SCHEDULER_YIELD or high signal wait times may be mitigated by reducing CPU load of scans
  • 28. So How Do I Know?
    TempDB activity in instances without much use of temp tables or table variables
    SELECT * FROM sys.dm_io_virtual_file_stats(DB_ID('tempdb'), NULL)
    Must track over time, perform time-slice analysis
    May indicate additional worktable sort and hash-match activity
    Tracking this for all of your databases shows the amount of I/O your systems are performing, and if the disk systems are keeping up
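
Because the counters returned by sys.dm_io_virtual_file_stats are cumulative, time-slice analysis means subtracting one snapshot from a later one. The sketch below shows that arithmetic on two snapshots held as dictionaries. The keys match column names in that view, but the snapshot values are invented for illustration and the `time_slice` helper is hypothetical.

```python
def time_slice(before, after):
    """Turn two cumulative snapshots into rates for the interval between them."""
    elapsed_s = (after["sample_ms"] - before["sample_ms"]) / 1000
    reads = after["num_of_reads"] - before["num_of_reads"]
    writes = after["num_of_writes"] - before["num_of_writes"]
    read_stall = after["io_stall_read_ms"] - before["io_stall_read_ms"]
    return {
        "reads_per_sec": reads / elapsed_s,
        "writes_per_sec": writes / elapsed_s,
        "avg_read_stall_ms": read_stall / reads if reads else 0.0,
    }

# Invented snapshot values, taken 60 seconds apart
snap_1 = {"sample_ms": 1_000_000, "num_of_reads": 50_000,
          "num_of_writes": 20_000, "io_stall_read_ms": 400_000}
snap_2 = {"sample_ms": 1_060_000, "num_of_reads": 53_000,
          "num_of_writes": 26_000, "io_stall_read_ms": 430_000}

print(time_slice(snap_1, snap_2))
```

Comparing the per-second rates against what the disks can sustain shows whether the disk system is keeping up; a rising average stall suggests it is not.
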
  • 29. Resources
    Microsoft White Papers
    SQL Server 2000 I/O Basics (http://technet.microsoft.com/en-us/library/cc966500.aspx)
    SQL Server I/O Basics, Chapter 2 (http://technet.microsoft.com/en-us/library/cc917726.aspx)
    SQL Server Waits and Queues (download) (http://technet.microsoft.com/en-us/library/cc966413.aspx)
    The Waits and Queues document is highly recommended when tuning or analyzing workloads
  • 30. Resources
    Inside SQL Server Book Series
    SQL 2005
    The Storage Engine (Kalen Delaney)
    Query Tuning and Optimization (Delaney, et al.)
    T-SQL Querying (Ben-Gan, Kollar, Sarka)
    SQL 2008
    Microsoft SQL Server 2008 Internals (Delaney, Randal, Tripp, Cunningham)
    T-SQL Querying (Ben-Gan, Kollar, Sarka)
  • 31. Questions?
    Email: eddie@indydba.com