This document provides an introduction to execution plan analysis for performance tuning. It discusses dividing performance tuning into separate areas like hardware, disk performance, and other applications. It then focuses on execution plans, index operations like scans and seeks, join operations like hash and nested loop joins, and parallelism. The objectives are to understand additional execution plan tools, index best practices, join types, and when parallelism is beneficial or not.
2. About me
• SQL Server Team Lead for RDX
• Lead code tuning team within RDX
• RDX works with over 200 customers on
solutions within the SQL Server environment
worldwide.
5. Divide and conquer approach
• Eliminating non-problem areas
– Hardware - memory, CPU
– Disk/SAN performance
– Maintenance Tasks - index/statistics maintenance
– Other Applications on server causing performance bottleneck
For more information on tools and techniques used to identify
problem areas, check out John Sterrett’s presentation, Performance
tuning for Pirates, http://bit.ly/P3h5Hf
6. Objectives
• Execution plans
– Additional tools
– Things to check before starting the analysis
• Indexes used by the SQL optimizer
– Different types of index operations
– Best practices
– Determining which indexes are optimal, and the exceptions to
the rules
• Join operations
– Different join types and how they work
– Exceptions to the general best practices
7. Execution Plans at a glance
• How to view an execution plan
– Profiler
– DMVs
– Extended Events (SQL2008+ only)
– Ad hoc queries
• Different views of execution plans
– Graphical view
– XML view
– Text view
8. Before we begin…
• Additional query options (STATISTICS IO/TIME)
• Check optimization levels
• Check if optimizer timed out during plan
generation
• How to regenerate a plan
• Right-to-left approach
9. Index Operations
• Four different base table operations
– Table Scan
– Index Scan - clustered/non-clustered
– Index Seek - clustered/non-clustered
– Lookups - RID/Key
10. Index Operations – Table Scans
• Table Scan – Heap (no clustered index)
– Generally not a best practice for data retrieval
• When Table Scans Are Acceptable
– Small tables
– Table variables
– CTE functionality
– Insert-heavy tables with minimal selects/updates/deletes
11. Index Operations – Index Scans
• Clustered Index Scans
– Same as a table scan, but the table is ordered by the clustered index key
– Generally slightly better than heap operations for CPU usage
– Query could most likely benefit from a different index
– Could also mean that a large range of data is being selected
12. Index Operations – Index Scans
• Non-Clustered Index Scan
– Generally less data scanned than with a clustered index scan
– Causes for non-clustered index scan
• Not an optimal index for the query
• Seek predicate is not SARGable
• Dataset returned represents most of the table
13. Index Operations – Index Seeks
• Clustered Index Seek
– Preferred method for data retrieval
– Downside, only one clustered index can exist on the table
14. Index Operations – Index Seeks
• Non-Clustered Index Seek
– Generally as good as a clustered index seek
– Upside, can have multiple non-clustered indexes
– Downside, over-indexing tables can slow down
writes
15. Index Operations - Lookups
• Lookups – What are they?
– Only on non-clustered index operations
– RID Lookup (Heap) - no clustered index present
– Key Lookup (Clustered) - clustered key lookup
• Eliminating lookups
– Pre-SQL2005, create covering indexes
– SQL2005 and later, included columns (INCLUDE option)
• Exceptions to eliminating lookups
– Queries that select every column from the table
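To make the lookup idea concrete, here is a minimal Python sketch (not SQL Server internals). It assumes a hypothetical orders table whose clustered index is a dict keyed by order_id, and a non-clustered index on customer_id that stores only the clustering key plus any included columns. The function and column names are invented for illustration.

```python
# Hypothetical table: the "clustered index" holds full rows keyed by order_id.
clustered = {
    1: {"order_id": 1, "customer_id": 7, "status": "open", "total": 40.0},
    2: {"order_id": 2, "customer_id": 8, "status": "paid", "total": 15.5},
    3: {"order_id": 3, "customer_id": 7, "status": "paid", "total": 99.0},
}

def build_nonclustered(rows, key_col, cluster_col, include=()):
    """Index entries carry the index key, the clustering key, and included columns."""
    index = {}
    for cluster_key, row in rows.items():
        entry = {key_col: row[key_col], cluster_col: cluster_key}
        entry.update({col: row[col] for col in include})
        index.setdefault(row[key_col], []).append(entry)
    return index

def seek(index, rows, cluster_col, key_value, wanted_cols):
    """Seek the non-clustered index; fall back to a key lookup when a column is missing."""
    results, lookups = [], 0
    for entry in index.get(key_value, []):
        if all(col in entry for col in wanted_cols):
            source = entry                      # index covers the query
        else:
            source = rows[entry[cluster_col]]   # key lookup into the clustered index
            lookups += 1
        results.append({col: source[col] for col in wanted_cols})
    return results, lookups

narrow = build_nonclustered(clustered, "customer_id", "order_id")
covering = build_nonclustered(clustered, "customer_id", "order_id", include=("status",))

rows_narrow, lookups_narrow = seek(narrow, clustered, "order_id", 7, ["order_id", "status"])
rows_covering, lookups_covering = seek(covering, clustered, "order_id", 7, ["order_id", "status"])
```

Both calls return the same rows; only the number of lookups differs. In this toy model, selecting every column would force either lookups or an index as wide as the table, which is the exception noted above.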
17. Join Operations
• Three main physical join operators
– Hash join
– Nested Loop join
– Merge join
18. Join Operations – Hash Join
• Hash joins - How it works…
– Two phases - build phase, probe phase
• Build phase - Scans or computes the build input and builds a hash table in memory (if
possible)
• Probe phase - For each row of the probe input, a hash value is computed and then
compared against the matching hash bucket of the build input, either producing matches or not
– Requires at least one equality predicate
– Typically chosen when at least one side of the join is not properly indexed
(index scan/table scan)
– Most common join operation
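The build and probe phases can be sketched in a few lines of Python. This is an in-memory illustration only, assuming rows are dicts and the join is a single-column equality; it ignores memory grants and spills to disk. The table and column names are made up.

```python
from collections import defaultdict

def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Equi-join two lists of dicts using a build phase and a probe phase."""
    # Build phase: hash every row of the build input on its join key.
    buckets = defaultdict(list)
    for row in build_rows:
        buckets[row[build_key]].append(row)

    # Probe phase: hash each probe row's key and compare against the matching bucket.
    matches = []
    for row in probe_rows:
        for build_row in buckets.get(row[probe_key], []):
            matches.append({**build_row, **row})
    return matches

customers = [
    {"customer_id": 1, "name": "Ann"},
    {"customer_id": 2, "name": "Bob"},
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 3},  # no matching customer, produces no row
    {"order_id": 12, "customer_id": 1},
]
joined = hash_join(customers, orders, "customer_id", "customer_id")
```

Each input is read once, which is why this approach tolerates unindexed inputs; the cost is the memory for the hash table built from the build input.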
20. Join Operation – Nested Loop
• Nested loop join – How it works
– Also called nested iteration; uses two inputs - an outer input and an inner input
• Outer input (displayed on top in the execution plan)
– Processed row by row against the inner input
– The smaller table of the two participating in the join
• Inner input (on the bottom of the execution plan)
– Scanned for every row of the outer input to produce matches
– The larger table of the two participating in the join
– Has to be properly indexed (index seeks)
– Usually used when one table in the join is relatively small and the
other is rather large, and at least one table (the larger one) is properly indexed.
• If those conditions are met, it generally outperforms hash join
operations
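A rough Python sketch of the same idea follows. It assumes the inner input is "indexed" by keeping it sorted on the join key and using binary search as a stand-in for an index seek; the region and store data are invented for illustration.

```python
from bisect import bisect_left

def nested_loop_join(outer_rows, inner_rows, key):
    """For each outer row, seek into a sorted inner input for matching rows."""
    # Stand-in for an index on the inner input: rows sorted by the join key.
    indexed = sorted(inner_rows, key=lambda r: r[key])
    keys = [r[key] for r in indexed]

    matches = []
    for outer in outer_rows:                 # one pass over the small outer input
        pos = bisect_left(keys, outer[key])  # "seek" to the first candidate row
        while pos < len(keys) and keys[pos] == outer[key]:
            matches.append({**outer, **indexed[pos]})
            pos += 1
    return matches

regions = [{"region_id": 2, "region": "East"}]                    # small outer input
stores = [{"store_id": s, "region_id": s % 3} for s in range(9)]  # larger inner input
result = nested_loop_join(regions, stores, "region_id")
```

Without the sorted structure, every outer row would force a full pass over the inner input, which is why the inner side needs a usable index for this join to pay off.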
22. Join Operations - Merge
• Merge join – How it works
– Both tables participating in the join must be sorted on the join
columns. If that condition is not met, the merge join requires
an explicit sort, which can be very costly.
– Both inputs are scanned once at the same time, one row at a time.
If the current rows do not match, the row with the lower key is discarded.
Since both inputs are sorted, that row cannot have another possible
match.
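The single simultaneous pass can be sketched in Python as below. The sketch assumes both inputs are lists of dicts already sorted on the join key, and it handles duplicate keys by pairing runs of equal values; the sample data is hypothetical.

```python
def merge_join(left, right, key):
    """Join two lists of dicts that are ALREADY sorted on `key`."""
    matches = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][key], right[j][key]
        if left_key < right_key:
            i += 1   # left row cannot match anything later on the right; discard
        elif left_key > right_key:
            j += 1   # right row cannot match anything later on the left; discard
        else:
            # Find the run of right rows that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][key] == left_key:
                j_end += 1
            # Pair every left row with this key against that run.
            while i < len(left) and left[i][key] == left_key:
                for right_row in right[j:j_end]:
                    matches.append({**left[i], **right_row})
                i += 1
            j = j_end
    return matches

left_input = [{"id": 1, "a": "x"}, {"id": 2, "a": "y"}, {"id": 4, "a": "z"}]
right_input = [{"id": 2, "b": "p"}, {"id": 2, "b": "q"}, {"id": 3, "b": "r"}, {"id": 4, "b": "s"}]
merged = merge_join(left_input, right_input, "id")
```

Neither cursor ever moves backward past a finished key, so each input is read once. If an input were not sorted, the result would silently miss matches, which is why an explicit sort is needed when no suitable index order exists.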
24. Parallelism – Good or Bad?
• The SQL Server optimizer determines when to use parallelism
– Can be seen in execution plan as one of three exchange operators
• Distribute Streams
• Repartition Streams
• Gather Streams
– When parallel execution is not an option
• The query contains scalar or relational operators that cannot be run in parallel
• Sequential cost of the query is not high enough
• Sequential execution is faster than parallel
– Cost threshold for parallelism and MAXDOP (max degree of parallelism) options