This document discusses indexing and query optimization in SQL Server. It provides an overview of indexes including clustered and non-clustered indexes. It describes how data is stored at the page and extent level and differences between tables with and without clustered indexes. The document also outlines the query optimization process including parsing, optimization, execution and the cost-based optimizer. Finally, it reviews common execution plan operators like table scans, index scans and seeks and when they would be considered good or bad.
SAG_Indexing and Query Optimization
1. SAG – Indexing and Query Optimization
Vaibhav Jain
vjain44@csc.com
Ext : 706224
2. Indexing – A Refresher
• Introducing Indexes
What are Indexes?
What are the types of Indexes?
What do these terms inside SQL Server mean?
– Pages, Extents
– Heap
– Clustered Indexes
– Non-Clustered Indexes
3. Page, Extents
• The fundamental unit of data storage in SQL Server is the page (8 KB)
This means SQL Server databases have 128 pages per megabyte
Each page begins with a 96-byte header
The maximum amount of data contained in a single row on a page is 8,060 bytes
• Extents are a collection of eight physically contiguous pages (64 KB)
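The figures on this slide follow from simple arithmetic. The sketch below reproduces them in Python; the helper `pages_needed` is a hypothetical name introduced here, and its estimate is deliberately rough because it ignores per-row overhead and fill factor.

```python
# Storage arithmetic for SQL Server pages and extents.
PAGE_SIZE_BYTES = 8 * 1024   # one page is 8 KB
PAGE_HEADER_BYTES = 96       # header at the start of every page
PAGES_PER_EXTENT = 8         # an extent is eight contiguous pages
MAX_ROW_BYTES = 8060         # largest amount of data in a single row
MEGABYTE = 1024 * 1024

pages_per_mb = MEGABYTE // PAGE_SIZE_BYTES
extent_size_bytes = PAGE_SIZE_BYTES * PAGES_PER_EXTENT
extents_per_mb = MEGABYTE // extent_size_bytes
bytes_after_header = PAGE_SIZE_BYTES - PAGE_HEADER_BYTES


def pages_needed(row_count, row_bytes):
    """Rough lower bound on data pages (hypothetical helper).

    Assumption: fixed-size rows, no per-row overhead, pages filled completely.
    """
    if not 0 < row_bytes <= MAX_ROW_BYTES:
        raise ValueError("row_bytes must be between 1 and 8060")
    rows_per_page = bytes_after_header // row_bytes
    # Ceiling division: a partly filled last page still counts as a page.
    return (row_count + rows_per_page - 1) // rows_per_page


print(pages_per_mb)        # pages in one megabyte
print(extent_size_bytes)   # bytes in one extent
print(pages_needed(1_000_000, 100))
```

Real tables need more pages than this estimate, since each row also carries its own overhead.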
4. HEAP
• A heap is a table without a clustered index
• The data rows are not stored in any particular order
• Data pages are not linked in a linked list
5. Clustered Index
• The data rows are stored in order based on the clustered index key
• The clustered index is implemented as a B-tree index structure
• Data pages in the leaf level are linked in a doubly-linked list
• Clustered indexes have one row in sys.partitions, with index_id = 1, for each partition used by the index
7. Non-Clustered Index
• Nonclustered indexes have a B-tree index structure similar to the one in clustered indexes
• The difference is that nonclustered indexes do not affect the order of the data rows
• Each index row contains the nonclustered key value, a row locator (a row ID for a heap, or the clustered index key for a table with a clustered index) and any included, or nonkey, columns
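The structure of a nonclustered index row can be illustrated with a small Python sketch. This is not how SQL Server implements it: a sorted list stands in for the B-tree, the list position in `heap` stands in for the row locator, and the function names and sample data are invented for illustration.

```python
import bisect

# A heap: rows in no particular order. The list position plays the role
# of the row locator (an assumption made for this sketch).
heap = [
    {"id": 7, "name": "Meera", "city": "Pune"},
    {"id": 2, "name": "Arun", "city": "Delhi"},
    {"id": 9, "name": "Kiran", "city": "Noida"},
    {"id": 4, "name": "Divya", "city": "Delhi"},
]


def build_nonclustered_index(rows, key, include=()):
    """Return index rows of the form (key value, row locator, included columns)."""
    entries = []
    for locator, row in enumerate(rows):
        included = {col: row[col] for col in include}
        entries.append((row[key], locator, included))
    # Only the index entries are sorted; the data rows stay where they are.
    entries.sort(key=lambda entry: (entry[0], entry[1]))
    return entries


def seek(index, value):
    """Find all index rows whose key equals value using binary search."""
    keys = [entry[0] for entry in index]
    low = bisect.bisect_left(keys, value)
    high = bisect.bisect_right(keys, value)
    return index[low:high]


ix_city = build_nonclustered_index(heap, "city", include=("name",))
matches = seek(ix_city, "Delhi")
print([locator for _, locator, _ in matches])  # row locators into the heap
```

Building the index leaves `heap` in its original order, which is the point of the second bullet above.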
10. Query Processing and Optimization
• Cost-Based Optimizer
– Disk I/O cost
– CPU cost
– Memory cost (usually a minor factor)
• Indexes
– Selects the optimal indexes to use
• Query Algorithms
– Converts logical operations into physical operations
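The idea of cost-based selection can be sketched as picking the candidate with the lowest estimated total cost. The Python below is a toy model only: the class, the function name and every cost number are made up for illustration, and the real optimizer's cost formulas are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class CandidatePlan:
    """One physical way to run a query, with hypothetical cost estimates."""
    name: str
    io_cost: float
    cpu_cost: float
    memory_cost: float = 0.0  # treated as a minor factor, as on the slide

    @property
    def total_cost(self):
        return self.io_cost + self.cpu_cost + self.memory_cost


def choose_plan(candidates):
    """Pick the candidate with the lowest estimated total cost."""
    return min(candidates, key=lambda plan: plan.total_cost)


# Invented estimates for a query that returns a handful of rows.
candidates = [
    CandidatePlan("Clustered Index Scan", io_cost=4.20, cpu_cost=0.55),
    CandidatePlan("Non-Clustered Index Seek + Lookup", io_cost=0.03, cpu_cost=0.01),
    CandidatePlan("Non-Clustered Index Scan", io_cost=1.10, cpu_cost=0.30),
]

best = choose_plan(candidates)
print(best.name)
```

With different estimates (for example, a query that returns most of the table) the scan could come out cheapest, which is why the same query can get different plans.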
12. Table Scan
• When
A table without a clustered index (a heap) is accessed
• Good or Bad*
Depends on the table size and how many rows the query needs
• Action Item
Create a clustered index
13. Clustered Index Scan
• When
A table with a clustered index is accessed but the query does not use the clustered index key
• Good or Bad*
Bad, unless the table is large and the query retrieves most of its columns and rows
• Action Item
Evaluate the clustered index keys
14. Clustered Index Seek
• When
A table with a clustered index is accessed and the query locates specific rows through the B-tree
• Good or Bad*
Good
• Action Item
Evaluate the possibility of a non-clustered index
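The difference between a scan and a seek can be sketched by counting how many rows each approach reads. In this Python sketch a sorted list stands in for the leaf level of a clustered index and binary search stands in for the B-tree navigation; the data and function names are assumptions made for illustration.

```python
import bisect

# Leaf level of a clustered index: rows kept in key order (toy data).
clustered_rows = [(key, f"row-{key}") for key in range(1, 1001)]
keys = [key for key, _ in clustered_rows]


def scan(predicate):
    """Scan: read every row and test it against the predicate."""
    rows_read = 0
    result = []
    for key, payload in clustered_rows:
        rows_read += 1
        if predicate(key):
            result.append(payload)
    return result, rows_read


def seek_range(low, high):
    """Seek: jump to the first qualifying key, then read only the range.

    rows_read counts leaf rows only, not the binary search steps.
    """
    start = bisect.bisect_left(keys, low)
    end = bisect.bisect_right(keys, high)
    result = [payload for _, payload in clustered_rows[start:end]]
    return result, end - start


scan_result, scan_reads = scan(lambda key: 500 <= key <= 504)
seek_result, seek_reads = seek_range(500, 504)
print(scan_reads, seek_reads)
```

Both calls return the same five rows, but the scan reads the whole table to find them. When a query needs most of the rows anyway, the two read counts converge, which matches the "bad unless" caveat on the scan slides.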
15. Non-Clustered Index Scan
• When
The query accesses columns that are part of a non-clustered index, but the whole index is read
• Good or Bad*
Bad, unless the table is large and the query retrieves most of its columns and rows
• Action Item
Create a more refined non-clustered index
16. Non-Clustered Index Seek
• When
The query accesses columns that are part of a non-clustered index and the rows are located through the B-tree
• Good or Bad*
Good
• Action Item
Evaluate the other operators in the plan
17. Lookups
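• When
The query optimizer uses a non-clustered index to find rows on a few columns and goes to the base table for the remaining columns
• Good or Bad*
Good for a small number of rows; costly when many rows need a lookup
• Action Item
Add included columns or create a covering index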
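The action item can be illustrated by comparing an index that needs a lookup with one that covers the query. This Python sketch uses dictionaries as stand-ins for the base table and the two indexes; all names and data are hypothetical.

```python
# Base table keyed by id (stands in for the clustered index).
base_table = {
    1: {"id": 1, "city": "Delhi", "name": "Arun", "phone": "555-0101"},
    2: {"id": 2, "city": "Pune", "name": "Meera", "phone": "555-0102"},
    3: {"id": 3, "city": "Delhi", "name": "Divya", "phone": "555-0103"},
}

# Plain non-clustered index on city: city -> clustering keys.
ix_city = {}
# Same index with name as an included column: city -> (key, name) pairs.
ix_city_incl_name = {}
for key, row in base_table.items():
    ix_city.setdefault(row["city"], []).append(key)
    ix_city_incl_name.setdefault(row["city"], []).append((key, row["name"]))


def names_with_lookup(city):
    """Find keys in the index, then visit the base table once per row."""
    lookups = 0
    names = []
    for key in ix_city.get(city, []):
        row = base_table[key]  # the lookup into the base table
        lookups += 1
        names.append(row["name"])
    return names, lookups


def names_covered(city):
    """The index already holds name, so the base table is never touched."""
    names = [name for _, name in ix_city_incl_name.get(city, [])]
    return names, 0


print(names_with_lookup("Delhi"))
print(names_covered("Delhi"))
```

The number of lookups grows with the number of matching rows, so the benefit of covering the query grows with result size; the trade-off is a wider index to store and maintain.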