Understanding Indexes for Writing Better Queries
Janice Gerbrandt
Agenda
- How data is stored
- Types of indexes
- Writing queries that use indexes
About Me
I design and implement database solutions for an
ERP software solution (supply chain
management).
Twitter: @gerbyj
Email: janice.gerbrandt@gmail.com
Why this session?
T-SQL is a Declarative Language
When I write a query in SQL Server, I only tell the database engine WHAT I want, not HOW to get it.
E.g.: “I want the names of all the people with the last name ‘Simpson’”
select FirstName, LastName
from Person
where LastName = 'Simpson'
Why are some queries slow?
select FirstName, LastName
from Person.Person
where LastName = 'Simpson'
vs
select FirstName, LastName
from Person.Person
where FirstName = 'Kim'
Both queries returned 10 rows, but they did not take the same time to run.
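One way to see the difference between the two queries is to ask SQL Server how many pages each one reads. The sketch below assumes that Person.Person has an index whose leading key is LastName and no index that leads with FirstName; that index layout is an assumption, not something stated on the slide.

```sql
-- Report page reads for each statement (shown with the query messages).
SET STATISTICS IO ON;

-- Predicate on the (assumed) leading index key: expect few logical reads.
SELECT FirstName, LastName
FROM Person.Person
WHERE LastName = 'Simpson';

-- Predicate on a column that no index leads with (assumed): expect many more reads.
SELECT FirstName, LastName
FROM Person.Person
WHERE FirstName = 'Kim';

SET STATISTICS IO OFF;
```

Compare the "logical reads" figures for the two statements. The row counts can match while the number of pages read differs widely.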
We’re spoiled in 2020
Index analogies used to be easy!
But we still eat, right?
Finding a good recipe on the internet can
take longer than cooking it.
Luckily they still make cookbooks!
A Tale of 3 Cookbooks
The Search for the Perfect Pot Roast Recipe
Book One – Aunt Gertrude’s Collection
Book Two – Basic Cookbook
Book Three – The Canadian Living Cookbook
How long did it take to find that Pot Roast recipe?
- Book 1: Scanned every page in the book
- Book 2: Looked in an index for the first page of the section, turned to that page, then scanned each page of the section
- Book 3: Located the list of recipes in an index, scanned that list, then turned to that page
We came here to talk about databases, right?
How do we find records in tables?
Similar to how we find recipes in cookbooks:
- Scan
- Seek, then Scan
- Seek
How are tables structured?
As Pages (8 KB) in the data files, organized into Extents (8-page chunks, 64 KB each)
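To get a feel for page counts on a real table, the following sketch queries the sys.dm_db_partition_stats view. The table name Person.Person is only an example; substitute any table in your own database.

```sql
-- Pages and kilobytes used by each heap or index of one table.
-- Person.Person is an example name, not a requirement.
SELECT
    OBJECT_NAME(ps.object_id) AS table_name,
    ps.index_id,
    ps.row_count,
    ps.used_page_count,
    ps.used_page_count * 8 AS used_kb   -- each page is 8 KB
FROM sys.dm_db_partition_stats AS ps
WHERE ps.object_id = OBJECT_ID(N'Person.Person');
```

Dividing used_page_count by 8 gives a rough idea of how many extents the structure occupies.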
Page Types
- Data
- Index
- Text/Image
- Various types of Maps
Table Data Structure: Heap
- When a table doesn’t have a defined order, its data rows are stored in an unordered structure.
- Data order cannot be predicted.
Table Data Structure: Clustered Index (CI)
- When a table does have a defined order, its data rows are stored in an ordered structure sorted by its key values.
- Key values are the columns included in the index definition (e.g.: MealType)
- There can be only one clustered index per table, because the data rows themselves can be stored in only one order.
Optional Index: Nonclustered Index (NCI)
- An ordered structure, sorted by its key values, that is separate from the data rows.
- Entries contain the key values (e.g.: Keyword) and a pointer to the data row (i.e.: the row locator).
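The three structures can be sketched in T-SQL using the cookbook analogy. The table dbo.Recipe, its columns, and the index names are hypothetical and exist only for illustration; MealType and Keyword are the key examples from the slides.

```sql
-- Hypothetical cookbook table. With no clustered index it is stored as a heap.
CREATE TABLE dbo.Recipe
(
    RecipeId   int           NOT NULL,
    RecipeName nvarchar(100) NOT NULL,
    MealType   nvarchar(30)  NOT NULL,
    Keyword    nvarchar(50)  NOT NULL
);

-- Clustered index: the data rows themselves are now stored in MealType order.
CREATE CLUSTERED INDEX CIX_Recipe_MealType
    ON dbo.Recipe (MealType);

-- Nonclustered index: a separate structure ordered by Keyword,
-- where each entry carries a row locator back to the data row.
CREATE NONCLUSTERED INDEX IX_Recipe_Keyword
    ON dbo.Recipe (Keyword);
```

Attempting to create a second clustered index on the same table fails, which matches the "only one order" rule.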
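Heap
[Diagram: for a heap (index_id = 0), first_iam_page points to an Index Allocation Map (IAM) page, which tracks the table's data pages. Each data page holds a page header and data rows; the pages are not linked in any order.]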
Clustered Index
[Diagram: for a clustered index (index_id = 1), root_page points to the Root Node. Root and Intermediate Nodes hold index rows; Leaf Nodes hold the data rows. Pages on each level are linked by Previous Page / Next Page pointers.]
Traversing the Tree
- Let’s find ‘Simpson, Kim’ in the Employee table
- Key values are LastName, FirstName
- Root node:
    Aaronson, Alan
    Fairchild, Meagan
    Morgan, Beverly
    Thompson, Sam
  ‘Simpson’ sorts after ‘Morgan’ and before ‘Thompson’, so follow the Morgan entry.
- Intermediate node:
    Morgan, Beverly
    Potter, Henry
    Ramsay, Delores
    Smith, Sam
  ‘Simpson’ sorts after ‘Ramsay’ and before ‘Smith’, so follow the Ramsay entry.
- Leaf node:
    Ramsay, Delores, (1992-04-23, Sr Engineer)
    Rumsfeld, Dan, (2016-03-22, Engineer)
    Sanderson, Betty, (2019-08-19, Jr Engineer)
    Simpson, Kim, (2014-01-04, Engineer)
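The traversal above is what a seek looks like from the query side. In this sketch the table dbo.Employee, the column names HireDate and JobTitle, and the index name are assumptions made to match the leaf rows shown; only the key order (LastName, FirstName) comes from the slides.

```sql
-- Hypothetical Employee table shaped like the leaf rows on the slide.
CREATE TABLE dbo.Employee
(
    LastName  nvarchar(50) NOT NULL,
    FirstName nvarchar(50) NOT NULL,
    HireDate  date         NOT NULL,
    JobTitle  nvarchar(50) NOT NULL
);

-- Key values are LastName, FirstName, in that order.
CREATE CLUSTERED INDEX CIX_Employee_Name
    ON dbo.Employee (LastName, FirstName);

-- Both key columns are supplied, so the engine can walk
-- root -> intermediate -> leaf instead of reading every page.
SELECT LastName, FirstName, HireDate, JobTitle
FROM dbo.Employee
WHERE LastName = N'Simpson'
  AND FirstName = N'Kim';
```

A predicate on FirstName alone gives the engine no starting point in the root node, because the entries are ordered by LastName first.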
Why are some queries slow?
select FirstName, LastName
from Person.Person
where LastName = 'Simpson'
vs
select FirstName, LastName
from Person.Person
where FirstName = 'Kim'
Nonclustered Index
[Diagram: for a nonclustered index (index_id > 1), root_page points to the Root Node. Root and Intermediate Nodes hold index rows; Leaf Nodes hold key values rather than data rows. Pages on each level are linked by Previous Page / Next Page pointers.]
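The index_id values from the diagrams (0 for a heap, 1 for a clustered index, greater than 1 for nonclustered indexes) can be checked in the sys.indexes catalog view. Person.Person is used here only as an example table name.

```sql
-- One row per heap or index on the table.
-- index_id 0 = HEAP, 1 = CLUSTERED, > 1 = NONCLUSTERED.
SELECT i.index_id, i.name, i.type_desc
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID(N'Person.Person')
ORDER BY i.index_id;
```

A table shows either index_id 0 or index_id 1, never both, since its data rows are stored either as a heap or as a clustered index.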
Nonclustered Index on Heap
[Diagram: the key values in a Nonclustered Index Leaf Node point directly to data rows in the heap's data pages, which are tracked by the IAM.]
Nonclustered Index on Clustered Index
[Diagram: the key values in a Nonclustered Index Leaf Node point into the clustered index, which is traversed from its root through its index rows down to the data rows in its leaf pages.]
Finding Data Means Reading Pages
Indexes = Good
Indexable Queries: Finding Data
- Check that the columns tested by your WHERE clause are in an index
- Ditto for: JOIN conditions, HAVING clause
- Don’t hide the column values inside functions or expressions; that makes them unusable by the index
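Implicit Indexes
- Primary Key Constraints (PK)
- Unique Key Constraints (AK)
Both create an index automatically, and both require uniqueness.
Before SQL Server 2014 they were your only option on table variables; newer versions also accept inline index definitions.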
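As a small illustration of implicit indexes, the table variable below gets its indexes purely from constraints. The variable name and columns are hypothetical, reusing the cookbook theme.

```sql
DECLARE @Recipe TABLE
(
    RecipeId   int           NOT NULL PRIMARY KEY,  -- unique index, clustered by default
    RecipeName nvarchar(100) NOT NULL UNIQUE,       -- unique nonclustered index
    MealType   nvarchar(30)  NOT NULL
);

INSERT INTO @Recipe (RecipeId, RecipeName, MealType)
VALUES (1, N'Pot Roast', N'Dinner');

-- The predicate is on the primary key, so its implicit index can be used.
SELECT RecipeName, MealType
FROM @Recipe
WHERE RecipeId = 1;
```

Neither constraint helps a search on MealType, because that column is not part of either index.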
SARGable = Search ARGument ABLE
- A SARGable predicate is one where SQL Server can isolate the single value or range of index key values to process
- Functions and operators wrapped around a column can act like a black box around its value, so the database engine can’t use the index to seek and must scan. Such a predicate is non-SARGable.
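Two before-and-after pairs may make the black-box idea concrete. They assume an index on Person.Person that leads with LastName, and a hypothetical table dbo.Invoice with an index on InvoiceDate; neither assumption comes from the slides.

```sql
-- Non-SARGable: LEFT() hides LastName, so every row has to be evaluated.
SELECT FirstName, LastName
FROM Person.Person
WHERE LEFT(LastName, 4) = 'Simp';

-- SARGable rewrite: a prefix search maps to one range of key values.
SELECT FirstName, LastName
FROM Person.Person
WHERE LastName LIKE 'Simp%';

-- Non-SARGable: YEAR() hides InvoiceDate (dbo.Invoice is hypothetical).
SELECT InvoiceDate
FROM dbo.Invoice
WHERE YEAR(InvoiceDate) = 2020;

-- SARGable rewrite: compare the bare column against a half-open date range.
SELECT InvoiceDate
FROM dbo.Invoice
WHERE InvoiceDate >= '20200101'
  AND InvoiceDate <  '20210101';
```

In each pair the rewrite leaves the column untouched on one side of the comparison and moves the work to the constant side.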
Are any functions SARGable?
Not many:
- LIKE (without a leading %)
- =, >, >=, <, <=, IN, BETWEEN
- IS NULL
- Some date functions
- User-defined functions that are basically parameterized views (inline table-valued functions)
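The last bullet reads like a description of an inline table-valued function, which the optimizer expands into the calling query much as it does a view. The function name dbo.PeopleByLastName below is hypothetical, and the sketch assumes Person.Person has an index that leads with LastName.

```sql
-- CREATE FUNCTION must be the first statement in its batch.
CREATE FUNCTION dbo.PeopleByLastName (@LastName nvarchar(50))
RETURNS TABLE
AS
RETURN
(
    SELECT FirstName, LastName
    FROM Person.Person
    WHERE LastName = @LastName   -- bare column, so the predicate stays SARGable
);
GO  -- batch separator used by SSMS and sqlcmd

-- Used like a view that takes a parameter.
SELECT FirstName, LastName
FROM dbo.PeopleByLastName(N'Simpson');
```

A scalar user-defined function wrapped around the column in a WHERE clause would not behave this way; it would hide the column like any other function.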
Indexable Queries: Clustering Data
- Check that all the columns you’re asking for are covered by the index
- Pay attention to the order of key values in the index. Even non-SARGable conditions aren’t that bad if the rows they test are grouped together in the index
- Key values of the Clustered Index will be on the leaf pages of your Nonclustered Index
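All three points can be seen in one sketch. The table dbo.SalesOrder, its columns, and the index name are hypothetical.

```sql
-- Hypothetical table; the primary key is the clustered index key.
CREATE TABLE dbo.SalesOrder
(
    OrderId    int   NOT NULL PRIMARY KEY CLUSTERED,
    CustomerId int   NOT NULL,
    OrderDate  date  NOT NULL,
    TotalDue   money NOT NULL
);

-- Key order: CustomerId, then OrderDate.
-- INCLUDE stores TotalDue on the leaf pages without making it a key.
CREATE NONCLUSTERED INDEX IX_SalesOrder_Customer_Date
    ON dbo.SalesOrder (CustomerId, OrderDate)
    INCLUDE (TotalDue);

-- Covered: OrderDate and TotalDue come from the index, and OrderId is
-- on the leaf pages too because it is the clustered index key.
SELECT OrderId, OrderDate, TotalDue
FROM dbo.SalesOrder
WHERE CustomerId = 42
  AND YEAR(OrderDate) = 2020;  -- non-SARGable, but only customer 42's entries are tested
```

Because the entries for one customer sit next to each other in the index, the YEAR() test runs against a small group of rows rather than the whole table.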
Indexable Queries: Sorting Data
- Check that your sort order is supported by the index.
- Ditto for: GROUP BY
- It works in reverse, too: an index can be read backward to return rows in the opposite order
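The sorting points can be tried against the Employee table from the tree-traversal slides. The sketch assumes a table named dbo.Employee already exists with an index keyed on (LastName, FirstName); the schema and table name are assumptions.

```sql
-- Matches the index order: rows can be read straight off the index.
SELECT LastName, FirstName
FROM dbo.Employee
ORDER BY LastName, FirstName;

-- Exactly reversed: the same index can be read backward.
SELECT LastName, FirstName
FROM dbo.Employee
ORDER BY LastName DESC, FirstName DESC;

-- Mixed directions: neither a forward nor a backward read matches,
-- so expect an explicit sort step in the plan.
SELECT LastName, FirstName
FROM dbo.Employee
ORDER BY LastName ASC, FirstName DESC;

-- Grouping on the leading key can also take advantage of the index order.
SELECT LastName, COUNT(*) AS people
FROM dbo.Employee
GROUP BY LastName;
```

Checking the execution plan of each statement for a Sort operator is a quick way to confirm whether the index supplied the order.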
Thank you!

Geek Sync | Understand Indexes to Write Better Queries
