When you pass in a query, how does SQL Server build the results? Time to role play: Brent will be an end user sending in queries, and you will play the part of the SQL Server engine. Using simple spreadsheets as your tables, you will learn how SQL Server builds execution plans, uses indexes, performs joins, and considers statistics.
This session is for DBAs and developers who are comfortable writing queries, but not so comfortable when it comes to explaining nonclustered indexes, lookups, and sargability.
3. Print this 3-page PDF to follow along:
http://u.BrentOzar.com/engine.pdf
I know, I hate killing trees.
But having these next 3 pages in your hand will help a lot
as we talk through the demos.
4. Brent Ozar
Consultant, Brent Ozar Unlimited
I make SQL Server faster and more reliable.
I created sp_Blitz® and the SQL Server First
Responder Kit, and I love sharing knowledge at
BrentOzar.com. I hold a bunch of certifications and
awards including the rare Microsoft Certified Master.
You don’t care about any of that though.
Download the PDF: BrentOzar.com/go/enginepdf
/brentozar @brento brentozar
5. Agenda
When you pass in a query, how does SQL Server build the results? Time
to role play: Brent will be an end user sending in queries, and you will
play the part of the SQL Server engine. Using simple spreadsheets as
your tables, you will learn how SQL Server builds execution plans,
uses indexes, performs joins, and considers statistics.
This session is for DBAs and developers who are comfortable writing
queries, but not so comfortable when it comes to explaining
nonclustered indexes, lookups, sargability, fill factor, and corruption
detection.
27. Let’s add a sort.
SELECT Id
FROM dbo.Users
WHERE LastAccessDate > '2014/07/01'
ORDER BY LastAccessDate
28. Your execution plan
1. Shuffle through all of the pages,
writing down fields __________ for each record,
if their LastAccessDate > ‘2014/07/01’.
2. Sort the matching records by LastAccessDate.
31. Order By: cost is up ~4x
We needed space to
write down our results,
so we got a memory grant.
33. Memory is set when the query starts,
and not revised.
SQL Server has to assume other people
will run queries at the same time as you.
Your memory grant can change
each time you run a query.
You can’t always get
what you want.
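If you want to see the grant for yourself, here's a minimal sketch. It assumes you have VIEW SERVER STATE permission and that you run it from a second session while the ORDER BY query is still executing; the view only lists queries that currently hold or are waiting on a grant, so it may come back empty if the sort finishes quickly.

```sql
-- Run this from a second session while the ORDER BY query is executing.
-- Requires VIEW SERVER STATE permission.
SELECT session_id,
       requested_memory_kb,  -- what the query asked for when it started
       granted_memory_kb,    -- what it actually got (NULL while still waiting)
       used_memory_kb,       -- what it is using right now
       max_used_memory_kb,   -- the most it has used so far
       query_cost
FROM sys.dm_exec_query_memory_grants
WHERE session_id <> @@SPID;  -- skip this monitoring session itself
```

Comparing requested_memory_kb with granted_memory_kb is one way to see "you can't always get what you want" in numbers.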
35. Let’s get all the fields.
SELECT *
FROM dbo.Users
WHERE LastAccessDate > '2014/07/01'
ORDER BY LastAccessDate
36. Your execution plan
1. Shuffle through all of the pages,
writing down fields __________ for each record,
if their LastAccessDate > ‘2014/07/01’.
2. Sort the matching records by LastAccessDate.
38. Why does it suck?
Do we work harder to read the data?
Do we work harder to write the data?
Do we work harder to sort the data?
Do we work harder to output the data?
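One way to check those four questions yourself is to run both versions of the query with I/O and timing output turned on. This is a sketch, assuming the dbo.Users table from the demos and that no nonclustered index exists yet, so both queries read the same clustered index pages.

```sql
-- Show page reads and CPU/elapsed time in the Messages tab.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Narrow version: only Id has to be written down and sorted.
SELECT Id
FROM dbo.Users
WHERE LastAccessDate > '2014/07/01'
ORDER BY LastAccessDate;

-- Wide version: every field has to be written down and sorted.
SELECT *
FROM dbo.Users
WHERE LastAccessDate > '2014/07/01'
ORDER BY LastAccessDate;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Compare the logical reads, the CPU time, and the memory grant shown in each actual execution plan to see where the extra work lands.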
43. Let’s run it a few times.
SELECT *
FROM dbo.Users
WHERE LastAccessDate > '2014/07/01'
ORDER BY LastAccessDate;
GO 5
44. Your execution plan
1. Shuffle through all of the pages,
writing down all the fields for each record,
if their LastAccessDate > ‘2014/07/01’.
2. Sort the matching records by LastAccessDate.
3. Keep the output so you could reuse it the next
time you saw this same query?
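To test whether step 3 really happens, you can look at what SQL Server holds in memory between runs. A rough sketch, assuming VIEW SERVER STATE permission and that you're connected to the demo database: it counts the 8KB data pages cached for the current database. If the query gets no cheaper on repeat runs while this number stays high, that suggests pages are being cached rather than finished results.

```sql
-- Count the data pages currently cached in the buffer pool
-- for the database you are connected to.
SELECT COUNT(*)            AS cached_pages,
       COUNT(*) * 8 / 1024 AS cached_mb   -- each page is 8KB
FROM sys.dm_os_buffer_descriptors
WHERE database_id = DB_ID();
```

Run it before and after the GO 5 batch and compare. On a busy or large server this view can be slow to query, so try it on a test box first.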
48. Nonclustered indexes: copies.
Stored in the order we want
Include the fields we want
CREATE INDEX
IX_LastAccessDate_Id
ON dbo.Users(LastAccessDate, Id)
49. Let’s go simple again.
SELECT Id
FROM dbo.Users
WHERE LastAccessDate > '2014/07/01'
ORDER BY LastAccessDate;
50. Your execution plan
1. Grab IX_LastAccessDate_Id and seek to 2014/07/01.
2. Read the Ids out in order.
59. That’s a covering index.
It covers the fields we need in this query.
But if we change the query…
60. Let’s add a couple of fields.
SELECT Id, DisplayName, Age
FROM dbo.Users
WHERE LastAccessDate > '2014/07/01'
ORDER BY LastAccessDate;
61. One execution plan
1. Grab IX_LastAccessDate_Id, seek to 2014/07/01.
2. Write down the Id and LastAccessDate of
matching records.
3. Grab the clustered index (white pages), and look
up each matching row by their Id to get
DisplayName and Age.
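SQL Server might not pick this plan on its own; it may decide a clustered index scan is cheaper. To see the seek-plus-lookup plan for study purposes, one option is an index hint. This is a sketch for learning, assuming the IX_LastAccessDate_Id index from earlier exists, and not something to leave in production code.

```sql
-- Force the nonclustered index so the plan shows
-- an index seek followed by a key lookup.
SELECT Id, DisplayName, Age
FROM dbo.Users WITH (INDEX(IX_LastAccessDate_Id))
WHERE LastAccessDate > '2014/07/01'
ORDER BY LastAccessDate;
```

Turn on the actual execution plan before you run it, then compare the cost against the unhinted version of the same query.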
63. For simplicity, I told you I created this index with the Id.
SQL Server always includes your clustering keys whether
you ask for ‘em or not because it has to join indexes.
That’s why SQL Server includes the key
64. Key lookup is required
when the index doesn’t
have all the fields we need.
Hover your mouse over the
key lookup and look for the
OUTPUT fields.
Small? Frequently used?
Add ‘em to the index.
DO NOT ADD A NEW INDEX.
Classic index
tuning sign
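Here's roughly what "add 'em to the index" could look like for the DisplayName and Age lookup. It's a sketch, assuming the IX_LastAccessDate_Id index already exists with the same name and keys; DROP_EXISTING rebuilds that index in place instead of creating a second one.

```sql
-- Widen the existing index so it covers the query,
-- rather than adding a brand new index.
CREATE INDEX IX_LastAccessDate_Id
ON dbo.Users (LastAccessDate, Id)
INCLUDE (DisplayName, Age)        -- the key lookup's output fields
WITH (DROP_EXISTING = ON);        -- replace the old definition in place
```

The index name no longer describes everything in it, so you may want to rename it. Whether the extra fields are small and used often enough to be worth it is still your call.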
69. Statistics help SQL Server:
Decide which index to use
What order to process tables/indexes in
Whether to do seeks or scans
Guess how many rows will match your query
How much memory to allocate for the query
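To look at the statistics SQL Server is guessing from, you can dump them for one index. A minimal sketch, assuming the IX_LastAccessDate_Id index from the earlier demo exists on dbo.Users.

```sql
-- Returns three result sets: the header (rows, rows sampled, last updated),
-- the density vector, and the histogram of LastAccessDate values.
DBCC SHOW_STATISTICS ('dbo.Users', IX_LastAccessDate_Id);
```

The histogram's range rows are what feed the "how many rows will match" guess for a filter like LastAccessDate > '2014/07/01'.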
74. Keep statistics updated.
Automatic stats updates aren’t enough. Consider:
• http://Ola.Hallengren.com
• http://MinionWare.net/reindex
Typical strategy: weekly statistics updates
Updated statistics on an index invalidate query plans that
involve that index
• Affects your plan cache analysis
• Can cause unpredictable query plan changes
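If you aren't using one of those maintenance solutions yet, this sketch shows how you might check how stale the statistics on one table are, then refresh them by hand. It assumes the dbo.Users demo table; FULLSCAN reads every row, which is an assumption that suits a demo and may be too heavy for a big production table.

```sql
-- How old is each statistic, and how many changes since the last update?
SELECT s.name AS stats_name,
       sp.last_updated,
       sp.rows,
       sp.rows_sampled,
       sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID('dbo.Users');

-- Refresh every statistic on the table by reading all of its rows.
UPDATE STATISTICS dbo.Users WITH FULLSCAN;
```

Remember the trade-off on this slide: plans that use these statistics get recompiled afterwards.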
92. What we learned
Clustered indexes hold all the fields*
Nonclustered indexes are light-weight* copies of the table
NC indexes reduce not just reads, but also CPU work
SQL Server caches raw data pages, not query output
Statistics drive seek vs scan, index choice, memory
Statistics aren’t the only part: cardinality estimation matters
Includes and seeks aren’t magically delicious
We’re using the StackOverflow.com database as an example.
To download it, go to http://BrentOzar.com/go/querystack.
These screenshots are from the 2016/03 export, which is ~100GB.
If you use a newer or older export, your numbers of pages may vary.
This session focuses on the Users table
Id – primary key, clustered index. It’s an identity, starts at 1 and goes into the millions.
The white paper you’re holding in your hands – that’s the clustered index.
It includes all of the fields on the table – sort of.
Notice the About Me? It’s an NVARCHAR(MAX), and may not fit on a row. SQL Server may store that off-row, on other pages, if people get really wordy in their about-me field. We’re not going to touch on off-row data here, but I just want you to know there’s an overhead to that. Same thing with XML, JSON.
For old-school tables, everything is stored in 8KB pages.
These pages are the same whether they’re in memory or on disk.
It’s the smallest unit of data SQL Server works with.
(Things are different for Hekaton and columnstore indexes, but we’re focusing on old-school tables today.)
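To see how many 8KB pages your own copy of the Users table takes, something like this sketch should work. It assumes you're connected to the StackOverflow database; the numbers depend on which export you downloaded.

```sql
-- Pages and rows per index on dbo.Users.
SELECT i.name                       AS index_name,
       i.type_desc,                 -- CLUSTERED or NONCLUSTERED
       ps.row_count,
       ps.in_row_data_page_count,   -- regular 8KB data pages
       ps.lob_used_page_count,      -- off-row pages, e.g. long About Me text
       ps.used_page_count * 8 / 1024 AS used_mb
FROM sys.dm_db_partition_stats AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id  = ps.index_id
WHERE ps.object_id = OBJECT_ID('dbo.Users');
```

The lob_used_page_count column is where the off-row NVARCHAR(MAX) storage mentioned above shows up.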