
Getting Better Query Plans by Improving SQL's Estimates


You’ve been writing T-SQL queries for a few years now, and when you have performance issues, you’ve been updating stats and using OPTION (RECOMPILE). It’s served you well, but every now and then, you hit a problem you can’t solve. Your data’s been growing larger, your queries are taking longer to run, and you’re starting to wonder: how can I start getting better query plans?

The secret is often comparing the query plan’s estimated number of rows to actual number of rows. If they’re different, it’s up to you – not the SQL Server engine – to figure out why the guesses are wrong. To improve ’em, you can change your T-SQL, the way the data’s structured and stored, or how SQL Server thinks about the data.

This session won’t fix every query – but it’ll give you a starting point to understand what you’re looking at, and where to go next as you learn about the Cardinality Estimator.


Getting Better Query Plans by Improving SQL's Estimates

  1. 1. Cardinality Estimation: estimated vs. actual, fight to the death. Before we start, print page 1 of this PDF: BrentOzar.com/go/engine
  2. 2. 99-05: dev, architect, DBA • 05-08: DBA, VM, SAN admin • 08-10: MCM, Quest Software • Since then: consulting DBA • www.BrentOzar.com • Help@BrentOzar.com
  3. 3. Typical weekend errand Go to Binny’s. • If they have our champagne, buy it all. Then go to Whole Foods. • If you picked up champagne, get a lobster. • Otherwise, get one pound of salmon.
  4. 4. Bad idea Go to Binny’s. • If they have our champagne, buy it all. Then go to Whole Foods. • If you picked up champagne, get one lobster for each bottle of champagne. • Otherwise, get one pound of salmon.
  5. 5. Execution plans are like errands.
  6. 6. Typical query errand Go to the Users table. • Find the most popular Location. Then go back to the Users table. • Get all of the people in that Location. • Sort them by DisplayName.
  7. 7. Cardinality estimation is hard.
  8. 8. When SQL Server’s estimates are reasonably close to actuals, you’re getting a good plan. It may not be a fast plan, but it’ll accurately reflect the amount of work. (You could reduce work by changing the query or the indexes.)
  9. 9. When SQL Server’s estimates are nowhere near actuals (like 100x-1,000x off) you’re usually getting a bad plan.
  10. 10. Easy cardinality: GROUP BY Location Let’s start thinking like the engine again…
  11. 11. This demo uses MAXDOP 1. It’s not that it’s a good idea. I just want to keep this execution plan really simple. Parallelism makes this plan more complex to read. Don’t worry, the tuning answer isn’t “use more cores.”
  12. 12. Build an execution plan for this. SELECT Location, COUNT(*) FROM dbo.Users GROUP BY Location ORDER BY COUNT(*) DESC OPTION (MAXDOP 1); And tell me how much memory you’ll need (to keep it simple, how many rows will you handle?)
  13. 13. Let’s see what SQL Server estimates. In SSMS, click the Display Estimated Execution Plan button, or press CTRL + L, to gather the estimated execution plan.
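  If you’d rather gather it in T-SQL than with the SSMS button, a minimal sketch (SET SHOWPLAN_XML must be the only statement in its batch, hence the GO separators):

      -- Return the estimated plan as XML instead of executing the query.
      SET SHOWPLAN_XML ON;
      GO
      SELECT Location, COUNT(*)
      FROM dbo.Users
      GROUP BY Location
      ORDER BY COUNT(*) DESC
      OPTION (MAXDOP 1);
      GO
      SET SHOWPLAN_XML OFF;
      GO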
  14. 14. Right to left, top to bottom Read the way the arrows point
  15. 15. Step 1: Clustered index scan
  16. 16. Another way: check the arrows The arrows coming out of most operators are a faster/easier way to check row estimations.
  17. 17. Step 2: Hash Match Aggregate
  18. 18. Step 3: Compute Scalar
  19. 19. Step 4: Sort
  20. 20. Step 5: SELECT
  21. 21. Cost Each statement has a total cost: the sum of the costs of all of its operators. Cost is just an arbitrary unit for the estimated work to get done. We call that Query Bucks.
  22. 22. BrentOzar.com/go/querybucks
  23. 23. BrentOzar.com/go/querybucks
  24. 24. To discover if our estimates are wrong, we have to run it and get the actual plan.
  25. 25. Getting an actual plan The only way to get an actual plan is to capture it at the exact moment when the query finishes, like with: • SSMS, with actual plans turned on • Extended Events • Profiler trace You can’t get the actual plan from the DMVs after the query finishes. You’ll see why.
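  In T-SQL, a minimal sketch: SET STATISTICS XML ON runs the query and returns the actual plan alongside the results when it finishes:

      -- Execute the query and return the actual plan XML when it completes.
      SET STATISTICS XML ON;
      GO
      SELECT Location, COUNT(*)
      FROM dbo.Users
      GROUP BY Location
      ORDER BY COUNT(*) DESC
      OPTION (MAXDOP 1);
      GO
      SET STATISTICS XML OFF;
      GO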
  26. 26. Estimated plan top, actual bottom Spot the difference between them?
  27. 27. Hover over the hash match Estimated plan at left, actual plan at right. The actual plan adds actual row counts.
  28. 28. Every time a query runs, these can change: • Warnings • Row counts • Spills • Numbers of executions • Memory grants • Wait stats • Join types (in adaptive plans) The shape of the est vs actual plan is the same, but lots of metrics can change.
  29. 29. Step 1: Clustered Index Scan Estimates and actuals match. No surprise: it’s just a count of all the rows in the table, and the table contents aren’t changing.
  30. 30. Arrows: handy shortcuts Hover the mouse over the arrows to see mismatches. Doesn’t show actual data size, though.
  31. 31. Step 2: Hash match aggregate Now the estimate starts to matter: SQL Server needs memory to do this, and it only estimates enough memory to handle the estimated number of rows. When it runs out, it spills to TempDB.
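  While the query runs, you can watch that grant from another session; a hedged sketch using sys.dm_exec_query_memory_grants:

      -- Run from a second session while the query is executing.
      SELECT session_id,
             requested_memory_kb,  -- what the plan asked for, based on its estimates
             granted_memory_kb,    -- what SQL Server actually handed over
             max_used_memory_kb,   -- the most the query has really used so far
             ideal_memory_kb       -- what it would have wanted with perfect estimates
      FROM sys.dm_exec_query_memory_grants;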
  32. 32. How many Locations came out? We have 2 problems here: • Our estimates were wrong • Which means our memory estimates are wrong for the rest of the plan, too
  33. 33. How accurate were your own estimates?
  34. 34. SQL Server did the best it could. There’s no magic here. No person or system could possibly get this estimate right without changing something about the database, like storing the number of locations. More on that in a minute.
  35. 35. Step 3: Compute Scalar
  36. 36. Step 4: Sort Our bad estimates strike again: we didn’t estimate enough memory to do the sort. So now we spill the data to disk AGAIN.
  37. 37. Oops. Estimation accuracy matters a lot.
  38. 38. Step 5: SELECT
  39. 39. Every time the query runs… We’re going to underestimate row counts. We’re going to underallocate memory. We’re going to write to TempDB. We’re going to burn more CPU time sorting.
  40. 40. So how do we fix it? • Updating the existing statistics • OPTION (RECOMPILE) • Switching back to the 2012 Cardinality Estimator • Using 2017’s Adaptive Memory Grants • Adding a statistic (on what columns?) • Adding an index (on what columns?)
  41. 41. Does updating stats help? No – we don’t have stats on the Location column. We can update our brains out, but nothing will change.
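  For reference, “updating our brains out” means something like this; with no statistics object covering Location, it has nothing relevant to refresh:

      -- Rebuild every statistics object on the table from a full scan of the data.
      UPDATE STATISTICS dbo.Users WITH FULLSCAN;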
  42. 42. OPTION (RECOMPILE)? No, that doesn’t help either: we built a fresh plan for this, and it was still wrong.
  43. 43. The Cardinality Estimator (CE) The SQL Server engine code that guesses how many rows will come back from operations SQL Server 2005-2012: stayed the same SQL Server 2014+: new CE available if you set your database compatibility level to 2014 or higher
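  Switching between them is easy to demo; a sketch, assuming SQL Server 2016 SP1 or newer for the USE HINT syntax:

      -- Whole database: compatibility level 110 brings back the 2012 (legacy) CE.
      ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 110;

      -- Or per query: stay on the new compat level, but use the legacy CE here.
      SELECT Location, COUNT(*)
      FROM dbo.Users
      GROUP BY Location
      ORDER BY COUNT(*) DESC
      OPTION (USE HINT ('FORCE_LEGACY_CARDINALITY_ESTIMATION'), MAXDOP 1);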
  44. 44. No winners here – they both lose. Compatibility level 2012: Compat 2014/16/17:
  45. 45. The new CE was just new. It’s not magic. It’s not even new anymore.
  46. 46. SQL 2017 helps – kinda. Adaptive Memory Grants: SQL Server tracks when this happens, and starts adjusting memory grants over time. Unfortunately, right now it only works in batch mode: meaning, we need a columnstore index in the query. That’s not the right answer to this tuning problem.
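  For completeness, the workaround people reach for is an empty filtered columnstore index just to unlock batch mode; a sketch of the trick (hypothetical index name), not a recommendation:

      -- The contradictory filter keeps this index permanently empty, but its
      -- mere existence lets the optimizer consider batch mode for this table.
      CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_BatchModeTrick
      ON dbo.Users (Id)
      WHERE Id = -1 AND Id = -2;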
  47. 47. Will adding a statistic help? CREATE STATISTICS Stat_Location ON dbo.Users(Location);
  48. 48. Will adding a statistic help? CREATE STATISTICS Stat_Location ON dbo.Users(Location); Even if it does help, you still have to sort every time:
  49. 49. Will adding an index help? CREATE INDEX IX_Location ON dbo.Users(Location);
  50. 50. Will adding an index help? CREATE INDEX IX_Location ON dbo.Users(Location); Getting better:
  51. 51. And check out those estimates. We know how many rows will come out. Rows are sorted: no hash match, now a stream aggregate. We still have to sort by most popular locations, though.
  52. 52. Sort: still spills! Look at estimated vs actual. We knew we’d get 110k rows. But SQL Server still didn’t allocate enough memory.
  53. 53. Query tuning involves Each plan operator tries to predict: • How much data will come in • What resources we’ll need to perform the required work in this operator • How much data will go out to the next operator Your job as a query tuner: figure out where those are going wrong, and make changes to make SQL Server’s life easier.
  54. 54. When SQL Server’s estimates are reasonably close to actuals, you’re getting a good plan. It may not be a fast plan, but it’ll accurately reflect the amount of work. (You could reduce work by changing the query or the indexes.)
  55. 55. Query tuning involves Each plan operator tries to predict: • How much data will come in • What resources we’ll need to perform the required work in this operator • How much data will go out to the next operator Your job as a query tuner: figure out where those are going wrong, and make changes to make SQL Server’s life easier. Changed with the index
  56. 56. Query tuning involves Each plan operator tries to predict: • How much data will come in • What resources we’ll need to perform the required work in this operator • How much data will go out to the next operator Your job as a query tuner: figure out where those are going wrong, and make changes to make SQL Server’s life easier. We can change this too
  57. 57. Here’s the query again. SELECT Location, COUNT(*) FROM dbo.Users GROUP BY Location ORDER BY COUNT(*) DESC OPTION (MAXDOP 1); Do we really need all locations every time this runs? Or can we paginate the data in our application?
  58. 58. “Buy it all” is a bad errand. Go to Binny’s. • If they have our champagne, buy it all. Then go to Whole Foods. • If you picked up champagne, get a lobster. • Otherwise, get one pound of salmon.
  59. 59. Let’s try just the top 100. SELECT TOP 100 Location, COUNT(*) FROM dbo.Users GROUP BY Location ORDER BY COUNT(*) DESC OPTION (MAXDOP 1);
  60. 60. No spills on the sort. And now, the only yellow bang is on the SELECT. SQL Server is complaining that we granted TOO MUCH memory – and it’s just 1MB.
  61. 61. The sorts are different. Sorting all rows, and keeping them all. Sorting all rows, but only keeping N of them. And the sort algorithm even changes depending on what N is! https://www.brentozar.com/archive/2017/09/much-can-one-row-change-query-plan-part-1/
  62. 62. What we learned so far To get accurate estimates, the Cardinality Estimator (CE) needs statistics. Indexes get you those statistics. You can’t just tune the queries in isolation: the right indexes are a required foundation. Even when estimates are completely accurate, you still may not get the memory you might want to join/sort everything in memory. (Especially for large real-world loads.)
  63. 63. Now let’s do a 2-part errand. A one, and a two
  64. 64. Queries usually multiply data. Go to Binny’s. • If they have our champagne, buy it all. Then go to Whole Foods. • If you picked up champagne, get one lobster for each bottle of champagne. • Otherwise, get one pound of salmon.
  65. 65. Build an execution plan for this. DECLARE @TopLocation NVARCHAR(100); SELECT TOP 1 @TopLocation = Location FROM dbo.Users WHERE Location <> '' GROUP BY Location ORDER BY COUNT(*) DESC; SELECT * FROM dbo.Users WHERE Location = @TopLocation ORDER BY DisplayName;
  66. 66. Build an execution plan for this. DECLARE @TopLocation NVARCHAR(100); SELECT TOP 1 @TopLocation = Location FROM dbo.Users WHERE Location <> '' GROUP BY Location ORDER BY COUNT(*) DESC; SELECT * FROM dbo.Users WHERE Location = @TopLocation ORDER BY DisplayName; Variables only store 1 row.
  67. 67. Build an execution plan for this. DECLARE @TopLocation NVARCHAR(100); SELECT TOP 1 @TopLocation = Location FROM dbo.Users WHERE Location <> '' GROUP BY Location ORDER BY COUNT(*) DESC; SELECT * FROM dbo.Users WHERE Location = @TopLocation ORDER BY DisplayName; We’re only going to put 1 value in @TopLocation. If we thought really hard, we could possibly even use statistics to predict what that value might be.
  68. 68. Build an execution plan for this. DECLARE @TopLocation NVARCHAR(100); SELECT TOP 1 @TopLocation = Location FROM dbo.Users WHERE Location <> '' GROUP BY Location ORDER BY COUNT(*) DESC; SELECT * FROM dbo.Users WHERE Location = @TopLocation ORDER BY DisplayName; But now the SELECT runs. How can it predict how many rows will return?
  69. 69. The whole plan is built at once. DECLARE @TopLocation NVARCHAR(100); SELECT TOP 1 @TopLocation = Location FROM dbo.Users WHERE Location <> '' GROUP BY Location ORDER BY COUNT(*) DESC; SELECT * FROM dbo.Users WHERE Location = @TopLocation ORDER BY DisplayName;
  70. 70. Estimates are built at once, too.
  71. 71. Of course the estimates are wrong. Boo, hiss
  72. 72. SQL Server compiles batches. It has to build an execution plan for the entire batch all at the same time. A stored procedure is a big batch. You can put OPTION (RECOMPILE) on statements, forcing SQL Server to build a new execution plan given what it knows so far.
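  Applied to the demo batch, that looks roughly like this: the hint makes SQL Server compile the second statement only after @TopLocation has its value, so the variable can be sniffed like a literal:

      DECLARE @TopLocation NVARCHAR(100);

      SELECT TOP 1 @TopLocation = Location
      FROM dbo.Users
      WHERE Location <> ''
      GROUP BY Location
      ORDER BY COUNT(*) DESC;

      -- Build a fresh plan for just this statement, now that @TopLocation is known.
      SELECT *
      FROM dbo.Users
      WHERE Location = @TopLocation
      ORDER BY DisplayName
      OPTION (RECOMPILE);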
  73. 73. New plan for this one statement
  74. 74. A different plan. At first, this might look worse: a clustered index scan of the entire Users table, ignoring the Locations index. But it actually does fewer logical reads, because we’re dealing with a lot of Users. And note – no spill on the sort.
  75. 75. It’s asking for a missing index. But it includes EVERY SINGLE FIELD. Over here in reality, we can’t usually create indexes like that. (If you can, great, do it – but notice AboutMe and its horrible datatype.)
  76. 76. What OPTION (RECOMPILE) does Forces SQL Server to stop and build a new plan Takes effect at the level where you put it Here, we’re recompiling a single statement because what comes out of the prior query changes EVERYTHING we do: • The index we use, and the way we use it • How much memory we need
  77. 77. Using recompile hints They help when SQL Server needs to reset expectations about what it’s about to do But you have to know where to put them Typically best used when: • You’re doing multi-step processing • The amount of data varies WIDELY • You can identify the point where things change dramatically, and stick the hint there
  78. 78. Possible fix: Combine queries
  79. 79. I got rid of the variable. Former query #1 Former query #2
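  The slide shows the rewrite as a screenshot; a sketch of what the combined query might look like, assuming a CTE that joins the TOP 1 location back to Users:

      WITH TopLocation AS (
          SELECT TOP 1 Location
          FROM dbo.Users
          WHERE Location <> ''
          GROUP BY Location
          ORDER BY COUNT(*) DESC
      )
      SELECT u.*
      FROM dbo.Users u
      INNER JOIN TopLocation t ON u.Location = t.Location
      ORDER BY u.DisplayName;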
  80. 80. But the plan is built all at once. SQL Server knows how many rows the CTE will produce: just 1, SELECT TOP 1. But it has no idea what the location will be. It doesn’t execute the CTE first, get the location, and then execute the query. This whole thing is done at once.
  81. 81. Read the plan right to left. SQL Server’s doing the same thing it did before in the 2-query process: first, it builds the list of most popular locations, takes the top 1, and then looks up the users in that location.
  82. 82. Is this estimate going to be right? Do we know how many locations we’ll find? Scan the list of locations
  83. 83. Is this estimate going to be right? Do we know how many locations the Sort will push out? Sort them by COUNT(*)
  84. 84. This one is a little trickier. We’re going to seek to that Location name. But 2 questions: 1. How many times are we going to seek to a Location? 2. How many rows are going to come out of the seek? Seek to that Location name
  85. 85. From “Think Like the Engine” A “seek” sounds like it’s only going to return 1 row. A “scan” sounds like it returns the whole table. But here’s what they really mean: Seek = start reading at one specific location Scan = start reading at either end of the object
  86. 86. Decoding it Seek predicate: jump to the Location that we found in the earlier operations, and start reading there. We just don’t know what that Location will be when we build the plan. We can’t possibly know.
  87. 87. Estimated rows The estimate is just garbage. SQL Server’s using the density vector (more on that in Think Like the Engine and Mastering Index Tuning.) It’s guessing based on how many rows the average Location has. But we’re asking for the biggest one.
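  You can see that density vector yourself; a minimal sketch, using the IX_Location index created earlier:

      -- The "All density" value times the table's row count gives the average
      -- rows per Location, which is where an equality estimate like 13 comes from.
      DBCC SHOW_STATISTICS ('dbo.Users', IX_Location) WITH DENSITY_VECTOR;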
  88. 88. Actual rows Estimates: 13 Actual: 37,810 These are way off, and it has a cascading effect on the rest of the plan.
  89. 89. This one is a little trickier. So now, as we move through the rest of the plan, we’re going to have issues. Est 13, actual 37,810
  90. 90. This one is a little trickier. For each User.Id we found, go get the rest of the fields that weren’t included in our Location index. Again, 2 questions: 1. How many times are we going to do this key lookup? 2. How many rows are going to come out of each one? Get the SELECT *
  91. 91. The entire popup I’ll zoom in on this one part… The important part
  92. 92. Dammit, Beavis FOR THE LOVE OF ALL THAT’S HOLY CAN YOU PLEASE PUT THE FIELDS IN SOME KIND OF ORDER AND BE CONSISTENT ABOUT WHETHER YOU PREFIX THINGS WITH ACTUAL OR NOT
  93. 93. Decoding it • Estimated Number of Executions: 13 • Number of Executions: 37,810 – which is bad, because each one produces a few logical reads, so we read more pages than the entire table • Estimated Number of Rows: 1 – the number of rows that will come out of EACH OPERATION, which is really misleading • Actual Number of Rows: 37,810 – the total number of rows that came out of ALL THE OPERATIONS
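  Putting those numbers together: the plan budgeted 13 executions × 1 estimated row each = 13 rows total, but the lookup actually ran 37,810 times – roughly 2,900 times more work than the memory grant and the rest of the plan were sized for.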
  94. 94. Is this estimate going to be right? The Sort needed to estimate how many rows it’d be dealing with, which affects our overall memory grant. Uh no
  95. 95. This CTE is a great example. Even in a small query like this: • SQL Server processes data in order, in steps • Can kinda be thought of as a stored procedure • Estimates can go wrong in any step Our job: • Read the plan right to left, top to bottom • Understand where estimates are going wrong • Help SQL Server make better estimations
  96. 96. Whew.
  97. 97. What we learned Cardinality estimation involves: • Predicting how many rows will come back • Guessing the contents of those rows to predict how many rows will come back from other operations Lots of ways to accomplish it with varying success: • Updating the existing statistics • OPTION (RECOMPILE) • Switching back to the 2012 Cardinality Estimator • 2017’s Adaptive Query Processing • Changing the database (indexes, stats) • Rewriting queries to break them up, or combine them
  98. 98. Wanna learn more? • Mastering Index Tuning • Mastering Query Tuning • Mastering Server Tuning • PASS Summit Pre-Con • BrentOzar.com/training
