Statistics are a crucial concept in SQL Server: they explain why a specific query plan was chosen to execute your query. In this slideshow, I explain some basic concepts of SQL Server Statistics.
Got a question or comment? Write to me at love@withsqlserver.com
Happy DBA'ing!
3. Distribution of data
When tables hold millions of rows, how the data is distributed really matters
(to the database!). For example, it is useful to know that 35% of the employees
(from the data in an Employee table) work from France, 28% from Germany and so on.
Distribution of data really matters!
http://www.sqlserverapp.com/
4. Why does it matter so much?
So why all this obsession with data distribution?!
Because only then can SQL Server optimize query execution.
Before it runs a query, SQL Server needs to ‘estimate’ how much data will be fetched.
For example, ‘Is the query fetching about 10% of the data from the table?’
Or ‘Is it getting almost 90% of the data in the table?’
Based on this estimate, the SQL Server Query Optimizer makes several choices about how to
execute the query.
For example, it decides whether or not to use a specific index during execution.
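To make the idea concrete, here is a minimal sketch in Python of how a known data distribution can drive an estimate and an access choice. The France and Germany fractions come from the earlier slide; the "Iceland" and "Other" entries, the function names and the 5% cutoff are assumptions made up for illustration. The real Query Optimizer compares plan costs rather than applying a fixed cutoff.

```python
# Fractions of the Employee table per country.
# France and Germany are from the slide; the rest are assumed for illustration.
DISTRIBUTION = {"France": 0.35, "Germany": 0.28, "Iceland": 0.01, "Other": 0.36}

# Hypothetical cutoff: below this fraction we pretend an index is worthwhile.
# SQL Server does not use a fixed cutoff; it compares estimated plan costs.
SEEK_CUTOFF = 0.05


def estimate_rows(table_rows, country):
    """Estimate how many rows a filter on `country` would return."""
    return round(table_rows * DISTRIBUTION[country])


def choose_access(country):
    """Pick an access method from the estimated fraction of rows."""
    if DISTRIBUTION[country] <= SEEK_CUTOFF:
        return "index seek"
    return "table scan"


if __name__ == "__main__":
    for country in ("France", "Iceland"):
        rows = estimate_rows(1_000_000, country)
        print(f"{country}: ~{rows} rows, {choose_access(country)}")
```

A filter that matches a small slice of the table favors the index, while one that matches a third of the table is usually cheaper to answer by reading the whole table.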
6. SQL Server uses a really cool way to track data distribution.
And that’s what we call
Statistics
Statistics … a crucial SQL technique
7. sys.stats
is our hero here!
Not that you are going to look into its data much.
But it is worth getting the hang of it when you find some time
or during your coffee break!
Catalog view for Statistics
https://docs.microsoft.com/en-us/sql/relational-databases/system-catalog-views/sys-stats-transact-sql
8. Each Statistics object is created on one or more columns
of a table or indexed view in SQL Server.
For each object, SQL Server maintains a histogram depicting the distribution of values in its first column.
Statistics objects
Tidbit: What is a Histogram?
A histogram groups data into ranges and shows how much data falls in each range.
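As a small illustration of the tidbit, the sketch below builds a generic fixed-width histogram in Python. The `build_histogram` helper and the sample ages are hypothetical. SQL Server builds its histogram steps differently (it picks step boundaries from the column values, up to 200 steps), so treat this only as a picture of the general idea.

```python
from collections import Counter


def build_histogram(values, bucket_width):
    """Group integer values into fixed-width ranges and count each range."""
    # Map every value to the lower bound of its bucket, then count per bucket.
    counts = Counter((v // bucket_width) * bucket_width for v in values)
    return {(low, low + bucket_width - 1): counts[low] for low in sorted(counts)}


# Hypothetical sample data: employee ages.
ages = [23, 27, 31, 35, 36, 38, 41, 44, 52, 58]
histogram = build_histogram(ages, 10)

if __name__ == "__main__":
    for (low, high), count in histogram.items():
        print(f"{low}-{high}: {'#' * count} ({count})")
```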
9. When the automatic creation/updating of Statistics is enabled in SQL Server, SQL Server decides
whether it is really necessary to update the Statistics before a query is run.
It makes this decision based on how much the data in the related tables has changed since the last
Statistics update.
When are they created?
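The "how much has changed" check can be sketched as a threshold on the number of modified rows. The Python below is a rough model based on my reading of Microsoft's documentation: 500 modifications for tables of up to 500 rows, 500 + 20% of the rows for larger tables, and the smaller of that value and SQRT(1000 * rows) on newer compatibility levels. Treat the exact numbers as an assumption and check the documentation for your version. The function names and the `dynamic` flag are hypothetical.

```python
import math


def auto_update_threshold(row_count, dynamic=True):
    """Approximate number of modifications after which statistics count as stale.

    Assumed rules, not taken from the slides:
      - up to 500 rows: 500 modifications
      - more than 500 rows: 500 + 20% of the row count
      - with dynamic=True: the smaller of the above and sqrt(1000 * row_count)
    """
    if row_count <= 500:
        return 500
    classic = 500 + 0.20 * row_count
    if dynamic:
        return min(classic, math.sqrt(1000 * row_count))
    return classic


def is_stale(row_count, modifications, dynamic=True):
    """Return True when enough rows changed to justify an automatic update."""
    return modifications >= auto_update_threshold(row_count, dynamic)


if __name__ == "__main__":
    rows = 1_000_000
    print(round(auto_update_threshold(rows, dynamic=False)))
    print(round(auto_update_threshold(rows, dynamic=True)))
```

Under these assumed rules, the dynamic threshold makes large tables refresh their Statistics far sooner than the fixed 20% rule would.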
10. While the concept of Statistics is really cool, we need to keep in mind that it has a performance cost.
Maintaining the distribution of data for every table and indexed view is not trivial.
Keeping the Statistics up to date is important, but the question is how up to date they need to be.
Updating Statistics for, say, every insert/delete that happens in a table might be
way too costly. However, updating too rarely might also turn out badly, because queries will then be
executed (i.e. query plans will be chosen by the Query Optimizer) using stale values in the Statistics.
When are they created?
12. Yes! You can create one yourself with the CREATE STATISTICS statement.
But again, be wary of the performance cost of maintaining a Statistics object.
Can you create a Statistics object?
13. This is possible too. You can update Statistics objects whenever you want.
You can use the stored procedure sp_updatestats, which updates Statistics across the current database.
Updating Statistics at your will
15. Updating Statistics at your will
Be wary: updating Statistics can trigger recompilation of the queries that depend on them!
16. Happy DBA’ing!
See you soon with another interesting SQL Server concept!
Until then …
Referenced from MSDN
iKosmik
http://www.sqlserverapp.com/
Follow us to get notified on SQL Server concepts and tidbits