Why & how to optimize SQL Server for performance from design to query

  1. 1. Why & How to optimize SQL Server for performance from design to query • Antonios Chatzipavlis • Software Architect, Development Evangelist, IT Consultant • MCT, MCITP, MCPD, MCSD, MCDBA, MCSA, MCTS, MCAD, MCP, OCA
  2. 2. Objectives • Why is Performance Tuning Necessary? • How to Optimize SQL Server for performance • Optimizing Database Design • Optimizing Queries for performance. • Optimizing an Indexing Strategy • How to troubleshoot SQL Server • Define and implement monitoring standards for database servers and instances. • Troubleshoot database server & database performance issues. • Troubleshoot SQL Server connectivity issues. • Troubleshoot SQL Server concurrency issues. 2
  3. 3. Why is performance tuning necessary? 3
  4. 4. Why is Performance Tuning Necessary? • Allowing your system to scale • Adding more customers • Adding more features • Improve overall system performance • Save money by not wasting resources • The database is typically one of the most expensive resources in a datacenter
  5. 5. General Scaling Options (1) • Scaling SQL Server with Bigger Hardware • Purchase a larger server, and replace the existing system. • Works well with smaller systems. • Cost prohibitive for larger systems. • Can be a temporary solution. • Scaling SQL Server with More Hardware • Purchase more hardware and split or partition the database. • Partitioning can be either vertical or horizontal • Horizontal: Split the rows of the database based on a specific demographic such as time zone or zip code. • Vertical: Split components out of one database into another
  6. 6. General Scaling Options (2) • Scaling SQL Server without adding hardware • Adjusting and rewriting queries. • Adding indexes. • Removing indexes. • Re-architecting the database schema. • Moving things that shouldn’t be in the database. • Eliminating redundant work on the database. • Caching of data. • Other performance tuning techniques.
  7. 7. How to Optimize SQL Server for performance Optimizing Database Design 7
  8. 8. Performance Optimization Model (layers, in the order shown on the slide) • Server Tuning • Locking • Indexing • Query Optimization • Schema Design
  9. 9. Schema Design Optimization • Normalization • Denormalization • Generalization • Data Abstraction Layer
  10. 10. Normalization In this process you organize data to minimize redundancy, which eliminates duplicated data and logical ambiguities in the database. • First Normal Form: Every attribute is atomic, and there are no repeating groups • Second Normal Form: Complies with First Normal Form, and all non-key columns depend on the whole key • Third Normal Form: Complies with Second Normal Form, and all non-key columns are non-transitively dependent upon the primary key
  11. 11. Denormalization In this process you re-introduce redundancy to the database to optimize performance. • When to use denormalization: • To pre-aggregate data • To avoid multiple/complex joins • When not to use denormalization: • To prevent simple joins • To provide reporting data • To prevent same row calculations
  12. 12. Generalization In this process you group similar entities together into a single entity to reduce the amount of required data access code. • Use generalization when: • A large number of entities appear to be of the same type • Multiple entities contain the same attributes • Do not use generalization when: • It results in an overly complex design that is difficult to manage
  13. 13. How to Optimize SQL Server for performance Optimizing Queries for performance 13
  14. 14. Key Measures for Query Performance • Key factors for query performance: • Resources used to execute the query • Time required for query execution • SQL Server tools to measure query performance: • Performance Monitor • SQL Server Profiler
  15. 15. Useful Performance Counters SQLServer:Access Methods Range Scans/sec. Measures the number of qualified range scans through indexes in the last second. Full Scans/sec. Measures the number of unrestricted full scans in the last second. Index Searches/sec. Measures the number of index searches in the last second. Table Lock Escalations/sec. Measures the number of lock escalations on a table. Worktables Created/sec. Measures the number of worktables created in the last second. 15
  16. 16. Useful Performance Counters SQLServer:SQL Statistics Batch Requests/sec. Measures the number of Transact-SQL command batches received per second. High batch requests mean good throughput. SQL Compilations/sec. Measures the number of SQL compilations per second. This value reaches a steady state after SQL Server user activity is stable. SQL Re-Compilations/sec. Measures the number of SQL recompiles per second. 16
  17. 17. Useful Performance Counters SQLServer:Databases Transactions/sec. Measures the number of transactions started for the database in the last second 17
  18. 18. Useful Performance Counters SQLServer:Transactions Longest Transaction Running Time. Measures the length of time in seconds since the start of the transaction that has been active longer than any other current transaction. If this counter shows a very long transaction, you can query the sys.dm_tran_active_transactions dynamic management view to identify the transaction. Update conflict ratio. Measures the percentage of those transactions using the snapshot isolation level that have encountered update conflicts within the last second.
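As a rough sketch of the lookup mentioned for Longest Transaction Running Time, the query below lists active transactions oldest first. The join to sys.dm_tran_session_transactions is an addition for convenience (it maps a transaction to its session) and is not part of the slide.

```sql
-- Sketch: find the longest-running active transactions.
-- Both DMVs are built in; the aliases and the computed column are illustrative.
SELECT  tat.transaction_id,
        tat.name,
        tat.transaction_begin_time,
        DATEDIFF(SECOND, tat.transaction_begin_time, GETDATE()) AS seconds_active,
        tst.session_id              -- NULL for transactions not tied to a session
FROM    sys.dm_tran_active_transactions AS tat
LEFT JOIN sys.dm_tran_session_transactions AS tst
        ON tst.transaction_id = tat.transaction_id
ORDER BY tat.transaction_begin_time;   -- oldest transaction first
```

Running this requires VIEW SERVER STATE permission on the instance.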
  19. 19. Useful Performance Counters SQLServer:Locks Average Wait Time (ms). Measures the average wait time for each lock request that resulted in a wait. Lock Requests/sec. Measures the number of locks and lock conversions per second. Lock Wait Time (ms). Measures the total wait time for locks in the last second. Lock Waits/sec. Measures the number of lock requests per second that required the caller to wait. 19
  20. 20. Useful SQL Profiler Events • Stored Procedures category: • RPC:Completed occurs when a remote procedure call has completed. • SP:Completed occurs when a stored procedure has completed. • SP:StmtCompleted occurs when a T-SQL statement in a stored procedure has completed. • TSQL category: • SQL:StmtCompleted occurs when a T-SQL statement has completed. • SQL:BatchCompleted occurs when a Transact-SQL batch has completed. • Locks category: • Lock:Acquired occurs when a transaction acquires a lock on a resource. • Lock:Released occurs when a transaction releases a lock on a resource. • Lock:Timeout occurs when a lock request has timed out because another transaction holds a blocking lock on the required resource.
  21. 21. Guidelines for Identifying Locking and Blocking • Use Activity Monitor • Use SQL Server Profiler blocked process report • Watch for situations in which the same procedure executes in different amounts of time • Identify the transaction isolation level of the procedure 21
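Alongside Activity Monitor and the blocked process report, blocking can also be inspected from T-SQL. The following is a minimal sketch, assuming SQL Server 2005 or later; it is one possible way to see who is blocked by whom, not the method the slides prescribe.

```sql
-- Sketch: list requests that are currently blocked, with their blocker,
-- wait details, isolation level, and the text of the running batch.
SELECT  r.session_id,
        r.blocking_session_id,           -- session holding the conflicting lock
        r.wait_type,
        r.wait_time AS wait_time_ms,
        r.transaction_isolation_level,   -- 1 = read uncommitted ... 5 = snapshot
        t.text AS sql_text
FROM    sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE   r.blocking_session_id <> 0;
```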
  22. 22. Logical Execution of Query
  23. 23. Sample • Customers table data (customerid, city): ANTON, Athens; CHRIS, Salonica; FANIS, Athens; NASOS, Athens • Orders table data (orderid, customerid): 1, NASOS; 2, NASOS; 3, FANIS; 4, FANIS; 5, FANIS; 6, CHRIS; 7, NULL
  24. 24. Sample query: SELECT C.customerid, COUNT(O.orderid) AS numorders FROM dbo.Customers AS C LEFT OUTER JOIN dbo.Orders AS O ON C.customerid = O.customerid WHERE C.city = 'Athens' GROUP BY C.customerid HAVING COUNT(O.orderid) < 3 ORDER BY numorders; • Result (customerid, numorders): ANTON, 0; NASOS, 2
  25. 25. 1st Step - Cross Join • FROM dbo.Customers AS C ... JOIN dbo.Orders AS O • Result (C.customerid, city, orderid, O.customerid), 4 x 7 = 28 rows: ANTON, Athens, 1, NASOS; ANTON, Athens, 2, NASOS; ANTON, Athens, 3, FANIS; ANTON, Athens, 4, FANIS; ANTON, Athens, 5, FANIS; ANTON, Athens, 6, CHRIS; ANTON, Athens, 7, NULL; CHRIS, Salonica, 1, NASOS; CHRIS, Salonica, 2, NASOS; CHRIS, Salonica, 3, FANIS; CHRIS, Salonica, 4, FANIS; CHRIS, Salonica, 5, FANIS; CHRIS, Salonica, 6, CHRIS; CHRIS, Salonica, 7, NULL; FANIS, Athens, 1, NASOS; FANIS, Athens, 2, NASOS; FANIS, Athens, 3, FANIS; FANIS, Athens, 4, FANIS; FANIS, Athens, 5, FANIS; FANIS, Athens, 6, CHRIS; FANIS, Athens, 7, NULL; NASOS, Athens, 1, NASOS; NASOS, Athens, 2, NASOS; NASOS, Athens, 3, FANIS; NASOS, Athens, 4, FANIS; NASOS, Athens, 5, FANIS; NASOS, Athens, 6, CHRIS; NASOS, Athens, 7, NULL
  27. 27. 3rd Step - Apply OUTER Join • FROM dbo.Customers AS C LEFT OUTER JOIN dbo.Orders AS O • Result (C.customerid, city, orderid, O.customerid): CHRIS, Salonica, 6, CHRIS; FANIS, Athens, 3, FANIS; FANIS, Athens, 4, FANIS; FANIS, Athens, 5, FANIS; NASOS, Athens, 1, NASOS; NASOS, Athens, 2, NASOS; ANTON, Athens, NULL, NULL
  28. 28. 4th Step - Apply WHERE filter • WHERE C.city = 'Athens' • Result (C.customerid, city, orderid, O.customerid): FANIS, Athens, 3, FANIS; FANIS, Athens, 4, FANIS; FANIS, Athens, 5, FANIS; NASOS, Athens, 1, NASOS; NASOS, Athens, 2, NASOS; ANTON, Athens, NULL, NULL
  29. 29. 5th Step - Apply Grouping • GROUP BY C.customerid • Result (C.customerid, city, orderid, O.customerid), one group per customer: FANIS, Athens, 3, FANIS; FANIS, Athens, 4, FANIS; FANIS, Athens, 5, FANIS; NASOS, Athens, 1, NASOS; NASOS, Athens, 2, NASOS; ANTON, Athens, NULL, NULL
  30. 30. 6th Step - Apply Cube or Rollup 30
  31. 31. 7th Step - Apply HAVING Filter • HAVING COUNT(O.orderid) < 3 • Result (C.customerid, city, orderid, O.customerid): NASOS, Athens, 1, NASOS; NASOS, Athens, 2, NASOS; ANTON, Athens, NULL, NULL
  32. 32. 8th Step - Apply SELECT List SELECT C.customerid, COUNT(O.orderid) AS numorders Customerid numorders NASOS 2 ANTON 0 32
  33. 33. 9th Step - Apply DISTINCT 33
  34. 34. 10th Step - Apply ORDER BY Customerid numorders ANTON 0 NASOS 2 34
  35. 35. 11th Step - Apply TOP 35
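The logical steps above can be tied back to the sample query with a small runnable sketch. Two things here are assumptions: the column types in the CREATE TABLE statements, and the filter column C.city, which is inferred from the sample data because the column name is missing in the transcript. Step 2 has no slide in this transcript; the comment assumes it is the ON filter, which fits the rows shown for step 3.

```sql
-- Sketch: sample data from the slides (column types are assumed).
CREATE TABLE dbo.Customers (customerid char(5) NOT NULL PRIMARY KEY,
                            city varchar(10) NOT NULL);
CREATE TABLE dbo.Orders    (orderid int NOT NULL PRIMARY KEY,
                            customerid char(5) NULL);

INSERT INTO dbo.Customers (customerid, city)
VALUES ('ANTON', 'Athens'), ('CHRIS', 'Salonica'),
       ('FANIS', 'Athens'), ('NASOS', 'Athens');

INSERT INTO dbo.Orders (orderid, customerid)
VALUES (1, 'NASOS'), (2, 'NASOS'), (3, 'FANIS'), (4, 'FANIS'),
       (5, 'FANIS'), (6, 'CHRIS'), (7, NULL);

-- Comments give the logical processing order, not the physical plan.
SELECT C.customerid, COUNT(O.orderid) AS numorders  -- step 8 (9 DISTINCT, 11 TOP unused)
FROM dbo.Customers AS C                              -- step 1: cross join
LEFT OUTER JOIN dbo.Orders AS O                      -- step 3: add outer rows
    ON C.customerid = O.customerid                   -- step 2 (assumed): ON filter
WHERE C.city = 'Athens'                              -- step 4
GROUP BY C.customerid                                -- step 5 (6 CUBE/ROLLUP unused)
HAVING COUNT(O.orderid) < 3                          -- step 7
ORDER BY numorders;                                  -- step 10
```

COUNT(O.orderid) ignores the NULL produced by the outer join, which is why ANTON is reported with zero orders rather than one.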
  36. 36. How the Query Optimizer Processes Queries 36
  37. 37. Considerations to Take When Using Subqueries • A subquery can return a single scalar value (an expression), a single column of values (a single-column table), or multiple columns (a data set). • Select list: a scalar subquery's result is used as an expression supplying the value for the column. • FROM clause (derived table): this is the only location where a subquery can act as a table. For all three result shapes, the subquery’s data set is used as a virtual table within the outer query. • WHERE clause, comparison predicates, x {=, >, <, >=, <=, <>} (subquery): with a scalar subquery, the predicate is true if the test value compares with the subquery’s scalar value and returns true. • WHERE clause, IN predicate, x IN (subquery): with a scalar subquery, the predicate is true if the test value is equal to the value returned by the subquery; with a single-column subquery, it is true if the test value is found within the values returned by the subquery. • WHERE clause, EXISTS predicate, EXISTS (subquery): for all three result shapes, the predicate is true if the subquery returns at least one row. • Consider queries on a case-by-case basis
  38. 38. Top 10 for Building Efficient Queries
  40. 40. Favor set-based logic over procedural or cursor logic • The most important factor to consider when tuning queries is how to properly express logic in a set-based manner. • Cursors or other procedural constructs limit the query optimizer’s ability to generate flexible query plans. • Cursors can therefore reduce the possibility of performance improvements in many situations 40
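To make the contrast concrete, here is a minimal sketch that applies the same 10 percent price increase twice: once row by row with a cursor and once as a single set-based statement. The #Products table and its data are hypothetical and exist only for illustration.

```sql
-- Hypothetical temp table, for illustration only.
CREATE TABLE #Products (productid int PRIMARY KEY,
                        category varchar(20),
                        price money);
INSERT INTO #Products (productid, category, price)
VALUES (1, 'Books', 10.00), (2, 'Books', 20.00), (3, 'Games', 30.00);

-- Procedural version: one UPDATE per row, driven by a cursor.
DECLARE @id int;
DECLARE product_cursor CURSOR LOCAL STATIC FOR
    SELECT productid FROM #Products WHERE category = 'Books';
OPEN product_cursor;
FETCH NEXT FROM product_cursor INTO @id;
WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE #Products SET price = price * 1.10 WHERE productid = @id;
    FETCH NEXT FROM product_cursor INTO @id;
END;
CLOSE product_cursor;
DEALLOCATE product_cursor;

-- Set-based version: the same logic as one statement.
-- (Running both versions in sequence applies the increase twice.)
UPDATE #Products SET price = price * 1.10 WHERE category = 'Books';

DROP TABLE #Products;
```

The set-based statement gives the optimizer a single operation to plan, whereas the cursor forces one plan execution per row.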
  42. 42. Test query variations for performance • The query optimizer can often produce widely different plans for logically equivalent queries. • Test different techniques, such as joins or subqueries, to find out which perform better in various situations. 42
  44. 44. Avoid query hints. • You must work with the SQL Server query optimizer, rather than against it, to create efficient queries. • Query hints tell the query optimizer how to behave and therefore override the optimizer’s ability to do its job properly. • If you eliminate the optimizer’s choices, you might limit yourself to a query plan that is less than ideal. • Use query hints only when you are absolutely certain that the query optimizer is incorrect. 44
  46. 46. Use correlated subqueries to improve performance. • Since the query optimizer is able to integrate a variety of subqueries into the main query flow in various ways, subqueries might help in various query tuning situations. • Subqueries can be especially useful in situations in which you create a join to a table only to verify the existence of correlated rows. For better performance, replace these kinds of joins with correlated subqueries that make use of the EXISTS operator. • Using a LEFT JOIN: SELECT a.parent_key FROM parent_table a LEFT JOIN child_table b ON a.parent_key = b.parent_key WHERE b.parent_key IS NULL • Using a NOT EXISTS: SELECT a.parent_key FROM parent_table a WHERE NOT EXISTS (SELECT * FROM child_table b WHERE a.parent_key = b.parent_key)
  48. 48. Avoid using a scalar user-defined function in the WHERE clause. • Scalar user-defined functions, unlike scalar subqueries, are not optimized into the main query plan. • Instead, SQL Server calls them row by row, much like a hidden cursor. • This is especially troublesome in the WHERE clause because the function is called for every input row. • Using a scalar function in the SELECT list is much less problematic because the rows have already been filtered in the WHERE clause.
  50. 50. Use table-valued user-defined functions as derived tables. • In contrast to scalar user-defined functions, table-valued functions are often helpful from a performance point of view when you use them as derived tables. • The query processor evaluates a derived table only once per query. • If you embed the logic in a table-valued user-defined function, you can encapsulate and reuse it for other queries. • Example: CREATE FUNCTION Sales.fn_SalesByStore (@storeid int) RETURNS TABLE AS RETURN ( SELECT P.ProductID, P.Name, SUM(SD.LineTotal) AS 'YTD Total' FROM Production.Product AS P JOIN Sales.SalesOrderDetail AS SD ON SD.ProductID = P.ProductID JOIN Sales.SalesOrderHeader AS SH ON SH.SalesOrderID = SD.SalesOrderID WHERE SH.CustomerID = @storeid GROUP BY P.ProductID, P.Name )
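A short usage sketch for the function on this slide, treating it as a derived table in an outer query. It assumes the AdventureWorks sample database and that Sales.fn_SalesByStore has already been created; the store id 602 and the 1000 threshold are arbitrary illustrative values.

```sql
-- Sketch: use the inline table-valued function like any other table source.
SELECT  s.ProductID,
        s.Name,
        s.[YTD Total]
FROM    Sales.fn_SalesByStore(602) AS s   -- 602 is an arbitrary example id
WHERE   s.[YTD Total] > 1000              -- outer filter on the function's result
ORDER BY s.[YTD Total] DESC;
```

Because the function is an inline (single-statement) table-valued function, the optimizer can expand its definition into the calling query instead of evaluating it row by row.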
  52. 52. Avoid unnecessary GROUP BY columns • Use a subquery instead. • The process of grouping rows becomes more expensive as you add more columns to the GROUP BY list. • If your query has few column aggregations but many non-aggregated grouped columns, you might be able to refactor it by using a correlated scalar subquery. • This will result in less work for grouping in the query and therefore possibly better overall query performance. • Example: SELECT p1.ProductSubcategoryID, p1.Name FROM Production.Product p1 WHERE p1.ListPrice > ( SELECT AVG(p2.ListPrice) FROM Production.Product p2 WHERE p1.ProductSubcategoryID = p2.ProductSubcategoryID)
  54. 54. Use CASE expressions to include variable logic in a query • The CASE expression is one of the most powerful logic tools available to T-SQL programmers. • Using CASE, you can dynamically change column output on a row-by-row basis. • This enables your query to return only the data that is absolutely necessary and therefore reduces the I/O operations and network overhead that is required to assemble and send large result sets to clients. 54
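A minimal sketch of CASE changing column output per row. The @Orders table variable, its columns, and the band thresholds are all hypothetical and only serve the illustration.

```sql
-- Hypothetical data, for illustration only.
DECLARE @Orders TABLE (orderid int PRIMARY KEY, amount money, shipped bit);
INSERT INTO @Orders (orderid, amount, shipped)
VALUES (1, 50, 1), (2, 500, 0), (3, 1500, 1);

SELECT  orderid,
        -- Classify each row without a second query or client-side logic.
        CASE WHEN amount >= 1000 THEN 'large'
             WHEN amount >= 100  THEN 'medium'
             ELSE 'small'
        END AS size_band,
        -- Return the amount only for shipped orders.
        CASE WHEN shipped = 1 THEN amount END AS shipped_amount
FROM    @Orders;
```

A CASE expression without an ELSE branch yields NULL when no condition matches, which is what the second column relies on.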
  56. 56. Divide joins into temporary tables when you query very large tables. • The query optimizer’s main strategy is to find query plans that satisfy queries by using single operations. • Although this strategy works for most cases, it can fail for larger sets of data because the huge joins require so much I/O overhead. • In some cases, a better option is to reduce the working set by using temporary tables to materialize key parts of the query. You can then join the temporary tables to produce a final result. • This technique is not favorable in heavily transactional systems because of the overhead of temporary table creation, but it can be very useful in decision support situations. 56
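The idea of materializing part of a large join can be sketched as follows. The tables dbo.FactSales and dbo.DimCustomer, their columns, and the date cutoff are hypothetical; whether this beats a single query depends on the data and should be measured.

```sql
-- Step 1: reduce the large table to a small working set in a temp table.
SELECT  s.CustomerID,
        SUM(s.Amount) AS total_amount
INTO    #CustomerTotals
FROM    dbo.FactSales AS s               -- hypothetical large table
WHERE   s.SaleDate >= '20240101'
GROUP BY s.CustomerID;

-- Optional: index the temp table to support the join that follows.
CREATE CLUSTERED INDEX IX_CustomerTotals ON #CustomerTotals (CustomerID);

-- Step 2: join the much smaller temp table to produce the final result.
SELECT  c.CustomerName,
        t.total_amount
FROM    #CustomerTotals AS t
JOIN    dbo.DimCustomer AS c             -- hypothetical dimension table
        ON c.CustomerID = t.CustomerID;

DROP TABLE #CustomerTotals;
```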
  58. 58. Refactoring Cursors into Queries. • Rebuild logic as multiple queries • Rebuild logic as a user-defined function • Rebuild logic as a complex query with a case expression 58
  59. 59. Refactoring Cursor 59
  60. 60. Stored Procedures and Views Best Practices 60
  61. 61. Stored Procedures Best Practices • Avoid stored procedures that accept parameters for table names • Use the SET NOCOUNT ON option in stored procedures • Limit the use of temporary tables and table variables in stored procedures • If a stored procedure does multiple data modification operations, make sure to enlist them in a transaction. • When working with dynamic T-SQL, use sp_executesql instead of the EXEC statement 61
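Two of these practices, SET NOCOUNT ON and sp_executesql for dynamic T-SQL, can be sketched together. The procedure name and parameter names are hypothetical; dbo.Customers and its columns follow the sample table used earlier in the deck.

```sql
-- Sketch: parameterized dynamic SQL inside a stored procedure.
CREATE PROCEDURE dbo.usp_GetCustomersByCity   -- hypothetical name
    @city nvarchar(30)
AS
BEGIN
    SET NOCOUNT ON;   -- suppress "n rows affected" messages sent to the client

    DECLARE @sql nvarchar(200) =
        N'SELECT customerid, city FROM dbo.Customers WHERE city = @p_city;';

    -- The value is passed as a parameter, not concatenated into the string,
    -- so the plan can be reused and the input cannot inject SQL.
    EXEC sys.sp_executesql
         @sql,
         N'@p_city nvarchar(30)',
         @p_city = @city;
END;
```

A call such as `EXEC dbo.usp_GetCustomersByCity @city = N'Athens';` would then run the parameterized statement.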
  62. 62. Views Best Practices • Use views to abstract complex data structures • Use views to encapsulate aggregate queries • Use views to provide more user-friendly column names • Think of reusability when designing views • Avoid using the ORDER BY clause in views that contain a TOP 100 PERCENT clause. • Utilize indexes on views that include aggregate data
  63. 63. How to Optimize SQL Server for performance Optimizing an Indexing Strategy 63
  64. 64. Index Architecture • Clustered • Nonclustered
  65. 65. Types of Indexes • Clustered • Nonclustered • Unique • Index with included column • Indexed view • Full-text • XML 65
  66. 66. Guidelines for designing indexes • Examine the database characteristics. For example, your indexing strategy will differ between an online transaction processing system with frequent data updates and a data warehousing system that contains primarily read-only data. • Understand the characteristics of the most frequently used queries and the columns used in the queries. For example, you might need to create an index to support a query that joins tables or that uses a unique column for its search argument. • Decide on the index options that might enhance the performance of the index. Options that can affect the efficiency of an index include FILLFACTOR and ONLINE. • Determine the optimal storage location for the index. You can choose to store a nonclustered index in the same filegroup as the table or on a different filegroup. If you store the index in a filegroup that is on a different disk than the table filegroup, you might find that disk I/O performance improves because multiple disks can be read at the same time. • Balance read and write performance in the database. You can create many nonclustered indexes on a single table, but it is important to remember that each new index has an impact on the performance of insert and update operations. This is because nonclustered indexes maintain copies of the indexed data. Each copy of the data requires I/O operations to maintain it, and you might cause a reduction in write performance if the database has to write too many copies. You must ensure that you balance the needs of both select queries and data updates when you design an indexing strategy. • Consider the size of tables in the database. The query processor might take longer to traverse the index of a small table than to perform a simple table scan. Therefore, if you create an index on a small table, the processor might never use the index. However, the database engine must still update the index when the data in the table changes. • Consider the use of indexed views. Indexes on views can provide significant performance gains when the view contains aggregations, table joins, or both.
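Several of these guidelines map directly onto CREATE INDEX options. The sketch below assumes a hypothetical table dbo.SalesOrders with the columns shown and a hypothetical filegroup named IndexFG; the fill factor of 90 is an arbitrary starting point, not a recommendation.

```sql
-- Sketch: a nonclustered index with an included column, explicit options,
-- and a separate filegroup. All object names are hypothetical.
CREATE NONCLUSTERED INDEX IX_SalesOrders_CustomerID_OrderDate
ON dbo.SalesOrders (CustomerID, OrderDate)   -- key columns: join/search arguments
INCLUDE (TotalDue)                           -- non-key column carried at leaf level
WITH (FILLFACTOR = 90,                       -- leave 10% free space per leaf page
      ONLINE = ON)                           -- requires an edition that supports online index builds
ON IndexFG;                                  -- store the index on a different filegroup
```

If the edition in use does not support online index operations, dropping the ONLINE option leaves the rest of the statement valid.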
  67. 67. Nonclustered Index (do’s & don’ts) • Create a nonclustered index for columns used for: • Predicates • Joins • Aggregation • Avoid the following when designing nonclustered indexes: • Redundant indexes • Wide composite indexes • Indexes for one query • Nonclustered indexes that include the clustered index 67
  68. 68. Clustered Indexes (do’s & don’ts) • Use clustered indexes for: • Range queries • Primary key queries • Queries that retrieve data from many columns • Do not use clustered indexes for: • Columns that have frequent changes • Wide keys 68
  69. 69. Database Engine Tuning Advisor 69
  70. 70. How to troubleshoot SQL Server Define and implement monitoring standards for database servers and instances. 70
  71. 71. Monitoring Stages • Stage 1: Monitoring the database environment • Stage 2: Narrowing down a performance issue to a particular database environment area • Stage 3: Narrowing down a performance issue to a particular database environment object • Stage 4: Troubleshooting individual problems • Stage 5: Implementing a solution
  72. 72. How to Optimize SQL Server for performance Troubleshoot database server and database performance issues. 72
  73. 73. Monitoring the database environment • You must collect a broad range of performance data. • The monitoring system must provide you with enough data to solve the current performance issues. • You must set up a monitoring solution that collects data from a broad range of sources. • For active data, use active collection tools: • System Monitor • Error logs • SQL Server Profiler • For inactive data, use sources such as: • Database configuration settings • Server configuration settings • Metadata from the SQL Server installation and databases
  74. 74. Narrowing Down a Performance Issue to a Particular Database • Analyze the performance data that you collect • Identify the performance issues. • The combination of data that you have gathered helps you identify database areas on which you need to concentrate. • Revisit the monitoring solution to gather additional data. This often provides clues that you can use to define the scope of the investigation and focus on a particular database object or server configuration. • After identifying the object, you can begin troubleshooting performance issues and solve the problem. 74
  75. 75. Guidelines for Auditing and Comparing Test Results • Scan the outputs gathered for any obvious performance issues. • Automate the analysis with the use of custom scripts and tools. • Analyze data soon after it is collected. • Performance data has a short life span, and if there is a delay, the quality of the analysis will suffer. • Do not stop analyzing data when you discover the first set of issues. • Continue to analyze until all performance issues have been identified. • Take into account the entire database environment when you analyze performance data. 75
  76. 76. Monitoring Tools • SQL Server Profiler • System Monitor • SqlDiag • DMVs for Monitoring • Performance Data Collector
  77. 77. SQL Server Profiler guidelines • Schedule data tracing for peak and nonpeak hours • Use Transact-SQL to create your own SQL Server Profiler traces to minimize the performance impact of SQL Server Profiler. • Do not collect the SQL Server Profiler traces directly into a SQL Server table. • After the trace has ended, use fn_trace_gettable function to load the data into a table. • Store collected data on a computer that is not the instance that you are tracing. 77
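Loading a finished trace file with fn_trace_gettable might look like the sketch below. The file path and the target table name dbo.TraceResults are hypothetical; the selected columns are standard trace columns, but which ones are populated depends on the events and columns the trace captured.

```sql
-- Sketch: load a completed trace file into a table for analysis.
SELECT  TextData, Duration, CPU, Reads, Writes, StartTime, SPID
INTO    dbo.TraceResults                                   -- hypothetical target table
FROM    sys.fn_trace_gettable(N'C:\Traces\peak_hours.trc', -- hypothetical path
                              DEFAULT);                    -- DEFAULT = read all rollover files

-- Example follow-up: the ten longest-running captured events.
SELECT TOP (10) TextData, Duration, CPU, Reads, Writes
FROM   dbo.TraceResults
ORDER BY Duration DESC;
```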
  78. 78. System Monitor guidelines • Execute System Monitor traces at different times during the week, month. • Collect data every 36 seconds for a week. • If the data collection period spans more than a week, set the collection time interval in the range of 300 to 600 seconds. • Collect the data in a comma-delimited text file. You can load this text file into SQL Server Profiler for further analysis. • Execute System Monitor on one server to collect the performance data of another server. 78
  79. 79. SQLDIAG • Is a general purpose diagnostics collection utility • Can be run as a console application or as a service. • Is intended to expedite and simplify diagnostic information gathering for Microsoft Customer Support Services. • Collects the following types of diagnostic information: • Windows performance logs • Windows event logs • SQL Server Profiler traces • SQL Server blocking information • SQL Server configuration information
  80. 80. SQLDIAG 80
  81. 81. DMVs for Monitoring • sys.dm_os_threads Returns a list of all SQL Server Operating System threads that are running under the SQL Server process. • sys.dm_os_memory_pools Returns a row for each object store in the instance of SQL Server. You can use this view to monitor cache memory use and to identify bad caching behavior • sys.dm_os_memory_cache_counters Returns a snapshot of the health of a cache, provides run-time information about the cache entries allocated, their use, and the source of memory for the cache entries. • sys.dm_os_wait_stats Returns information about all the waits encountered by threads that executed. You can use this aggregated view to diagnose performance issues with SQL Server and also with specific queries and batches. • sys.dm_os_sys_info Returns a miscellaneous set of useful information about the computer, and about the resources available to and consumed by SQL Server. 81
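As an example of using one of these views, the query below is a minimal sketch that ranks wait types by accumulated wait time. The TOP (10) cutoff is arbitrary, and the values are cumulative since the instance last started (or since the statistics were last cleared).

```sql
-- Sketch: the wait types that have accumulated the most wait time.
SELECT TOP (10)
        wait_type,
        waiting_tasks_count,
        wait_time_ms,
        max_wait_time_ms,
        signal_wait_time_ms      -- time spent waiting for CPU after being signaled
FROM    sys.dm_os_wait_stats
WHERE   waiting_tasks_count > 0
ORDER BY wait_time_ms DESC;
```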
  82. 82. Performance Data Collector • Management Data Warehouse • Performance Data Collection • Performance data collection components • System collection sets • User-defined collection sets • Reporting • Centralized Administration: Bringing it all together • Performance Data Collection and Reporting
  83. 83. Performance Data Collector 83
  84. 84. How to troubleshoot SQL Server Troubleshoot SQL Server connectivity issues. 84
  85. 85. Areas to Troubleshoot for Common Connectivity Issues • Server • Service pack • Surface area configuration policies • Database configuration • Account status • Client and server • Network protocols • Net library • Other network devices • Firewall port configuration • DNS entries
  86. 86. SQL Server Endpoints • Server endpoints: • Enable connection over network with client • Enable configuration based on TCP port numbers • Are managed by statements: CREATE ENDPOINT, ALTER ENDPOINT, DROP ENDPOINT • Types of endpoint: • SOAP • TSQL • Service Broker • Database Mirroring
  87. 87. How to troubleshoot SQL Server Troubleshoot SQL Server concurrency issues. 87
  88. 88. Transaction Isolation Levels • Read uncommitted • Read committed • Repeatable read • Snapshot • Serializable 88
  89. 89. Guidelines to Reduce Locking and Blocking • Keep logical transactions short • Avoid cursors • Use efficient and well-indexed queries • Use the minimum transaction isolation level required • Keep triggers to a minimum 89
  90. 90. Minimizing Deadlocks • Access objects in the same order. • Avoid user interaction in transactions. • Keep transactions short and in one batch. • Use a lower isolation level. • Use a row versioning-based isolation level. • Set the READ_COMMITTED_SNAPSHOT database option ON to enable read-committed transactions to use row versioning. • Use snapshot isolation. • Use bound connections. • Bound connections allow two or more connections to share the same transaction and locks. Bound connections can work on the same data without lock conflicts. Bound connections can be created from multiple connections within the same application, or from multiple applications with separate connections. Bound connections make coordinating actions across multiple connections easier. For more information, see Books Online.
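The two row versioning options named above are enabled per database. This is a minimal sketch; the database name SalesDb and the table in the sample query are hypothetical.

```sql
-- Option 1: read-committed transactions use row versioning instead of shared locks.
-- Changing this setting needs the database free of other active connections.
ALTER DATABASE SalesDb SET READ_COMMITTED_SNAPSHOT ON;

-- Option 2: allow transactions to request snapshot isolation explicitly.
ALTER DATABASE SalesDb SET ALLOW_SNAPSHOT_ISOLATION ON;

-- A session then opts in to snapshot isolation per transaction.
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
    SELECT customerid, city
    FROM   dbo.Customers;     -- reads a consistent version, takes no shared locks
COMMIT TRANSACTION;
```

Row versioning moves work into tempdb, so its capacity and I/O are worth watching after enabling either option.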
  91. 91. What Are SQL Server Latches? • Latches are: • Objects used to synchronize data pages • Released immediately after the operation • Latch waits: • Occur when a requested latch is held by another thread • Can be monitored with the counters: • Latch Waits/sec • Average Latch Wait Time (ms) • Total Latch Wait Time (ms) • Increase under memory or disk I/O pressure
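The latch counters listed above can also be read from T-SQL rather than System Monitor. This is a sketch using the built-in sys.dm_os_performance_counters view; the LIKE pattern is written loosely so that it matches both default and named instances.

```sql
-- Sketch: read the latch counters from inside SQL Server.
SELECT  object_name,
        counter_name,
        cntr_value
FROM    sys.dm_os_performance_counters
WHERE   object_name LIKE '%:Latches%'
  AND   counter_name IN ('Latch Waits/sec',
                         'Average Latch Wait Time (ms)',
                         'Total Latch Wait Time (ms)');
```

In this view the per-second counters are cumulative, so a rate has to be derived by sampling twice and dividing the difference by the elapsed time.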
  92. 92. Q&A 92
  93. 93. Thank you 93