TSQL Coding Guidelines

SQL Server Solution Architect EMEA Region at Pure Storage
Aug. 20, 2011

  1. T-SQL Coding Guidelines
  2. Commenting code. Code readability. General best practice. What Will Be Covered
  3. Comments and exception handling have been purposely omitted from code fragments in the interest of brevity, such that each fragment can fit onto one slide. Disclaimer
  5. Code Readability
  6. All code should be self-documenting. T-SQL code artefacts (triggers, stored procedures and functions) should have a standard comment banner. Comment code at all points of interest; describe why and not what. Avoid inline comments. Comments
  7. Comment banners should include:- Author details. A brief description of what the code does. Narrative comments for all arguments. Narrative comments for return types. Change control information. An example is provided on the next slide. Comment Banners
  8. Comment Banner Example:
      CREATE PROCEDURE uspMyProc
      /*==================================================================*/
      /*                                                                  */
      /* Name        : uspMyProc                                          */
      /*                                                                  */
      /* Description : Stored procedure to demonstrate what a             */
      /*               specimen comment banner should look like.          */
      /*                                                                  */
      /* Parameters  :                                                    */
      /*                                                                  */
      (
          @Parameter1 int, /* First parameter passed into procedure.  */
          /* ------------------------------------------------------------ */
          @Parameter2 int  /* Second parameter passed into procedure. */
      )
      /*                                                                  */
      /* Change History                                                   */
      /* ~~~~~~~~~~~~~~                                                   */
      /* Version  Author       Date      Ticket  Description              */
      /* -------  -----------  --------  ------  ------------------------ */
      /* 1.0      C. J. Adkin  09/08/11  3525    Initial version created. */
      /*==================================================================*/
      AS
      BEGIN
          .
          .
  9. -- This is an example of an inline comment. Why are these bad? Because a careless backspace can turn a useful statement into a commented-out one. "But my code is always thoroughly tested" is no excuse; always code defensively. Use /* */ comments instead. Use Of Inline Comments
  10. Use and adhere to naming conventions. Use meaningful object names. Never prefix application stored procedures with sp_: SQL Server will always scan through the system catalogue first before executing such procedures, which is bad for performance. Naming Conventions
  11. Use ANSI SQL join syntax over non-ANSI syntax. Be consistent when using case:- Camel case. Pascal case. Use of upper case for reserved key words. Be consistent when indenting and stacking text. Code Readability
  12. Coding for Performance
  13. Never blindly take technical hints and tips written in a blog or presentation as gospel. Test your assumptions using the “scientific method”, i.e.:- Use test cases which use consistent test data across all tests; production-realistic data is preferable. If the data is commercially sensitive, e.g. bank account details, keep the volume and distribution the same and obfuscate the sensitive parts. Only change one thing at a time, so as to be able to gauge the impact of the change accurately and know what effected the change. The “Scientific Method” Approach
  14. For performance related tests always clear the procedure and buffer caches out, so that results are not skewed between tests. Use the following:- CHECKPOINT; DBCC FREEPROCCACHE; DBCC DROPCLEANBUFFERS; The “Scientific Method” Approach
  15. A term coined by Jeff Moden, an MVP and frequent poster on SQL Server. Alludes to:- Coding in a procedural 3GL way instead of a set based way. Chronically poor performance of row by row oriented processing. Abbreviated to RBAR, pronounced Ree-bar. Avoid “Row by agonising row” Techniques
  16. Code whereby result sets and table contents are processed line by line, typically using cursors. Correlated subqueries. User defined functions. Iterating through result sets as ADO objects in SQL Server Integration Services looping containers. Where “Row by agonising row” Takes Place
  17. A simple, but contrived query written against the AdventureWorks2008R2 database. The first query will use nested subqueries. The second will use derived tables. Sub-Query Example
  18. Sub-Query Example With RBAR:
      SELECT ProductID, Quantity
      FROM   AdventureWorks.Production.ProductInventory Pi
      WHERE  LocationID = (SELECT TOP 1 LocationID
                           FROM   AdventureWorks.Production.Location Loc
                           WHERE  Pi.LocationID = Loc.LocationID
                           AND    CostRate = (SELECT MAX(CostRate)
                                              FROM   AdventureWorks.Production.Location) )
  19. Sub-Query Example Without RBAR:
      SELECT ProductID, Quantity
      FROM   (SELECT TOP 1 LocationID
              FROM   AdventureWorks.Production.Location Loc
              WHERE  CostRate = (SELECT MAX(CostRate)
                                 FROM   AdventureWorks.Production.Location) ) dt,
             AdventureWorks.Production.ProductInventory Pi
      WHERE  Pi.LocationID = dt.LocationID
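The derived-table query on slide 19 still joins with a comma and a WHERE predicate. As a sketch only, the same query could be written with the ANSI join syntax recommended under Code Readability. It assumes the same AdventureWorks schema; only the join style changes.

```sql
/* Same logic as the derived-table query, expressed with an ANSI join. */
SELECT Pi.ProductID, Pi.Quantity
FROM   AdventureWorks.Production.ProductInventory AS Pi
JOIN   (SELECT TOP 1 LocationID
        FROM   AdventureWorks.Production.Location
        WHERE  CostRate = (SELECT MAX(CostRate)
                           FROM   AdventureWorks.Production.Location)
       ) AS dt
  ON   Pi.LocationID = dt.LocationID;
```

Keeping the join condition in the ON clause separates it from any filtering predicates, which makes the intent easier to read.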
  20. What is the difference between the two queries? Query 1, cost = 0.299164. Query 2, cost = 0.0202938. What is the crucial difference? The table spool operation in the first plan has been executed 1069 times. This happens to be the number of rows in the ProductInventory table. The RBAR Versus The Non-RBAR Approach Quantified
  21. Row oriented processing may be unavoidable under certain circumstances:- The processing of one row depends on the state of one or more previous rows in a result set. The row processing logic involves a change to the global state of the database and therefore cannot be encapsulated in a function. In this case there are ways to use cursors in a very efficient manner, as per the next three slides. Efficient Techniques For RBAR When It Cannot Be Avoided
  22. RBAR Without A Cursor. Elapsed time 00:22:27.892:
      DECLARE @MaxRownum int,
              @OrderId   int,
              @i         int;
      SET @i = 1;
      CREATE TABLE #OrderIds (
          rownum  int IDENTITY (1, 1),
          OrderId int
      );
      INSERT INTO #OrderIds
      SELECT SalesOrderID
      FROM   Sales.SalesOrderDetail;
      SELECT @MaxRownum = MAX(rownum)
      FROM   #OrderIds;
      WHILE @i < @MaxRownum
      BEGIN
          SELECT @OrderId = OrderId
          FROM   #OrderIds
          WHERE  rownum = @i;
          SET @i = @i + 1;
      END;
  23. RBAR With A Cursor. Elapsed time 00:00:03.106:
      DECLARE @s int;
      DECLARE c CURSOR FOR
          SELECT SalesOrderID
          FROM   Sales.SalesOrderDetail;
      OPEN c;
      FETCH NEXT FROM c INTO @s;
      WHILE @@FETCH_STATUS = 0
      BEGIN
          FETCH NEXT FROM c INTO @s;
      END;
      CLOSE c;
      DEALLOCATE c;
  24. RBAR With An Optimised Cursor. Elapsed time 00:00:01.555:
      DECLARE @s int;
      DECLARE c CURSOR FAST_FORWARD FOR
          SELECT SalesOrderID
          FROM   Sales.SalesOrderDetail;
      OPEN c;
      FETCH NEXT FROM c INTO @s;
      WHILE @@FETCH_STATUS = 0
      BEGIN
          FETCH NEXT FROM c INTO @s;
      END;
      CLOSE c;
      DEALLOCATE c;
  25. No T-SQL language feature is a “Panacea to all ills”. For example:- Avoid RBAR logic where possible. Avoid nesting cursors. But cursors do have their uses. Be aware of the FAST_FORWARD optimisation, applicable when:- The data being retrieved is not being modified. The cursor is being scrolled through in a forward only direction. Cursor “RBAR” Moral Of The Story
  26. When using SQL Server 2005 onwards:- Use TRY CATCH blocks. Make the event logged in the CATCH block verbose enough to allow the exceptional event to be easily tracked down. NEVER use exceptions for control flow, illustrated with an upsert example in the next four slides. NEVER ‘swallow’ exceptions, i.e. catch them and do nothing with them. Exception Handling
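As a minimal sketch of a verbose CATCH block that does not swallow the exception, the fragment below logs the error details and then re-raises the error. The dbo.ErrorLog table and its columns are hypothetical names invented for this illustration, and RAISERROR is used so that the code also runs on SQL Server 2005.

```sql
/* Hypothetical logging table; its name and columns are assumptions. */
CREATE TABLE dbo.ErrorLog (
    ErrorLogId     int IDENTITY (1, 1) PRIMARY KEY,
    LoggedAt       datetime       NOT NULL DEFAULT (GETDATE()),
    ErrorNumber    int            NULL,
    ErrorSeverity  int            NULL,
    ErrorState     int            NULL,
    ErrorProcedure nvarchar(128)  NULL,
    ErrorLine      int            NULL,
    ErrorMessage   nvarchar(4000) NULL
);
GO

BEGIN TRY
    /* The divide by zero stands in for the real work. */
    SELECT 1 / 0;
END TRY
BEGIN CATCH
    DECLARE @Msg nvarchar(2048);
    SET @Msg = ERROR_MESSAGE();

    /* Record everything needed to track the event down later. */
    INSERT INTO dbo.ErrorLog
           (ErrorNumber, ErrorSeverity, ErrorState,
            ErrorProcedure, ErrorLine, ErrorMessage)
    VALUES (ERROR_NUMBER(), ERROR_SEVERITY(), ERROR_STATE(),
            ERROR_PROCEDURE(), ERROR_LINE(), @Msg);

    /* Re-raise so that the caller also sees the failure. */
    RAISERROR ('%s', 16, 1, @Msg);
END CATCH;
```

If the failing statement runs inside an open transaction, the log insert would normally need to happen after a rollback; that handling is left out here to keep the sketch short.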
  27. Exceptions Used For Flow Control Test Harness:
      DECLARE @p int;
      DECLARE c CURSOR FAST_FORWARD FOR
          SELECT ProductID
          FROM   Sales.SalesOrderDetail;
      OPEN c;
      FETCH NEXT FROM c INTO @p;
      WHILE @@FETCH_STATUS = 0
      BEGIN
          /* Place the stored procedure to be tested
           * on the line below. */
          EXEC dbo.uspUpsert_V1 @p;
          /* Fetch after the call, so that the first row is not skipped. */
          FETCH NEXT FROM c INTO @p;
      END;
      CLOSE c;
      DEALLOCATE c;
  28. Exceptions Used For Flow Control ‘Upsert’ Table:
      CREATE TABLE SalesByProduct (
          ProductID int,
          Sold      int,
          CONSTRAINT [PK_SalesByProduct] PRIMARY KEY CLUSTERED ( ProductID ) ON [USERDATA]
      ) ON [USERDATA]
  29. ‘Upsert’ Procedure First Attempt. Execution time = 00:00:51.200:
      CREATE PROCEDURE uspUpsert_V1 (@ProductID int) AS
      BEGIN
          SET NOCOUNT ON;
          BEGIN TRY
              INSERT INTO SalesByProduct VALUES (@ProductID, 1);
          END TRY
          BEGIN CATCH
              IF ERROR_NUMBER() = 2627
              BEGIN
                  UPDATE SalesByProduct
                  SET    Sold += 1
                  WHERE  ProductID = @ProductID;
              END
          END CATCH;
      END;
  30. ‘Upsert’ Procedure Second Attempt. Execution time = 00:00:20.080:
      CREATE PROCEDURE uspUpsert_V2 (@ProductID int) AS
      BEGIN
          SET NOCOUNT ON;
          UPDATE SalesByProduct
          SET    Sold += 1
          WHERE  ProductID = @ProductID;
          IF @@ROWCOUNT = 0
          BEGIN
              INSERT INTO SalesByProduct VALUES (@ProductID, 1);
          END;
      END;
  31. ‘Upsert’ Procedure Third Attempt. With SQL Server 2008 onwards, consider using the MERGE statement for upserts. Execution time = 00:00:20.904:
      CREATE PROCEDURE uspUpsert_V3 (@ProductID int) AS
      BEGIN
          SET NOCOUNT ON;
          MERGE SalesByProduct AS target
          USING (SELECT @ProductID) AS source (ProductID)
          ON    (target.ProductID = source.ProductID)
          WHEN MATCHED THEN
              UPDATE SET Sold += 1
          WHEN NOT MATCHED THEN
              INSERT (ProductID, Sold) VALUES (source.ProductID, 1);
      END;
  32. RBAR and Scalar Functions. Scalar functions are another example of RBAR; consider this function:-
      CREATE FUNCTION udfMinProductQty ( @ProductID int )
      RETURNS int
      AS
      BEGIN
          RETURN ( SELECT MIN(OrderQty)
                   FROM   Sales.SalesOrderDetail
                   WHERE  ProductId = @ProductID )
      END;
  33. RBAR and Scalar Functions: Example. Now let's call the function from an example query. Elapsed time = 00:00:00.746:
      SELECT ProductId,
             dbo.udfMinProductQty(ProductId)
      FROM   Production.Product
  34. RBAR and Scalar Functions: A Better Approach, Using Table Valued Functions. Now doing the same thing, but using an inline table valued function:-
      CREATE FUNCTION tvfMinProductQty ( @ProductId INT )
      RETURNS TABLE
      AS
      RETURN (
          SELECT MIN(s.OrderQty) AS MinOrdQty
          FROM   Sales.SalesOrderDetail s
          WHERE  s.ProductId = @ProductId
      )
  35. RBAR and Scalar Functions: A Better Approach, Using Table Valued Functions. Invoking the inline TVF from a query. Elapsed time 00:00:00.330:
      SELECT ProductId,
             (SELECT MinOrdQty
              FROM   dbo.tvfMinProductQty(ProductId) ) MinOrdQty
      FROM   Production.Product
      ORDER BY ProductId
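An inline TVF can also be invoked with CROSS APPLY instead of a scalar subquery. The sketch below assumes the dbo.tvfMinProductQty function from slide 34 exists. Because that function is a scalar aggregate with no GROUP BY, it returns exactly one row per call, so CROSS APPLY should keep every product, including those with no order lines (for which MinOrdQty is NULL).

```sql
/* One row per product; q.MinOrdQty is NULL where no order lines exist. */
SELECT p.ProductID,
       q.MinOrdQty
FROM   Production.Product AS p
CROSS APPLY dbo.tvfMinProductQty(p.ProductID) AS q
ORDER BY p.ProductID;
```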
  36. Developing applications that use a database and perform well depends on good:- Schema design. Compiled statement plan reuse. Connection management. Minimizing the number of network round trips between the database and the tier above. Compiled Plan Reuse
  37. Parameterise your queries in order to minimize compiling. BUT, watch out for “Parameter sniffing”. At runtime the database engine will sniff the values of the parameters a query is compiled with and create a plan accordingly. This is unfortunate when the sniffed values cause plans with table scans, while the ‘Popular’ values would lead to plans with index seeks. Writing Plan Reuse Friendly Code
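A minimal sketch of the difference between concatenating a literal into dynamic SQL and passing it as a parameter through sp_executesql follows. The query and the value 707 are arbitrary choices for illustration against the AdventureWorks Sales.SalesOrderDetail table.

```sql
DECLARE @Sql       nvarchar(200);
DECLARE @ProductID int;

SET @ProductID = 707;   /* arbitrary example value */

/* Not parameterised: every distinct value yields a different statement
   text, which can result in a separate compiled plan per value. */
SET @Sql = N'SELECT SalesOrderID, OrderQty FROM Sales.SalesOrderDetail '
         + N'WHERE ProductID = ' + CAST(@ProductID AS nvarchar(10));
EXEC (@Sql);

/* Parameterised: the statement text is constant, so one plan can be
   reused for all values of @ProductID. */
SET @Sql = N'SELECT SalesOrderID, OrderQty FROM Sales.SalesOrderDetail '
         + N'WHERE ProductID = @ProductID';
EXEC sp_executesql @Sql, N'@ProductID int', @ProductID = @ProductID;
```

SQL Server may auto-parameterise very simple statements on its own, so the first form does not always compile per value; explicit parameters remove the guesswork.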
  38. Use the RECOMPILE hint to force the creation of a new plan. Use the OPTIMIZE FOR hint in order for a plan to be created for ‘Popular’ values you specify. Use the OPTIMIZE FOR UNKNOWN hint to cause a “General purpose” plan to be created. Copy parameters passed into a stored procedure to local variables and use those in your query. Parameter Sniffing
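The four techniques for dealing with parameter sniffing can be sketched side by side as follows. The procedure name dbo.uspOrdersForProduct and the ‘popular’ value 870 are assumptions made for this illustration; in practice one technique would be chosen per query.

```sql
/* Hypothetical procedure showing each option in turn. */
CREATE PROCEDURE dbo.uspOrdersForProduct (@ProductID int) AS
BEGIN
    SET NOCOUNT ON;

    /* 1. Compile a fresh plan for the actual value on every execution. */
    SELECT SalesOrderID, OrderQty
    FROM   Sales.SalesOrderDetail
    WHERE  ProductID = @ProductID
    OPTION (RECOMPILE);

    /* 2. Always build the plan as if a chosen popular value was passed. */
    SELECT SalesOrderID, OrderQty
    FROM   Sales.SalesOrderDetail
    WHERE  ProductID = @ProductID
    OPTION (OPTIMIZE FOR (@ProductID = 870));

    /* 3. Build a general purpose plan from average density statistics
          (SQL Server 2008 onwards). */
    SELECT SalesOrderID, OrderQty
    FROM   Sales.SalesOrderDetail
    WHERE  ProductID = @ProductID
    OPTION (OPTIMIZE FOR UNKNOWN);

    /* 4. Copy the parameter to a local variable, whose value cannot be
          sniffed at compile time. */
    DECLARE @LocalProductID int;
    SET @LocalProductID = @ProductID;

    SELECT SalesOrderID, OrderQty
    FROM   Sales.SalesOrderDetail
    WHERE  ProductID = @LocalProductID;
END;
```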
  39. For OLTP style applications:- Transactions will be short. The number of statements will be finite. SQL will only affect a few rows for each execution. The SQL will be simple. Plans will be skewed towards using index seeks over table scans. Recompiles could more than double query execution time. Therefore recompiles are undesirable for OLTP applications. When (Re)Compiles Are To Be Avoided
  40. For OLAP style applications:- Complex queries that may involve aggregation and analytic SQL. Queries may change constantly due to the use of reporting and BI tools. May involve WHERE clauses with potentially lots of combinations of parameters. Forcing a recompile via OPTION(RECOMPILE) may be a hit worth taking for the benefit of a significant reduction in total execution time. This is the exception to the rule. When Taking The Hit Of A (Re)Compile Is Worthwhile
  41. Be careful when using table variables. Statistics cannot be gathered on these. The optimizer will assume they only contain one row unless the statement is recompiled. This can lead to unexpected execution plans. Statements that modify table variables will always inhibit parallelism in execution plans. Table Variables
  42. This applies to conditions in WHERE clauses. If a WHERE clause condition can use an index, it is said to be ‘sargable’, i.e. it is a searchable argument. As a general rule of thumb the use of a function on a column will suppress index usage, e.g. WHERE ufn(MyColumn1) = <somevalue>. Sargability
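To illustrate sargability, the sketch below shows the same date filter written both ways against the AdventureWorks Sales.SalesOrderHeader table. It assumes an index on OrderDate, which may need to be created; the year 2008 is an arbitrary example.

```sql
/* Not sargable: the function wrapped around the column means an index
   on OrderDate cannot be used for a seek. */
SELECT SalesOrderID
FROM   Sales.SalesOrderHeader
WHERE  YEAR(OrderDate) = 2008;

/* Sargable: the column is left bare and compared against a range, so
   an index on OrderDate can be used for a seek. */
SELECT SalesOrderID
FROM   Sales.SalesOrderHeader
WHERE  OrderDate >= '20080101'
AND    OrderDate <  '20090101';
```

The half-open range (>= start, < next start) covers every time of day in the year without relying on the precision of the column's data type.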
  43. Constructs that will always force a serial plan:- All T-SQL user defined functions. All CLR user defined functions with data access. Built-in functions including: @@TRANCOUNT, ERROR_NUMBER() and OBJECT_ID(). Dynamic cursors. Be Aware Of Constructs That Create Serial Regions In Execution Plans
  44. Constructs that will always force a serial region within a plan:- Table valued functions. TOP. Recursive queries. Multi consumer spool. Sequence functions. System table scans. “Backwards” scans. Global scalar aggregate. Be Aware Of Constructs That Create Serial Regions In Execution Plans
  45. Advice From The SQL Server Optimizer Development Team: Craig Freedman, a former optimizer developer, has some good words of advice in his “Understanding Query Processing and Query Plan in SQL Server” slide deck. The points on the next three slides (quoted verbatim) come from slide 40.
  46. Watch Out For Errors In Cardinality Estimates: Watch out for errors in cardinality estimates. Errors propagate upwards; look for the root cause. Make sure statistics are up to date and accurate. Avoid excessively complex predicates. Use computed columns for overly complex expressions.
  47. General Tips: Use set based queries; (almost always) avoid cursors. Avoid joining columns with mismatched data types. Avoid unnecessary outer joins, cross applies, complex sub-queries, dynamic index seeks, … Avoid dynamic SQL (but beware that sometimes dynamic SQL does yield a better plan). Consider creating constraints (but remember that there is a cost to maintain constraints). If possible, use inline TVFs NOT multi-statement TVFs. Use SET STATISTICS IO ON to watch out for large numbers of physical I/Os. Use indexes to workaround locking, concurrency, and deadlock issues.
  48. OLTP and DW Tips: OLTP tips: Avoid memory consuming or blocking iterators. Use seeks not scans. DW tips: Use parallel plans. Watch out for skew in parallel plans. Avoid order preserving exchanges.
  50. Miscellaneous Best Practice
  51. Leverage functionality already in SQL Server, never reinvent it; this will lead to:- More robust code. Less development effort. Potentially faster code. Code with better readability. Easier to maintain code. Avoid Reinventing The Wheel
  52. This is furnishing the code with a facility to allow its execution to be traced. Write to a tracking table and / or use xp_logevent to write to the event log. DO NOT make the code a “Black box” which has to be dissected statement by statement in production if it starts to fail. Code Instrumentation
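A minimal sketch of code instrumentation follows, assuming a hypothetical tracking table dbo.ProcessLog and a hypothetical procedure dbo.uspInstrumentedExample. The COUNT(*) query merely stands in for a real unit of work, and the xp_logevent call requires suitable permissions on the server.

```sql
/* Hypothetical tracking table; its name and columns are assumptions. */
CREATE TABLE dbo.ProcessLog (
    ProcessLogId int IDENTITY (1, 1) PRIMARY KEY,
    LoggedAt     datetime     NOT NULL DEFAULT (GETDATE()),
    ProcName     sysname      NOT NULL,
    Step         varchar(100) NOT NULL,
    RowsAffected int          NULL
);
GO

CREATE PROCEDURE dbo.uspInstrumentedExample AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @Rows int;

    /* Stand-in for a real unit of work. */
    SELECT @Rows = COUNT(*) FROM sys.objects;

    /* Trace the step in the tracking table. */
    INSERT INTO dbo.ProcessLog (ProcName, Step, RowsAffected)
    VALUES (OBJECT_NAME(@@PROCID), 'Counted objects', @Rows);

    /* Also write to the event log; user defined error numbers
       must be greater than 50000. */
    EXEC master.dbo.xp_logevent 60000,
         'uspInstrumentedExample: counted objects step complete',
         'informational';
END;
```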
  53.  Make stored procedures and functions relatively single minded in what they do. Stored procedures and functions with lots of arguments are a “Code smell” of code that:- Is difficult to unit test with a high degree of confidence. Does not lend itself to code reuse. Smacks of poor design. Favour Strong Functional Independence For Code Artefacts
  54. Understand and use the full power of T-SQL. Most people know how to UNION result sets together, but do not know about INTERSECT and EXCEPT. Also a lot of development effort can be saved by using T-SQL’s analytics extensions where appropriate:- RANK(), DENSE_RANK(), NTILE(), ROW_NUMBER(), LEAD() and LAG() (introduced in SQL Server 2012, code-named Denali). Leverage The Full Power Of Transact SQL
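The following sketch shows INTERSECT, EXCEPT and ROW_NUMBER() in use against the AdventureWorks Production.Product and Sales.SalesOrderDetail tables. The questions being answered are invented for illustration.

```sql
/* Products that appear on at least one order line. */
SELECT ProductID FROM Production.Product
INTERSECT
SELECT ProductID FROM Sales.SalesOrderDetail;

/* Products that have never been ordered. */
SELECT ProductID FROM Production.Product
EXCEPT
SELECT ProductID FROM Sales.SalesOrderDetail;

/* The single largest order line for each product, using ROW_NUMBER()
   rather than a correlated subquery or a cursor. */
SELECT ProductID, SalesOrderID, OrderQty
FROM  (SELECT ProductID, SalesOrderID, OrderQty,
              ROW_NUMBER() OVER (PARTITION BY ProductID
                                 ORDER BY OrderQty DESC, SalesOrderID) AS rn
       FROM   Sales.SalesOrderDetail) AS ranked
WHERE  rn = 1;
```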
  55. Bad Practices To Avoid
  56. Avoid Ordering By Ordinals. An ‘Ordinal’ in the context of the ORDER BY clause is when numbers are used to represent column positions. If new columns are added or their order changed in the SELECT, this query will return different results, potentially breaking the application using it:
      SELECT TOP 5 [SalesOrderNumber]
                  ,[OrderDate]
                  ,[DueDate]
                  ,[ShipDate]
                  ,[Status]
      FROM   [AdventureWorks].[Sales].[SalesOrderHeader]
      ORDER BY 2 DESC
  57. SELECT * retrieves all columns from a table, which is bad for performance if only a subset of these is required. Using columns by their names explicitly leads to improved code readability. Code is easier to maintain, as it enables the “Developer” to see in situ what columns a query is using. Avoid SELECT *
  58. A scenario that actually happened:- A row is inserted into the customer table. The customer table has a primary key based on an identity column. @@IDENTITY is used to obtain the key value of the customer row inserted, for the creation of an order row with a foreign key linking back to customer. The identity value obtained is nothing like the one for the inserted row. Why? Robust Code and @@IDENTITY
  59. @@IDENTITY obtains the latest identity value generated in the current session irrespective of the scope it came from, so an insert performed by a trigger (such as those added by merge replication) changes the value it returns. In the example, merge replication caused a further identity insert just before @@IDENTITY was used. The solution: always use SCOPE_IDENTITY() instead of @@IDENTITY. @@IDENTITY Is Dangerous !!!
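A minimal sketch of the customer and order scenario using SCOPE_IDENTITY() follows. The table and column names are hypothetical. SCOPE_IDENTITY() is limited to the current scope as well as the current session, so identity values generated by triggers do not affect it.

```sql
/* Hypothetical tables; names and columns are assumptions. */
CREATE TABLE dbo.Customer (
    CustomerId int IDENTITY (1, 1) PRIMARY KEY,
    Name       nvarchar(100) NOT NULL
);
CREATE TABLE dbo.CustomerOrder (
    OrderId    int IDENTITY (1, 1) PRIMARY KEY,
    CustomerId int NOT NULL REFERENCES dbo.Customer (CustomerId),
    OrderDate  datetime NOT NULL DEFAULT (GETDATE())
);
GO

DECLARE @CustomerId int;

INSERT INTO dbo.Customer (Name) VALUES (N'Example customer');

/* Capture the key generated by the insert above, in this scope only. */
SET @CustomerId = SCOPE_IDENTITY();

INSERT INTO dbo.CustomerOrder (CustomerId) VALUES (@CustomerId);
```

Capturing the value into a variable immediately after the insert also protects against later statements in the same scope generating further identity values.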
  60. SQL Tri State Logic: SQL has tri-state logic: TRUE, FALSE and NULL. SQL data types cannot be compared to NULL using conventional comparison operators: <some value> <> NULL, <some value> > NULL, <some value> < NULL, <some value> = NULL. Always use IS NULL, IS NOT NULL, ISNULL and COALESCE to handle NULLs correctly.
  61. NULLs Always Propagate In Expressions: Expressions that include NULL will always evaluate to NULL, e.g.: SELECT 1 + NULL, SELECT 1 - NULL, SELECT 1 * NULL, SELECT 1 / NULL, SELECT @MyString + NULL. If this is not the behaviour you want, code around this using ISNULL or COALESCE.
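The two NULL slides can be sketched in one short script. It assumes the default session settings ANSI_NULLS ON and CONCAT_NULL_YIELDS_NULL ON; the variable and the replacement values are arbitrary.

```sql
DECLARE @MyString varchar(20);   /* deliberately left NULL */

/* Under ANSI_NULLS ON a comparison with = NULL is never true,
   whereas IS NULL tests for NULL correctly. */
SELECT CASE WHEN @MyString = NULL  THEN 'true' ELSE 'not true' END AS EqualsNullTest,
       CASE WHEN @MyString IS NULL THEN 'true' ELSE 'not true' END AS IsNullTest;

/* NULL propagates through the first expression; ISNULL and COALESCE
   substitute a value before the concatenation takes place. */
SELECT 'Name: ' + @MyString                  AS Propagated,
       'Name: ' + ISNULL(@MyString, '')      AS WithIsNull,
       'Name: ' + COALESCE(@MyString, 'n/a') AS WithCoalesce;
```

ISNULL takes the data type of its first argument, while COALESCE follows data type precedence across all of its arguments, which matters when the replacement value is longer than the column or variable.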
  62. Use Of The NOLOCK Hint: Historically SQL Server has always used locking to enforce isolation levels, however: SQL Server (2005 onwards) facilitates non blocking versions of the read committed and snapshot isolation levels through multi version concurrency control (MVCC). SQL Server 2014 uses MVCC for its in-memory OLTP engine. All Azure SQL Database databases use read committed snapshot isolation, the MVCC version of read committed.
  63. Use Of The NOLOCK Hint: Objects can be scanned in two ways: Allocation order, always applied to heaps, can apply to indexes. Logical order, indexes are traversed in logical leaf node order. Any queries against indexed tables (clustered or non-clustered) using NOLOCK that perform allocation-ordered scans will be exposed to reading the same data twice if another session causes the page to split and the data to move during this process.
  64. Use Of The NOLOCK Hint: If a session uses a NOLOCK hint on a heap / clustered index, its reads will ignore any locks taken out on pages/rows by in-flight transactions and will subsequently be able to read uncommitted (dirty) data, if that row is in the process of being changed by another session. If the in-flight transaction rolls back, this leaves the session in a state whereby it has read dirty data, i.e. data that has been modified outside of a safe transactional context. Thanks to Mark Broadbent (@retracement) for checking this and the last two slides.
  65. Transaction Rollback Behaviour: For SQL Server to automatically roll back an entire transaction when a statement raises a run time error, SET XACT_ABORT must be set to ON.
      Without XACT_ABORT:
          CREATE TABLE Test (col1 INT)
          BEGIN TRANSACTION
          INSERT INTO Test VALUES (1);
          UPDATE Test SET col1 = col1 + 1 WHERE 1/0 > 1;
          COMMIT;
          SELECT col1 FROM Test -- ** 1 ** row is returned
      With XACT_ABORT:
          CREATE TABLE Test (col1 INT)
          SET XACT_ABORT ON
          BEGIN TRANSACTION
          INSERT INTO Test VALUES (1);
          UPDATE Test SET col1 = col1 + 1 WHERE 1/0 > 1;
          COMMIT;
          SELECT col1 FROM Test -- ** No rows ** are returned.
  66. Contact Details ChrisAdkin8
