Exploring T-SQL Anti-Patterns
37th SQL Saturday Night
Mar 21, 2020
Antonios Chatzipavlis
Data Solutions Consultant & Trainer
Since 1999
30+ Years in a Field
20+ Experience with
60+ Certifications
SQLschool.gr
Founder
A community for Greek professionals who use the
Microsoft Data Platform
Connect / Explore / Learn
@antoniosch - @sqlschool
./sqlschoolgr - ./groups/sqlschool
yt/c/SqlschoolGr
SQLschool.gr Group
help@sqlschool.gr
Join us
Articles
SQL Saturday Nights
SQL Server in Greek
Webcasts
News
Resources
Connect / Explore / Learn
Celebrating 10 years, 2010-2020
If you want to be part of PASS Summit 2020
Use the following $150 USD discount code in your
registration, which is unique to
SQL School Greece.
LGDISET1I
Virtual Event
March 29, 2020
Greek MVPs in Action
G. KALYVA G. GRAMMATIKOS G. MARKOU
P. APOSTOLIDIS A. CHATZIPAVLIS
Speakers
Presentation Content
Exploring T-SQL Anti-Patterns
We are all writing marvelous queries. Correct?
The use of SELECT *
The best-known anti-pattern, yet most people still use it
• Can break code when the underlying table schema changes
• Has performance implications
• Most of the time makes your indexes useless, because no narrow index covers every column
• Causes unnecessary network traffic
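As a minimal sketch of the difference, the following compares SELECT * with an explicit column list. The table #Orders, its columns and the index name are hypothetical and exist only for this illustration.

```sql
-- Hypothetical demo table (not from the presentation).
CREATE TABLE #Orders
(
    OrderID    int IDENTITY(1,1) PRIMARY KEY,
    CustomerID int           NOT NULL,
    OrderDate  date          NOT NULL,
    Notes      nvarchar(max) NULL
);
CREATE INDEX IX_Orders_CustomerID ON #Orders (CustomerID) INCLUDE (OrderDate);

-- Anti-pattern: every column, including the wide Notes column, is read
-- and sent to the client, and the narrow index cannot cover the query.
SELECT * FROM #Orders WHERE CustomerID = 42;

-- Explicit column list: the nonclustered index can cover the query,
-- and the result shape does not change when columns are added later.
SELECT OrderID, OrderDate FROM #Orders WHERE CustomerID = 42;

DROP TABLE #Orders;
```

Comparing the actual execution plans of the two queries is the usual way to see whether the index is used as expected.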
Function usage in predicates
In WHERE or JOIN clauses
• A predicate defines a logical condition applied to the rows of a table
• Search predicates should not wrap columns in function calls, even deterministic ones, because they cause unnecessary scans
• Query Optimizer
- uses statistics, internal transformation rules and heuristics at compile time to determine a good-enough plan to execute a query
- depends on the estimated cost of resolving the search predicates to choose whether it seeks or scans over indexes.
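A small sketch of a function in a predicate and one possible rewrite. The table #Orders and its index are hypothetical; the rewrite assumes the intent is "all orders in calendar year 2020".

```sql
-- Hypothetical demo table (not from the presentation).
CREATE TABLE #Orders
(
    OrderID   int IDENTITY(1,1) PRIMARY KEY,
    OrderDate datetime NOT NULL
);
CREATE INDEX IX_Orders_OrderDate ON #Orders (OrderDate);

-- Function on the column: the index on OrderDate cannot be used for a seek.
SELECT OrderID
FROM #Orders
WHERE YEAR(OrderDate) = 2020;

-- Same rows expressed as a range on the bare column: an index seek is possible.
SELECT OrderID
FROM #Orders
WHERE OrderDate >= '20200101'
  AND OrderDate <  '20210101';

DROP TABLE #Orders;
```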
Complex expressions usage in predicates
In WHERE or JOIN clauses
Search predicates should not use complex expressions on columns, because they produce unnecessary scans for the same reasons as functions.
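The same idea can be sketched for expressions: moving the arithmetic to the constant side leaves the column bare. The table #Orders is hypothetical and used only for illustration.

```sql
-- Hypothetical demo table (not from the presentation).
CREATE TABLE #Orders
(
    OrderID  int IDENTITY(1,1) PRIMARY KEY,
    Quantity int NOT NULL
);
CREATE INDEX IX_Orders_Quantity ON #Orders (Quantity);

-- Expression on the column: evaluated per row, so the index is scanned.
SELECT OrderID
FROM #Orders
WHERE Quantity + 1 = 100;

-- Equivalent predicate with the arithmetic moved to the constant: a seek is possible.
SELECT OrderID
FROM #Orders
WHERE Quantity = 100 - 1;

DROP TABLE #Orders;
```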
Using the OR operator
One of the fundamental topics in Relational Algebra
• How the OR operator works
- Returns rows that meet any of the conditions specified in the WHERE clause
- Each additional search condition can only increase the number of rows returned
• Can use one index, or a different index for each part of the OR operator
• Always performs a Table Scan or Clustered Index Scan
- if any column referenced in the OR operator does not have an index or
- if the index is not useful
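One commonly used rewrite, shown here only as a sketch, splits the OR into separate queries so that each branch can use its own index. The table #Customers and its indexes are hypothetical; whether the rewrite is actually cheaper depends on the data and should be checked in the execution plan.

```sql
-- Hypothetical demo table (not from the presentation).
CREATE TABLE #Customers
(
    CustomerID int IDENTITY(1,1) PRIMARY KEY,
    LastName   nvarchar(50) NOT NULL,
    City       nvarchar(50) NOT NULL
);
CREATE INDEX IX_Customers_LastName ON #Customers (LastName);
CREATE INDEX IX_Customers_City     ON #Customers (City);

-- OR across two columns: the optimizer may combine both indexes or fall back to a scan.
SELECT CustomerID
FROM #Customers
WHERE LastName = N'Smith' OR City = N'Athens';

-- Rewrite: one query per condition. UNION removes customers matching both
-- conditions; CustomerID is unique, so the result is the same set of rows.
SELECT CustomerID FROM #Customers WHERE LastName = N'Smith'
UNION
SELECT CustomerID FROM #Customers WHERE City = N'Athens';

DROP TABLE #Customers;
```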
NULL usage on joins
NULL means Unknown
• Using NULL is not an anti-pattern in itself
• In some cases the special handling of NULL makes it challenging to write performant T-SQL
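As an illustration of the kind of challenge meant here, consider a join that must treat two NULLs as a match. This is a sketch under assumptions: the tables #A and #B are hypothetical, and the INTERSECT form is one known alternative, not necessarily the one demonstrated in the session.

```sql
-- Hypothetical demo tables (not from the presentation).
CREATE TABLE #A (ID int IDENTITY(1,1) PRIMARY KEY, RegionCode int NULL);
CREATE TABLE #B (ID int IDENTITY(1,1) PRIMARY KEY, RegionCode int NULL);
CREATE INDEX IX_A_RegionCode ON #A (RegionCode);
CREATE INDEX IX_B_RegionCode ON #B (RegionCode);

-- Common workaround: ISNULL on both sides. The functions hide the columns
-- from the indexes, and the query is wrong if -1 is ever a real value.
SELECT a.ID AS A_ID, b.ID AS B_ID
FROM #A AS a
JOIN #B AS b
  ON ISNULL(a.RegionCode, -1) = ISNULL(b.RegionCode, -1);

-- Alternative: INTERSECT treats two NULLs as equal, so no function
-- has to be applied to the joined columns.
SELECT a.ID AS A_ID, b.ID AS B_ID
FROM #A AS a
JOIN #B AS b
  ON EXISTS (SELECT a.RegionCode INTERSECT SELECT b.RegionCode);

DROP TABLE #A;
DROP TABLE #B;
```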
The wrong usage of the LIKE operator
• The problem with the LIKE operator arises with a leading-wildcard pattern such as '%abc%': the plan always uses scan operators, even if the column is indexed.
• In this case it’s better to use the Full-Text Search component.
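A short sketch of the two LIKE shapes. The table #Products is hypothetical. The CONTAINS query is left as a comment because it needs a full-text index, which cannot be created on a temporary table; dbo.Products in that comment is likewise an assumed name.

```sql
-- Hypothetical demo table (not from the presentation).
CREATE TABLE #Products
(
    ProductID int IDENTITY(1,1) PRIMARY KEY,
    Name      nvarchar(100) NOT NULL
);
CREATE INDEX IX_Products_Name ON #Products (Name);

-- Leading wildcard: every row has to be inspected, so the index is scanned.
SELECT ProductID FROM #Products WHERE Name LIKE N'%abc%';

-- Prefix search: the index can be used for a range seek.
SELECT ProductID FROM #Products WHERE Name LIKE N'abc%';

-- With a full-text index on a permanent table (assumed: dbo.Products.Name),
-- a word search could be written as:
-- SELECT ProductID FROM dbo.Products WHERE CONTAINS(Name, N'abc');

DROP TABLE #Products;
```

Note that full-text search matches words and prefixes, not arbitrary substrings, so it is not a drop-in replacement for every '%abc%' pattern.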
Usage of negative comparisons
In WHERE or JOIN clauses
• !=
• <>
• NOT IN
• NOT LIKE
• …
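A minimal sketch of replacing a negative comparison with a positive one. It assumes a hypothetical table #Tickets whose Status column can only hold 'Open', 'Pending' or 'Closed'; the rewrite is valid only under that assumption.

```sql
-- Hypothetical demo table (not from the presentation).
CREATE TABLE #Tickets
(
    TicketID int IDENTITY(1,1) PRIMARY KEY,
    Status   varchar(10) NOT NULL
);
CREATE INDEX IX_Tickets_Status ON #Tickets (Status);

-- Negative comparison: asks for "everything except", which typically
-- reads most of the index or table.
SELECT TicketID FROM #Tickets WHERE Status <> 'Closed';

-- Positive form listing the wanted values, so each value can be sought directly.
-- Assumes the set of possible Status values is known and fixed.
SELECT TicketID FROM #Tickets WHERE Status IN ('Open', 'Pending');

DROP TABLE #Tickets;
```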
Compound logic
• In programming languages, code reusability is desirable.
• In T-SQL, code reusability most of the time carries a performance penalty, because the cost of the query cannot be known until runtime.
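One plausible reading of this point is the "catch-all" query that serves several search cases through optional parameters. The sketch below uses a hypothetical table #Orders; OPTION (RECOMPILE) is shown as one possible mitigation, not as the approach taken in the session.

```sql
-- Hypothetical demo table (not from the presentation).
CREATE TABLE #Orders
(
    OrderID    int IDENTITY(1,1) PRIMARY KEY,
    CustomerID int  NOT NULL,
    OrderDate  date NOT NULL
);
CREATE INDEX IX_Orders_CustomerID ON #Orders (CustomerID);
CREATE INDEX IX_Orders_OrderDate  ON #Orders (OrderDate);

DECLARE @CustomerID int  = 42,
        @OrderDate  date = NULL;

-- Reusable catch-all predicate: one plan has to serve every combination
-- of supplied and missing parameters, so it is usually a scan.
SELECT OrderID
FROM #Orders
WHERE (@CustomerID IS NULL OR CustomerID = @CustomerID)
  AND (@OrderDate  IS NULL OR OrderDate  = @OrderDate);

-- Possible mitigation: compile for the actual values on every execution,
-- trading compilation CPU for a plan that fits the supplied parameters.
SELECT OrderID
FROM #Orders
WHERE (@CustomerID IS NULL OR CustomerID = @CustomerID)
  AND (@OrderDate  IS NULL OR OrderDate  = @OrderDate)
OPTION (RECOMPILE);

DROP TABLE #Orders;
```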
Implicit conversions
• This happens when you mismatch data types in a WHERE clause or JOIN condition, and SQL Server needs to convert one side on the fly.
• The penalties you pay here are:
- Indexes won’t be used efficiently
- CPU is burned in the conversion process
Data type precedence (the type with lower precedence is converted to the type with higher precedence):
1. user-defined data types (highest)
2. sql_variant
3. xml
4. datetimeoffset
5. datetime2
6. datetime
7. smalldatetime
8. date
9. time
10. float
11. real
12. decimal
13. money
14. smallmoney
15. bigint
16. int
17. smallint
18. tinyint
19. bit
20. ntext
21. text
22. image
23. timestamp
24. uniqueidentifier
25. nvarchar (including nvarchar(max) )
26. nchar
27. varchar (including varchar(max) )
28. char
29. varbinary (including varbinary(max) )
30. binary (lowest)
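The precedence list explains which side gets converted. As a sketch with a hypothetical table #Customers: nvarchar ranks above varchar, so comparing a varchar column with an nvarchar value converts the column, not the value. How badly this affects index usage can depend on the collation.

```sql
-- Hypothetical demo table (not from the presentation).
CREATE TABLE #Customers
(
    CustomerID    int IDENTITY(1,1) PRIMARY KEY,
    AccountNumber varchar(20) NOT NULL
);
CREATE INDEX IX_Customers_AccountNumber ON #Customers (AccountNumber);

-- Mismatch: the nvarchar parameter outranks the varchar column, so the column
-- is implicitly converted (CONVERT_IMPLICIT appears in the execution plan).
DECLARE @AccountN nvarchar(20) = N'AW00000042';
SELECT CustomerID FROM #Customers WHERE AccountNumber = @AccountN;

-- Matching types: no conversion on the column, a plain index seek is possible.
DECLARE @Account varchar(20) = 'AW00000042';
SELECT CustomerID FROM #Customers WHERE AccountNumber = @Account;

DROP TABLE #Customers;
```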
Unnecessary Sort operations
• UNION, UNION ALL
• SELECT DISTINCT
• SELECT TOP(1) with ORDER BY
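A small sketch for the first item. The tables #Orders2019 and #Orders2020 are hypothetical, and the rewrite assumes the two inputs can never contain the same row, so that removing duplicates is pure overhead.

```sql
-- Hypothetical demo tables (not from the presentation).
CREATE TABLE #Orders2019 (OrderID int PRIMARY KEY, OrderDate date NOT NULL);
CREATE TABLE #Orders2020 (OrderID int PRIMARY KEY, OrderDate date NOT NULL);

-- UNION must remove duplicates, which typically adds a sort or hash operation.
SELECT OrderID, OrderDate FROM #Orders2019
UNION
SELECT OrderID, OrderDate FROM #Orders2020;

-- UNION ALL simply concatenates the inputs. Safe here only because the
-- two tables are assumed to hold disjoint sets of orders.
SELECT OrderID, OrderDate FROM #Orders2019
UNION ALL
SELECT OrderID, OrderDate FROM #Orders2020;

DROP TABLE #Orders2019;
DROP TABLE #Orders2020;
```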
Reduce the TVFs/UDFs pitfalls
For versions prior to SQL Server 2017
Multi-statement TVFs
• The cost of an MSTVF can't be determined at compile time, so a fixed row estimate is used to create the query plan.
• Interleaved execution for MSTVFs was introduced in SQL Server 2017.
• Try to rewrite MSTVFs as Inline Table-Valued Functions
Scalar UDFs
• The Query Optimizer does not account for any T-SQL logic inside a UDF
• UDFs are executed for every row in the result set, just like a cursor.
• Create scalar UDFs with the SCHEMABINDING option
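The MSTVF-to-inline rewrite can be sketched as follows. All object names (dbo.DemoOrders, dbo.GetOrders_Multi, dbo.GetOrders_Inline, dbo.AddVat) and the 1.24 multiplier are hypothetical. GO is the batch separator used by SSMS and sqlcmd, needed because CREATE FUNCTION must be the first statement in a batch.

```sql
-- Hypothetical demo table (not from the presentation).
CREATE TABLE dbo.DemoOrders
(
    OrderID    int IDENTITY(1,1) PRIMARY KEY,
    CustomerID int  NOT NULL,
    OrderDate  date NOT NULL
);
GO
-- Multi-statement TVF: the optimizer sees only a table variable with a fixed row estimate.
CREATE FUNCTION dbo.GetOrders_Multi (@CustomerID int)
RETURNS @result TABLE (OrderID int, OrderDate date)
AS
BEGIN
    INSERT INTO @result (OrderID, OrderDate)
    SELECT OrderID, OrderDate
    FROM dbo.DemoOrders
    WHERE CustomerID = @CustomerID;
    RETURN;
END;
GO
-- Inline TVF: the body is expanded into the calling query like a view,
-- so statistics on dbo.DemoOrders drive the estimates.
CREATE FUNCTION dbo.GetOrders_Inline (@CustomerID int)
RETURNS TABLE
AS
RETURN
(
    SELECT OrderID, OrderDate
    FROM dbo.DemoOrders
    WHERE CustomerID = @CustomerID
);
GO
-- Scalar UDF without data access, created WITH SCHEMABINDING.
CREATE FUNCTION dbo.AddVat (@Amount decimal(10,2))
RETURNS decimal(10,2)
WITH SCHEMABINDING
AS
BEGIN
    RETURN @Amount * 1.24;
END;
GO
SELECT OrderID, OrderDate FROM dbo.GetOrders_Inline(42);
```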
Optimizing stored procedures
• Use SET NOCOUNT ON
- The count indicates the number of rows affected by a T-SQL statement
- When it is ON
- the count won’t be returned to the application layer
- we get a performance boost because SQL Server does not send the DONE_IN_PROC token stream for each statement in the code.
• Validate input parameters early in the T-SQL code.
Look at my article
The effects of SET NOCOUNT ON
http://bit.ly/2QwFAjb
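Both recommendations can be combined in a procedure skeleton like the one below. This is a sketch: dbo.DemoCustomerOrders, dbo.GetCustomerOrders and the validation rule are hypothetical. GO is the SSMS/sqlcmd batch separator.

```sql
-- Hypothetical demo table (not from the presentation).
CREATE TABLE dbo.DemoCustomerOrders
(
    OrderID    int IDENTITY(1,1) PRIMARY KEY,
    CustomerID int  NOT NULL,
    OrderDate  date NOT NULL
);
GO
CREATE PROCEDURE dbo.GetCustomerOrders
    @CustomerID int
AS
BEGIN
    -- Suppress the "n rows affected" messages for every statement below.
    SET NOCOUNT ON;

    -- Validate input first, before any data access takes place.
    IF @CustomerID IS NULL OR @CustomerID <= 0
    BEGIN
        RAISERROR('CustomerID must be a positive integer.', 16, 1);
        RETURN 1;
    END;

    SELECT OrderID, OrderDate
    FROM dbo.DemoCustomerOrders
    WHERE CustomerID = @CustomerID;

    RETURN 0;
END;
GO
EXEC dbo.GetCustomerOrders @CustomerID = 42;
```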
Optimizing Views
Standard Views
• Avoid creating generic views
• Avoid inheritance logic with views
• Read my article
How to not make a disaster view
https://bit.ly/33vTNCp
Indexed Views
• SQL Server will only automatically create statistics on an indexed view when a NOEXPAND table hint is used.
• Omitting this hint can lead to execution plan warnings about missing statistics that cannot be resolved by creating statistics manually.
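A minimal sketch of an indexed view queried with the NOEXPAND hint. The objects dbo.DemoSales and dbo.vDemoSalesByProduct are hypothetical, and the session is assumed to use the SET options required for indexed views (the SSMS defaults). GO is the SSMS/sqlcmd batch separator.

```sql
-- Hypothetical demo table (not from the presentation).
CREATE TABLE dbo.DemoSales
(
    SaleID    int IDENTITY(1,1) PRIMARY KEY,
    ProductID int           NOT NULL,
    Amount    decimal(10,2) NOT NULL
);
GO
-- An indexed view must be schema-bound; with GROUP BY it must include COUNT_BIG(*).
CREATE VIEW dbo.vDemoSalesByProduct
WITH SCHEMABINDING
AS
SELECT ProductID,
       COUNT_BIG(*) AS SaleCount,
       SUM(Amount)  AS TotalAmount
FROM dbo.DemoSales
GROUP BY ProductID;
GO
-- The unique clustered index materializes the view.
CREATE UNIQUE CLUSTERED INDEX CIX_vDemoSalesByProduct
    ON dbo.vDemoSalesByProduct (ProductID);
GO
-- NOEXPAND makes the optimizer read the view's own index (and statistics)
-- instead of expanding the view into its base tables.
SELECT ProductID, TotalAmount
FROM dbo.vDemoSalesByProduct WITH (NOEXPAND)
WHERE ProductID = 1;
```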
Correlated sub-queries
• It is not uncommon to use sub-queries to express certain predicates inline in queries
• Developers must keep in mind that joins are frequently better than correlated sub-queries.
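As a sketch of the rewrite, the two queries below return the same order count per customer, once with a correlated sub-query and once with a join. The tables #Customers and #Orders are hypothetical; the optimizer sometimes produces the same plan for both, so the plans should be compared on real data.

```sql
-- Hypothetical demo tables (not from the presentation).
CREATE TABLE #Customers (CustomerID int PRIMARY KEY, Name nvarchar(50) NOT NULL);
CREATE TABLE #Orders
(
    OrderID    int IDENTITY(1,1) PRIMARY KEY,
    CustomerID int NOT NULL
);
CREATE INDEX IX_Orders_CustomerID ON #Orders (CustomerID);

-- Correlated sub-query: logically evaluated once per customer row.
SELECT c.CustomerID,
       (SELECT COUNT(*)
        FROM #Orders AS o
        WHERE o.CustomerID = c.CustomerID) AS OrderCount
FROM #Customers AS c;

-- Join form: LEFT JOIN keeps customers without orders, and
-- COUNT(o.OrderID) returns 0 for them because NULLs are not counted.
SELECT c.CustomerID,
       COUNT(o.OrderID) AS OrderCount
FROM #Customers AS c
LEFT JOIN #Orders AS o
       ON o.CustomerID = c.CustomerID
GROUP BY c.CustomerID;

DROP TABLE #Orders;
DROP TABLE #Customers;
```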
Table variables vs. Temporary tables
Table variables and temporary tables serve the same basic purpose: to store an intermediate result set to be used by a subsequent query.
Prior to SQL Server 2019
Table variables are runtime objects only and are compiled together with all other statements, before any of the statements that populate the table variables even execute.
For this reason, the Query Optimizer uses a default estimate of one row for table variables, since the row count is not available at compile time.
On temporary tables, SQL Server supports automatic statistics creation, as well as manual statistics creation and update, which the Query Optimizer can use.
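The difference can be observed with a sketch like the following, which loads the same rows into a table variable and a temporary table and joins each to a system catalog view. The names @Ids and #Ids are hypothetical; comparing estimated and actual row counts in the two execution plans shows the effect.

```sql
-- Table variable: prior to SQL Server 2019 the join below is compiled
-- with an estimate of one row for @Ids, regardless of its content.
DECLARE @Ids TABLE (ID int PRIMARY KEY);

INSERT INTO @Ids (ID)
SELECT TOP (1000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects;

SELECT COUNT(*) AS MatchCount
FROM @Ids AS i
JOIN sys.all_objects AS o ON o.object_id = i.ID;

-- Temporary table: statistics can be created automatically,
-- so the estimate reflects the rows actually loaded.
CREATE TABLE #Ids (ID int PRIMARY KEY);

INSERT INTO #Ids (ID)
SELECT TOP (1000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects;

SELECT COUNT(*) AS MatchCount
FROM #Ids AS i
JOIN sys.all_objects AS o ON o.object_id = i.ID;

DROP TABLE #Ids;
```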
Executing Dynamic T-SQL statements
• We can use EXECUTE or sp_executesql.
• sp_executesql is the preferred method for executing dynamic T-SQL because:
- It allows us to add parameter markers, increasing the likelihood that SQL Server will be able to reuse the plan and avoid costly query compilations
- It lets us define each parameter’s data type, minimizing the risk of SQL injection
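A minimal sketch of both approaches against the sys.objects catalog view. The variable names are hypothetical; the value 'U' selects user tables.

```sql
DECLARE @type char(2) = 'U';

-- EXECUTE with string concatenation: the value becomes part of the statement
-- text, so each distinct value compiles a new plan, and input that is not
-- under your control could inject arbitrary T-SQL.
DECLARE @unsafe nvarchar(max) =
    N'SELECT name, object_id FROM sys.objects WHERE type = ''' + @type + N''';';
EXEC (@unsafe);

-- sp_executesql with a typed parameter marker: the statement text stays
-- constant, so the plan can be reused, and the value is never parsed as code.
DECLARE @sql nvarchar(max) =
    N'SELECT name, object_id FROM sys.objects WHERE type = @type;';

EXEC sys.sp_executesql
     @sql,
     N'@type char(2)',
     @type = @type;
```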
Thank you!
Copyright © SQLschool.gr. All rights reserved. PRESENTER MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION
