SQL Server 2012 Best Practices


Presented by Karel Coenye.

Published in: Technology
  • Maximizing Performance and Working Around Columnstore Limitations
    The topics listed below cover how to maximize performance with columnstore indexes and how to work around their functional and performance limitations in SQL Server 2012.
    Ensuring Use of the Fast Batch Mode of Query Execution
      • Parallelism (DOP >= 2) Is Required to Get Batch Processing
      • Use Outer Join and Still Get the Benefit of Batch Processing
      • Work Around Inability to Get Batch Processing with IN and EXISTS
      • Perform NOT IN and Still Get the Benefit of Batch Processing
      • Perform UNION ALL and Still Get the Benefit of Batch Processing
      • Perform Scalar Aggregates and Still Get the Benefit of Batch Processing
      • Maintaining Batch Processing with Multiple Aggregates Including One or More DISTINCT Aggregates
      • Using HASH JOIN Hint to Avoid Nested Loop Join and Force Batch Processing
    Physical Database Design, Loading, and Index Management
      • Adding Data Using a Drop-and-Rebuild Approach
      • Adding Data Using Partition Switching
      • Trickle Loading with Columnstore Indexes
      • Avoid Using Nonclustered B-tree Indexes
      • Changing Your Application to Eliminate Unsupported Data Types
      • Achieving Fast Parallel Columnstore Index Builds
    Maximizing the Benefits of Segment Elimination
      • Understanding Segment Elimination
      • Verifying Columnstore Segment Elimination
      • Ensuring Your Data Is Sorted or Nearly Sorted by Date to Benefit from Date Range Elimination
      • Multi-Dimensional Clustering to Maximize the Benefit of Segment Elimination
    Additional Tuning Considerations
      • Work Around Performance Issues for Columnstores Related to Strings
      • Force Use or Non-Use of a Columnstore Index
      • Workarounds for Predicates That Don't Get Pushed Down to Columnstore Scan (Including OR)
      • Using Statistics with Columnstore Indexes
  • SQL Server 2012 Best Practices

    1. SQL Server Best Practices (Microsoft TechNet Thursday)
    2. About me: @Ryazame
    3. Part 1: Generic - Independent of SQL Version
    4. What's the goal?
    5. No, seriously… what are these best practices?
       • General rules and guidelines
       • Intended to improve:
         • Maintenance
         • Performance
         • Availability
         • Quality
       • Not always 100% implementable
         • But at least try
         • Document why
    6. Coding (believe it or not...)
       • If you don't do this right
         • It's like...
    7. Coding Best Practices
       • There is no performance loss for documented code
         • Code tells you how, comments tell you why
       • Don't hardcode
         • SQL Server supports variables ;-)
       • Format your code for readability
         • There is no "right" method
         • But… make clear agreements
         • And follow them
    8. Windows Authentication
       • Easier to administer
       • Centralized
       • Better auditing
       • More
         • Secure
         • Flexible
         • AlwaysOn
         • Contained Databases
    9. Normalize
       • Normal forms
         • Aim for 3rd normal form
         • Normalize first
         • Denormalize when required
       • Denormalization -> sometimes OK
         • Denormalization can be done using many techniques
       • Un-normalized
    10. Data Integrity
        • Top priority
        • Maintained by using constraints
          • Sometimes you'll have to rely on triggers
        • Never trust code outside of the table to assure the data integrity of a table
        • Primary keys should ALWAYS exist
          • Even if you have to make a surrogate key
          • Declare your alternate keys
        • Declared referential integrity
          • Foreign keys (fast)
          • If there is absolutely no other choice -> trigger code (slow)
    11. Data Integrity (continued)
        • Limit your column data
          • Similar to referential integrity, but there is no table holding your values
          • Why not build that table?
          • Easier to manage, easier to scale
        • Use check constraints
        • Limit your data type
          • Why is everyone afraid of the "big" bad tinyint?
          • Or even worse, the bit…
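As an illustration of the two Data Integrity slides, the following is a minimal T-SQL sketch. All table, column, and constraint names are hypothetical and not taken from the presentation; it only shows declared keys, a lookup table in place of free-text values, a check constraint, and small data types.

```sql
-- Hypothetical lookup table: the allowed values live in a table of their own.
CREATE TABLE dbo.OrderStatus
(
    OrderStatusId tinyint     NOT NULL CONSTRAINT PK_OrderStatus PRIMARY KEY,
    StatusName    varchar(20) NOT NULL CONSTRAINT UQ_OrderStatus_StatusName UNIQUE   -- alternate key
);

CREATE TABLE dbo.CustomerOrder
(
    OrderId       int IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_CustomerOrder PRIMARY KEY,                  -- surrogate key
    OrderNumber   char(10) NOT NULL
        CONSTRAINT UQ_CustomerOrder_OrderNumber UNIQUE,           -- declared alternate key
    OrderStatusId tinyint  NOT NULL
        CONSTRAINT FK_CustomerOrder_OrderStatus
        REFERENCES dbo.OrderStatus (OrderStatusId),               -- declared referential integrity
    Quantity      smallint NOT NULL
        CONSTRAINT CK_CustomerOrder_Quantity CHECK (Quantity > 0),
    IsRushOrder   bit      NOT NULL
        CONSTRAINT DF_CustomerOrder_IsRushOrder DEFAULT (0)       -- bit instead of a wider flag column
);
```

With constraints declared on the table itself, an application bug cannot store an unknown status or a non-positive quantity, which is the point of "never trust code outside of the table".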
    12. Clustered Index
        • Your table should have one
          • Except in very specific, well-documented cases, it will be faster
        • The primary key is usually NOT the best choice
          • It is only the default
        • The best choice can only be determined by usage
          • If usage determines the PK to be the best choice, then it is!
        • Always keep partitioning in mind
          • It should be your (range) scan key
    13. Non-Clustered Indexes
        • OLTP vs. OLAP
        • Avoid having more indexes than data...
          • This is what makes a lot of databases SLOW²
        • Think about scan vs. seek
        • Think about entry points
        • Be careful with:
          • Composite indexes with more than 2 columns
            • ABC <> BCA <> BAC -> if you're not careful you'll be creating all 3
          • Included columns
            • Don't include 90% of your table
          • Filtered indexes
            • Know your logic and test!
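The index options named on this slide can be sketched as follows. This is a hedged example on a hypothetical dbo.SalesOrder table (the names are assumptions, not from the deck) showing a composite key, an included column, and a filtered index.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE dbo.SalesOrder
(
    SalesOrderId int   NOT NULL CONSTRAINT PK_SalesOrder PRIMARY KEY,
    CustomerId   int   NOT NULL,
    OrderDate    date  NOT NULL,
    ShippedDate  date  NULL,
    TotalDue     money NOT NULL
);

-- Composite key: column order matters. This index supports seeks on
-- CustomerId, or on CustomerId + OrderDate, but not on OrderDate alone.
CREATE NONCLUSTERED INDEX IX_SalesOrder_CustomerId_OrderDate
    ON dbo.SalesOrder (CustomerId, OrderDate)
    INCLUDE (TotalDue);   -- one included column, enough to cover the query below

-- Filtered index: only rows for unshipped orders are stored in the index.
CREATE NONCLUSTERED INDEX IX_SalesOrder_Unshipped
    ON dbo.SalesOrder (OrderDate)
    WHERE ShippedDate IS NULL;

-- A query the first index can answer without touching the base table.
SELECT  so.OrderDate,
        so.TotalDue
FROM    dbo.SalesOrder AS so
WHERE   so.CustomerId = 42
  AND   so.OrderDate >= '20120101';
```

Whether the optimizer actually picks these indexes depends on data volume and statistics, so check the query plan against a realistic dataset.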
    14. Think about...
        • NULLs
          • Generate quite some overhead
          • Have a meaning <> 'None'
        • Data types
          • Don't overuse (n)varchar(max); think about the content
          • Example: telephone numbers (consisting of 4 blocks that can all have a leading 0), E.164 standard
            • Country code (max 3) | region code + number (max 15) | extension (max 4)
            • '00999-(0)1-123.12.23 ext1234' [varchar(33)] (2+33 bytes = 35 bytes)
            • '+99911231223', '1234' [varchar(18)] + [varchar(4)] (2+18 + 2+4 bytes = 26 bytes)
            • tinyint, smallint | tinyint, tinyint | tinyint, int, int (1+2+1+1+1+4 (+4) = 10 + 4 bytes)
            • Length, value | length, value | length, value | extension -> other table (to avoid NULLs)
    15. Bad Data Types -> Avoid
        • TEXT
          • String functions are limited
          • Indexing becomes useless
          • LARGE
        • NTEXT
          • … No comment
        • FLOAT, REAL
          • Approximate numeric values
          • Not exact!
          • Can give "funny" errors: 1E-999 <> 0
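The "not exact" remark about FLOAT can be tried with a short script. This is a small sketch with assumed variable names; on IEEE 754 doubles the sum of 0.1 and 0.2 is not exactly 0.3, so the float comparison is expected to report a mismatch while the decimal comparison matches.

```sql
DECLARE @f1 float = 0.1, @f2 float = 0.2, @f3 float = 0.3;
DECLARE @d1 decimal(9,2) = 0.1, @d2 decimal(9,2) = 0.2, @d3 decimal(9,2) = 0.3;

SELECT
    -- Approximate type: the binary sum differs slightly from the binary value of 0.3.
    CASE WHEN @f1 + @f2 = @f3 THEN 'equal' ELSE 'not equal' END AS FloatComparison,
    -- Exact type: decimal arithmetic keeps the values exact.
    CASE WHEN @d1 + @d2 = @d3 THEN 'equal' ELSE 'not equal' END AS DecimalComparison;
```

For money, quantities, and anything compared for equality, an exact type such as decimal is usually the safer choice.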
    16. Char vs. Varchar
        Action                | Char                              | Varchar
        Length                | Known                             | Unknown
        Fragmentation         | Easier to control                 | Bad with updates
        Flexibility           | None (from 1 to 8000)             | From 1 to MAX
        Frequent updates      | Size is allocated                 | Needs to resize/split
        Indexable             | Supports online                   | Depends
        Null size             | Full size is allocated + overhead | Overhead
        Avoid (when possible) | Empty space / NULLs               | MAX
    17. SET-Based
        • SQL is a set-based language
          • The optimizer is designed to do just that
        • Batch mode
          • A batch typically represents about 1000 rows of data
          • Optimized for multicore CPUs and increased memory throughput
          • Batch mode processing spreads metadata costs and overhead
    18. UDFs
        • User-defined functions
          • Make code easier to read
          • Make code easier to write
          • Be careful with non-deterministic functions
          • Can have a very negative impact on performance
    19. Select *
        • Never use SELECT *
        • Avoid operators that don't use your indexes
        • Explicit column lists
          • Are less error-prone
          • Are easier to debug
          • Reduce disk I/O
          • Are more maintainable
          • Keep working when columns are added or repositioned
    20. 20. Always Use Begin and END  Even if it only contains one statement Use schema name  There is a slight performance improvement  Makes code more readable Use table alias  Even when not joining  Eliminated ambiguity  Reduce typo chance  Assist intellisence Set Nocount on
    21. 21. Always Use ANSI join syntax  TSQL join syntax can return incorrect results  Is deprecated  Easier to read
    22. Avoid
        • Table hints
        • Index hints
        • Join hints
        • Lock hints (this should be done on a higher level)
        • It is very rare for the optimizer not to choose the best plan
        • Triple-check your query (and do so with the full dataset)
        • Hints break your DBA's ability to tune the database
    23. Be Careful With
        • Dynamic SQL
          • If used wrongly, it will perform slower
          • Increased security risks because it does not take part in ownership chaining
        • @@IDENTITY
          • Can return wrong values if used in combination with triggers
          • Use SCOPE_IDENTITY() or IDENT_CURRENT() instead
        • TRUNCATE
          • Minimally logged
          • Doesn't fire triggers
          • Cannot be used with schema binding
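The @@IDENTITY pitfall is easy to reproduce. The sketch below uses hypothetical tables and a hypothetical trigger; `GO` is the batch separator understood by SSMS and sqlcmd, not a T-SQL statement.

```sql
CREATE TABLE dbo.Shipment
(
    ShipmentId int IDENTITY(1,1) NOT NULL CONSTRAINT PK_Shipment PRIMARY KEY,
    CustomerId int NOT NULL
);

CREATE TABLE dbo.ShipmentAudit
(
    AuditId    int IDENTITY(1000,1) NOT NULL CONSTRAINT PK_ShipmentAudit PRIMARY KEY,
    ShipmentId int NOT NULL
);
GO

-- The trigger inserts into a second table that has its own identity column.
CREATE TRIGGER dbo.trg_Shipment_Insert ON dbo.Shipment
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.ShipmentAudit (ShipmentId)
    SELECT i.ShipmentId FROM inserted AS i;
END;
GO

INSERT INTO dbo.Shipment (CustomerId) VALUES (42);

SELECT  @@IDENTITY       AS LastIdentityAnyScope,   -- identity generated inside the trigger (audit row)
        SCOPE_IDENTITY() AS LastIdentityThisScope;  -- identity generated for dbo.Shipment
```

Note that IDENT_CURRENT('table') is limited neither to the current scope nor to the current session, so under concurrent inserts SCOPE_IDENTITY() is usually the safer of the two alternatives.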
    24. Stored Procedures
        • Anticipate debugging
          • You can add a @Debug flag that talks or logs more
        • Make sure your stored procedures return values
        • Call SPs with their parameter names
          • Easier to read
          • Less error-prone, because the parameter order can change
        • Error handling
        • Handle your nested transactions!
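A hedged sketch of a procedure that follows these points. The procedure name, the dbo.Invoice table, and its IsClosed column are assumptions made for the example. Nested-transaction handling is deliberately simplified: the CATCH block rolls back the whole transaction, including any transaction opened by the caller.

```sql
-- Assumes a hypothetical table dbo.Invoice (InvoiceId int, IsClosed bit).
CREATE PROCEDURE dbo.CloseInvoice
    @InvoiceId int,
    @Debug     bit = 0
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @Rows int;

    BEGIN TRY
        BEGIN TRANSACTION;

        IF @Debug = 1
            PRINT 'Closing invoice ' + CAST(@InvoiceId AS varchar(11));

        UPDATE dbo.Invoice
        SET    IsClosed = 1
        WHERE  InvoiceId = @InvoiceId;

        SET @Rows = @@ROWCOUNT;

        COMMIT TRANSACTION;

        IF @Rows = 0
            RETURN 1;              -- nothing updated: no such invoice
        RETURN 0;                  -- success
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0
            ROLLBACK TRANSACTION;
        THROW;                     -- re-raise the original error (available from SQL Server 2012)
    END CATCH;
END;
GO

-- Call with named parameters and capture the return value.
DECLARE @rc int;
EXEC @rc = dbo.CloseInvoice @InvoiceId = 42, @Debug = 1;
SELECT @rc AS ReturnCode;
```

Because the call uses parameter names, adding or reordering parameters in the procedure later does not silently change what the caller passes.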
    25. Temp Tables vs. Table Variables vs. Table Parameters
        • Size does matter
        • Test!
        • Consider derived tables or CTEs
        • Never forget I/O and scaling
        • Check your query plans
        • Think carefully about the order of execution
          • Take indexing into consideration
          • Query plan regeneration
          • Default values
    26. Avoid
        • String = "Expression"
          • Both in SELECTs and in WHERE clauses
        • Be careful with NULLs
          • A NULL value has a meaning
          • And it doesn't mean "default" or "not available"
    27. ANSI/ISO Standards
        • Use ANSI standards where possible
          • ISNULL vs. COALESCE
          • CURRENT_TIMESTAMP vs. GETDATE()
          • ROWVERSION vs. TIMESTAMP
        • ANSI settings -> ON
          • ANSI_NULLS
          • ANSI_PADDING
          • ANSI_WARNINGS
          • ARITHABORT
          • CONCAT_NULL_YIELDS_NULL
          • QUOTED_IDENTIFIER
          • NUMERIC_ROUNDABORT -> should be OFF
        • Always format your date and time using ISO standards
          • YYYY-MM-DDTHH:MM:SS
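The settings and ANSI alternatives on this slide can be sketched in one short script. Variable names and values are assumptions for illustration.

```sql
-- Session settings as recommended on the slide.
SET ANSI_NULLS, ANSI_PADDING, ANSI_WARNINGS, ARITHABORT,
    CONCAT_NULL_YIELDS_NULL, QUOTED_IDENTIFIER ON;
SET NUMERIC_ROUNDABORT OFF;

DECLARE @PreferredName varchar(30) = NULL,
        @FirstName     varchar(30) = 'Karel';

SELECT  COALESCE(@PreferredName, @FirstName)       AS DisplayName,   -- ANSI; ISNULL is T-SQL specific
        CURRENT_TIMESTAMP                          AS NowAnsi,       -- ANSI equivalent of GETDATE()
        CAST('2012-03-15T13:45:00' AS datetime)    AS IsoLiteral,    -- ISO 8601 literal, not affected by language or DATEFORMAT
        CONVERT(char(19), CURRENT_TIMESTAMP, 126)  AS IsoFormatted;  -- yyyy-mm-ddThh:mi:ss
```

The ISO 8601 form with the `T` separator is interpreted the same way regardless of session language, which is why it is the safe choice for date-time literals.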
    28. Part 2 - 2012 Specific
        • AlwaysOn
        • Columnstore Indexes
        • Contained Databases
        • FileTable
        • AlwaysOn vs. Clustering vs. Mirroring
    29. AlwaysOn
    30. AlwaysOn
        • Superior to Mirroring (deprecated)
        • Pros
          • Good wizard
          • Good dashboards
          • Same responsiveness in failover
          • Only one IP address
          • Multiple replicas
          • Readable replicas
          • Drop the [#@!*] snapshots
        • Cons
          • Same overhead
          • Same maintenance problems
          • Even more sensitive to bad database design
    31. AlwaysOn: Be Careful With
        • Snapshot isolation
        • Repeatable read (LOCKS!)
        • Logins
        • Creating indexes for reporting on live databases
          • Overhead
        • Backups on the secondary
          • Copy-only for the time being
        • TF9532 (enable multiple replicas in AlwaysOn)
        • Keep your settings compatible (e.g. TFs)
        • Bulk load isn't supported
    32. AlwaysOn: Solutions
        • CRUD overhead
          • Partition!
        • Maintenance overhead
          • Partition!
        • No "good" indexes for reporting vs. overhead for OLTP
          • Partition!
        • Users/logins/SIDs
          • Partition! (kidding)
          • Use Windows Authentication
          • Use sp_help_revlogin and automate it!
        • Be careful with maintenance plans
    33. AlwaysOn: Performance Benefits
        • Benefits hugely from being combined with:
          • Resource Governor
          • Compression
          • Non-wizard maintenance
          • Read-only partitions
          • Dedicated data network
          • Local (SSD) storage
          • Documentation
          • PARTITIONING
    34. Columnstore Indexes: Fundamentals
        • Stores data in a highly compressed format, with each column kept in a separate group of pages
        • Uses the vector-based query execution method called "batch processing"
        • Segment elimination
        • The engine pushes filters down into the scans
        • Makes the table/partition read-only
        • The key to performance is to make sure your queries process the large majority of data in batch mode
    35. Columnstore Indexes: Do's & Don'ts
        • Do's
          • Only on large tables
          • Include every column
          • Star joins with grouping and aggregation
          • BATCH mode
          • On the OLAP part of your database
        • Don'ts
          • String filters on columnstore indexes
          • OUTER/CROSS JOIN
          • NOT IN
          • UNION ALL
          • ROW mode
          • On the OLTP part of your database
    36. Columnstore Indexes: Maximize Performance
        • Resource Governor
          • MAXDOP >= 2
        • CTEs
          • Work around NOT IN joins
          • Work around UNION ALL
        • Be careful with
          • EXISTS, IN -> inner joins
        • Data management
          • DROP/rebuild approach on data updates
        • Queries can become complex, but focus on batch mode
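The drop/rebuild approach and the "include every column" advice can be sketched as follows for SQL Server 2012, where a table with a nonclustered columnstore index is read-only while the index is enabled. The fact table and index names are hypothetical.

```sql
-- Hypothetical fact table.
CREATE TABLE dbo.FactSales
(
    DateKey     int   NOT NULL,
    ProductKey  int   NOT NULL,
    StoreKey    int   NOT NULL,
    Quantity    int   NOT NULL,
    SalesAmount money NOT NULL
);

-- Include every column the reporting queries touch.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_FactSales
    ON dbo.FactSales (DateKey, ProductKey, StoreKey, Quantity, SalesAmount);

-- Loading new data: disable the index, load, then rebuild it.
ALTER INDEX NCCI_FactSales ON dbo.FactSales DISABLE;

INSERT INTO dbo.FactSales (DateKey, ProductKey, StoreKey, Quantity, SalesAmount)
VALUES (20120315, 1, 1, 2, 19.98);

ALTER INDEX NCCI_FactSales ON dbo.FactSales REBUILD;

-- Grouping and aggregation over the fact table, the query shape
-- that columnstore indexes are aimed at.
SELECT  fs.ProductKey,
        SUM(fs.SalesAmount) AS TotalSales
FROM    dbo.FactSales AS fs
WHERE   fs.DateKey BETWEEN 20120101 AND 20121231
GROUP BY fs.ProductKey;
```

On a table this small the optimizer has no reason to choose a parallel batch-mode plan; check the actual execution mode in the query plan on realistic data volumes.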
    37. Contained Databases: Security
        • Disable the guest account
        • Duplicate logins
          • Sysadmins
          • Different passwords
          • Initial catalog
        • Containment status of a database
        • Attaching (RESTRICTED_USER mode)
        • Kerberos
        • Restrict access to the database file
        • Don't use auto close -> DoS attacks
        • Escaping contained databases
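Several of these points map to short statements. The sketch below is an illustration under assumptions: the database name, user name, and password are placeholders, and `GO` is the SSMS/sqlcmd batch separator.

```sql
-- Allow contained database authentication on the instance.
EXEC sys.sp_configure 'contained database authentication', 1;
RECONFIGURE;
GO

-- Hypothetical partially contained database.
CREATE DATABASE SalesContained CONTAINMENT = PARTIAL;
GO

-- Avoid auto close, as advised on the slide.
ALTER DATABASE SalesContained SET AUTO_CLOSE OFF;
GO

USE SalesContained;
GO

-- Contained user with a password (placeholder value: replace before use).
CREATE USER ReportReader WITH PASSWORD = N'Pl@ceholder-Ch4nge-Me';

-- Make sure the guest account cannot connect.
REVOKE CONNECT FROM guest;

-- Check the containment status of the database.
SELECT  d.name,
        d.containment_desc
FROM    sys.databases AS d
WHERE   d.name = N'SalesContained';
```

A contained user authenticates at the database level, so a login with the same name on the instance is a separate principal; that is the "duplicate logins" risk the slide warns about.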
    38. FileTable
        • Disable Windows indexing on these disk volumes
        • Disable generation of 8.3 names (command: FSUTIL BEHAVIOR SET DISABLE8DOT3 1)
        • Disable last file access time tracking (command: FSUTIL BEHAVIOR SET DISABLELASTACCESS 1)
        • Keep some space empty (say 15% for reference) on the drive if possible
        • Defragment the volume
        • Is supported in AlwaysOn!
          • If the property is enabled on all servers
          • Using VNNs
    39. AlwaysOn, Mirroring - Clustering - Log Shipping, Contained Databases, Columnstore Index
        • AlwaysOn complements these technologies
          • In a way, AlwaysOn replaces Mirroring (deprecated)
        • Clearly a step in a new direction
        • To use these technologies optimally
          • Part 1 best practices are very important
          • Your database design should be as optimal as possible
          • Partitioning becomes a MUST
          • Resource Governor becomes a MUST
          • You'll need the Enterprise edition
    40. Call to Action
        • Start giving feedback to your developers / 3rd-party vendors NOW
        • Start thinking about
          • Data flows
          • Data retention
          • Data management
          • Partitioning
          • Filegroups/files
          • Data tiering
        • Don't
          • Restrict your view to the boundary of a database
    41. Q&A