SQL Good Practices
Published in: Technology

  1. SQL Good Practices – Basic Concepts – Deepak Mehtani
  2. Background
      An RDBMS is not based on the regular programming paradigm; it operates on the mathematical concept of sets
      Sets are groups that support union, intersection, difference (–) and Cartesian product (×) operations
      Most misunderstood concept – are sets ordered?
      No: sets do not guarantee order – very important to know in terms of RDBMS
     11/7/2013
  3. Strategic Imperatives
      The approach should provide a stable implementation and ease of management
      Provide the ability to adopt new design paradigms moving forward
      Ease of maintenance of prior code by means of documentation and code clarity
  4. What’s New in SQL Server 2012
      Editions & licensing – three editions: Standard, Business Intelligence and Enterprise
      xVelocity – Microsoft SQL Server’s family of memory-optimized and in-memory technologies; next-generation technologies built for extreme speed on modern hardware systems with large memories and many cores
          xVelocity In-Memory Analytics Engine (used in PowerPivot and Analysis Services)
          xVelocity Memory-Optimized Columnstore Index (used in the SQL Server database)
      Self-service BI – Power View
      Data compression – high performance
      Data Quality – maintain the quality of data and ensure that the data is suited for business usage
  5. xVelocity – ColumnStore
      Column store – in a column store, values from a single column (from multiple rows) are stored contiguously, potentially in a compressed form
      Relational database management systems traditionally store data in row-wise fashion: the values comprising one row are stored contiguously on a page. Data stored in row-wise fashion is sometimes called a row store
      Columnstore index – in SQL Server, a columnstore index is data stored in column-wise fashion that can be used to answer a query just like data in any other type of index
      A columnstore index appears as an index on a table when examining catalog views or the Object Explorer in Management Studio
      The query optimizer considers the columnstore index as a data source for accessing data just like other indexes when creating a query plan
     More information: http://social.technet.microsoft.com/wiki/contents/articles/3540.sql-servercolumnstore-index-faq-en-us.aspx
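A minimal sketch of creating a columnstore index; the table and column names are hypothetical, not from the slides:

```sql
-- Hypothetical fact table columns; adjust names to your schema
CREATE NONCLUSTERED COLUMNSTORE INDEX IX_FactSales_cs
ON dbo.FactSales (OrderDateKey, ProductKey, SalesAmount);
```

Note that in SQL Server 2012, a table with a nonclustered columnstore index becomes read-only until the index is dropped or disabled, so these indexes suit data warehouse fact tables loaded in batches.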
  6. Power View
      Power View is a feature of SQL Server 2012 Reporting Services that provides an interactive data exploration, visualization, and presentation experience
      Provides intuitive ad-hoc reporting for business users such as data analysts, business decision makers, and information workers
      Users can easily create and interact with views of data from data models based on PowerPivot workbooks published in a PowerPivot Gallery, or tabular models deployed to SQL Server 2012 Analysis Services (SSAS) instances
      Power View is a browser-based Silverlight application launched from SharePoint Server 2010 that enables users to present and share insights with others in their organization through interactive presentations
  7. Data Compression
      DBAs can compress tables and indexes to conserve storage space at a slight CPU cost. One of the main design goals of data compression was to shrink data warehouse fact tables
      SQL Server provides two methods, page and row compression, to reduce data storage on disk and speed I/O performance by reducing the amount of I/O required for transaction processing. Page and row compression work in different, yet complementary, ways
      Page compression uses a “deduplication” approach: as the name implies, SQL Server looks for duplicate values that appear again and again within the data page
      Using page compression, SQL Server can remove such duplicate values within a data page by replacing each duplicate value with a tiny pointer to a single appearance of the full value
      By comparison, row compression does not actually use a compression algorithm per se. Instead, when row compression is enabled, SQL Server simply removes any extra, unused bytes in a fixed data type column, such as a CHAR(50) column
      Page and row compression are not enabled separately on the same object: enabling page compression automatically includes row compression
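The two compression levels described above are enabled with a rebuild; a minimal sketch, assuming a hypothetical `dbo.FactSales` table and index:

```sql
-- Page compression (automatically includes row compression)
ALTER TABLE dbo.FactSales
REBUILD WITH (DATA_COMPRESSION = PAGE);

-- Row compression only, applied to a single index
ALTER INDEX IX_FactSales_Date ON dbo.FactSales
REBUILD WITH (DATA_COMPRESSION = ROW);
```

Before committing to either, `sp_estimate_data_compression_savings` can estimate the space saved for a given table and compression setting.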
  8. Data Quality Services
      Aggregating data from different sources that use different data standards can result in inconsistent data, as can applying an arbitrary rule or overwriting historical data. Incorrect data affects the ability of a business to perform
      Features include:
          Data Cleansing – modification of incorrect or incomplete data
          Matching – identification of semantic duplicates
          Reference Data Services – verification of the quality of data using a reference data provider
          Profiling – analysis of a data source to provide insight into the quality of the data
          Monitoring – tracking and determination of the state of data quality activities
          Knowledge Base – Data Quality Services is a knowledge-driven solution that analyzes data based upon knowledge that you build with DQS
  9. Layered Coding Approach
      Standardized way of coding
      Code-level comments
      Layered coding – database, business and user interface layers
      Standardized approach at both the application and database level
      Primary focus of this presentation is database design and coding
  10. Resource Utilization
      Non-scalable code – only so many people can access at the same time
      This is caused by:
          Inefficient use of resources
          Resource locking – unnecessary use of locks, or transactions that were never committed or rolled back
  11. Unstructured Data – Paradigm Shift
      Non-scalable code:
          Not only reduces performance
          Increases maintenance
          Requires more space – disk space, disk defragmentation
          Requires continuous log file management, sizing and maintenance
          Increases maintenance cost for the database
  12. Basic Design – That We Forget
      Defining a table – key points:
          Identity columns
          Define a primary key (if one is not easily defined, use an identity column)
          Define indexes
      Huge performance difference when using an index and a primary key
      Also helps in joining other tables
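A minimal sketch of these basics; the table and column names are illustrative only:

```sql
-- Identity column serving as the primary key
CREATE TABLE dbo.Customer
(
    CustomerId INT IDENTITY(1,1) NOT NULL,
    Email      VARCHAR(255)      NOT NULL,
    CreatedOn  DATETIME          NOT NULL DEFAULT (GETDATE()),
    CONSTRAINT PK_Customer PRIMARY KEY CLUSTERED (CustomerId)
);

-- Nonclustered index to support lookups and joins on Email
CREATE NONCLUSTERED INDEX IX_Customer_Email ON dbo.Customer (Email);
```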
  13. Truncate or Delete
      Truncate vs. delete – using TRUNCATE is better than DELETE
      Why?
      What do you think is better for import tables?
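A sketch of the difference, using a hypothetical staging table. DELETE logs each removed row and honors a WHERE clause; TRUNCATE deallocates whole pages, is minimally logged, and resets any IDENTITY seed, which is why it is usually preferred for clearing import tables:

```sql
-- Row-by-row, fully logged, can be filtered
DELETE FROM dbo.ImportStaging WHERE BatchId = 42;

-- Page deallocation, minimally logged, resets IDENTITY;
-- cannot filter rows and fails if the table is referenced by a foreign key
TRUNCATE TABLE dbo.ImportStaging;
```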
  14. Looping in SQL
      Cursors – a necessary evil
      Resource hog – disk, network bandwidth (for result transmission line by line)
      If there is no update, use a read-only cursor with the fast-forward option and auto fetch to get some performance gain
          http://technet.microsoft.com/en-us/library/aa172573(SQL.80).aspx
      Avoid cursors as much as possible
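When a cursor is unavoidable and read-only, the FAST_FORWARD option mentioned above looks roughly like this (table and column names are hypothetical):

```sql
-- FAST_FORWARD implies a read-only, forward-only cursor with performance optimizations
DECLARE cur CURSOR FAST_FORWARD FOR
    SELECT CustomerId, Email FROM dbo.Customer;

DECLARE @id INT, @email VARCHAR(255);
OPEN cur;
FETCH NEXT FROM cur INTO @id, @email;
WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT @email;  -- per-row work goes here
    FETCH NEXT FROM cur INTO @id, @email;
END
CLOSE cur;
DEALLOCATE cur;
```

Even so, a single set-based statement is almost always faster than looping row by row.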
  15. Transactions
      Helpful in recovering from a problem / unexpected situation
      Transactions should start as late as possible in the procedure and end as early as possible, to reduce the locking time
      Make sure transactions are always rolled back or committed
      Handle errors using @@ERROR and roll back the transaction in such a case
      Use the WITH MARK option to add a name to the transaction log that can be used as a restore point if needed
          http://msdn.microsoft.com/en-us/library/ms188929.aspx
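A sketch combining the @@ERROR check and the WITH MARK option; the transaction name, mark description, and table are hypothetical:

```sql
-- Named transaction; WITH MARK records a named restore point in the log
BEGIN TRANSACTION ImportBatch WITH MARK 'Nightly import';

UPDATE dbo.Account SET Balance = Balance - 100 WHERE AccountId = 1;

IF @@ERROR <> 0
BEGIN
    ROLLBACK TRANSACTION ImportBatch;  -- undo everything on error
    RETURN;
END

COMMIT TRANSACTION ImportBatch;
```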
  16. Locks & Deadlocks
      A deadlock occurs when two users have locks on separate objects and each user is trying to lock the other user’s resource
      Deadlocks are automatically detected and resolved by SQL Server, with one of the transactions rolled back with error code 1205
      Using NOLOCK may be helpful – it does not lock a record for read or write
          Advantage?
          Pitfall?
      Read-committed snapshot isolation is a better option than NOLOCK
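A sketch of the two options contrasted above, with a hypothetical table and database name:

```sql
-- NOLOCK: takes no shared locks, so it never blocks writers,
-- but it can return dirty (uncommitted, later rolled-back) rows
SELECT COUNT(*) FROM dbo.Orders WITH (NOLOCK);

-- Read-committed snapshot: readers see a consistent committed version
-- without blocking writers, avoiding dirty reads entirely
ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON;
```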
  17. Temporary Tables
      Using # tables vs. @table variables
          # tables – pros and cons?
          @table variables – pros and cons?
      Which one should be used in a stored procedure that is called very frequently?
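A sketch of the two forms being compared (illustrative columns):

```sql
-- #temp table: lives in tempdb, can have additional indexes added,
-- and has statistics, which helps the optimizer on larger row counts
CREATE TABLE #Staging (Id INT PRIMARY KEY, Amount MONEY);

-- @table variable: no statistics and limited indexing, but lighter weight;
-- often the better fit for small row counts in frequently called procedures
DECLARE @Staging TABLE (Id INT PRIMARY KEY, Amount MONEY);
```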
  18. User Defined Functions
      User Defined Functions (UDFs) are useful for calculations or results based on some input
      UDFs are slower than built-in functions
      What will happen if we use a UDF in a:
          SELECT statement?
          Join condition?
          WHERE clause?
          Any of the above in a while loop or cursor?
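A hypothetical scalar UDF illustrating the cost: it executes once per row, and in a WHERE clause or join condition it can also prevent index use on the wrapped column:

```sql
CREATE FUNCTION dbo.fn_FullName (@first VARCHAR(50), @last VARCHAR(50))
RETURNS VARCHAR(101)
AS
BEGIN
    RETURN @first + ' ' + @last;
END
GO

-- Invoked once per row returned by the SELECT
SELECT dbo.fn_FullName(first_name, last_name) AS [name] FROM users;
```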
  19. Capturing Errors
      Using @@ERROR (also trap errors at the application level, to close the SQL connection and roll back transactions)
      Checking the @@ERROR variable after an insert and / or update can help us determine whether we need to roll back or commit a transaction
      Using the TRY-CATCH construct:
         BEGIN TRY
             -- Generate divide-by-zero error.
             SELECT 1/0;
         END TRY
         BEGIN CATCH
             -- Execute error retrieval routine.
             EXECUTE usp_GetErrorInfo;
         END CATCH;
  20. Joins vs. EXISTS / IN and NOT EXISTS / NOT IN
      Use left joins with null conditions instead of NOT EXISTS / NOT IN
      Example:
         SELECT z.first_name + ' ' + z.last_name AS [name], z.[user_id]
         FROM users z
         INNER JOIN usergroup zug ON z.[user_id] = zug.[user_id]
             AND (z.first_name + z.last_name IS NOT NULL)
             AND z.[user_id] NOT IN (SELECT [user_id] FROM Superuser)
         INNER JOIN groups zg ON zug.group_id = zg.group_id
         WHERE zg.[name] = 'ADMINISTRATOR'
         ORDER BY [name]
      We could replace the NOT IN with a left join and an IS NULL condition
      Joins are faster than IN and NOT IN, and can use an index
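The slide's NOT IN could be rewritten as the suggested left join along these lines (a sketch; it assumes `Superuser.[user_id]` is non-nullable, since NOT IN and the anti-join differ when NULLs are present):

```sql
SELECT z.first_name + ' ' + z.last_name AS [name], z.[user_id]
FROM users z
INNER JOIN usergroup zug ON z.[user_id] = zug.[user_id]
INNER JOIN groups zg     ON zug.group_id = zg.group_id
LEFT JOIN Superuser su   ON z.[user_id] = su.[user_id]
WHERE zg.[name] = 'ADMINISTRATOR'
  AND z.first_name + z.last_name IS NOT NULL
  AND su.[user_id] IS NULL        -- keeps only users with no Superuser match
ORDER BY [name];
```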
  21. Low Hanging Fruits
      Should SELECT * be used frequently?
      SCOPE_IDENTITY() vs. SELECT MAX()?
      Utilize Query Analyzer / Management Studio to view the query plan
          Analyze for optimization and potential scalability problems
          Analyze any potential bottleneck / blocking; use SQL trace
      Avoid dynamic queries
      Use joins, preferably on primary key and / or indexed columns
      Do not use sp_ in the name of a stored procedure
          The first reference is checked in the master database
      Using COUNT(*) vs. COUNT(primary key column)
          Needs an index on the ID column
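On the SCOPE_IDENTITY vs. SELECT MAX question above, a sketch assuming a hypothetical table with an identity column:

```sql
INSERT INTO dbo.Customer (Email) VALUES ('a@example.com');

-- SCOPE_IDENTITY() returns the identity value generated in the current scope;
-- SELECT MAX(CustomerId) could return a row inserted concurrently by another session
SELECT SCOPE_IDENTITY() AS NewCustomerId;
```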
  22. Sub-Queries
      Avoid sub-queries if possible; they can be a resource hog
      Example fragment (from ClaimBatch_listDetailXml):
         (((@searchClaimType = 'ALL' AND @userId IN (
             SELECT UserId FROM NSFClaimTrack nct WHERE nct.UserId = @userId
  23. Conclusion
      These guidelines will help in:
          Building a robust product
          A standardized way of working for all programmers
          Ease of understanding the code
          Clear understanding of the logic
          Easy maintenance
  24. Avoid This!