DATABASE DEVELOPMENT AND CODING STANDARDS
SQL & Database Guidelines
INDEX

1. NAMING CONVENTIONS
2. DECLARING VARIABLES
3. SELECT STATEMENTS
4. CURSORS
5. WILDCARD CHARACTERS
6. NOT EQUAL OPERATORS
7. DERIVED TABLES
8. SQL BATCHES
9. ANSI-STANDARD JOIN CLAUSES
10. STORED PROCEDURES NAMING CONVENTION
11. USING VIEWS
12. TEXT DATA TYPES
13. INSERT STATEMENTS
14. ACCESSING TABLES
15. STORED PROCEDURE RETURNING VALUES
16. OBJECT CASE
17. T-SQL VARIABLES
18. OFFLOAD TASKS
19. CHECK FOR RECORD EXISTENCE
20. OBJECT OWNER
21. UPSERT STATEMENTS
22. DATETIME COLUMNS
23. MEASURE QUERY PERFORMANCE
24. INDEXES
1 - Naming Conventions

All T-SQL keywords must be upper case.

All declared variable names must be Camel Case, while all stored procedure names, function names, trigger names, table names and column names in queries must be Pascal Case.

All view names must start with the letter ‘v’ followed by the name of the view in Pascal Case.

Example:

    SELECT ID, FirstName FROM Employee WHERE ID = 2
    DECLARE @minSalary INT
    CREATE PROCEDURE GetEmployees

If you are creating a table that belongs to a specific module, prepend a three-character prefix to the name of each table, for example:

    LABResult
    LABSpecimen
    LABOrder
    RADImage
    RADResult

Note that all table names must be singular.

When creating columns, append ‘_F’ to the end of each column you intend to use as a flag. If the flag has exactly two statuses, use the ‘bit’ data type; if it has three or more statuses, use the ‘char(1)’ data type. If the column is a foreign key reference, append ‘_FK’ to the end of the column name. This makes it easy to distinguish flag and foreign key columns:

    CREATE TABLE Employee
    (
        ID INT IDENTITY NOT NULL PRIMARY KEY,
        FirstName VARCHAR(MAX),
        Sex_F BIT,
        Person_FK INT,
        Status_F CHAR(1)
    )

2 - Declaring Variables

Always declare variables at the top of your stored procedure and set their values directly after declaration. If your database runs on SQL Server 2008, you can declare and set the variable on the same line. Compare the first group of statements below, written for SQL 2000/SQL 2005, with the second group, written for SQL 2008. SQL 2008 also adds the compound assignment operators familiar from standard programming languages:

    DECLARE @i INT
    SET @i = 1
    SET @i = @i + 1
    -------------------
    DECLARE @i INT = 1
    SET @i += 1
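
The declaration rule can be combined with the naming conventions from section 1 in a procedure skeleton. The following is only a sketch, assuming SQL Server 2008 or later and the Employee table shown in section 1. The procedure name GetEmployeesByStatus, its parameter, the status code 'A' and the row limit are hypothetical values chosen for illustration.

```sql
-- Hypothetical procedure illustrating the conventions; not part of any real schema.
CREATE PROCEDURE GetEmployeesByStatus
    @statusCode CHAR(1) = NULL
AS
BEGIN
    -- All local variables are declared and initialized at the top
    -- (inline initialization requires SQL Server 2008 or later).
    DECLARE @defaultStatus CHAR(1) = 'A';  -- assumed meaning: active
    DECLARE @maxRows INT = 100;            -- assumed row limit

    -- Fall back to the default when the caller passes no status.
    SET @statusCode = ISNULL(@statusCode, @defaultStatus);

    -- Keywords in upper case, variables in Camel Case, objects in Pascal Case.
    SELECT TOP (@maxRows) ID, FirstName
    FROM Employee
    WHERE Status_F = @statusCode
    ORDER BY ID;
END
```

A call such as EXEC GetEmployeesByStatus 'A' would then return at most @maxRows matching rows.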
3 - Select Statements

Do not use SELECT * in your queries. Always write the required column names after the SELECT keyword. This technique results in reduced disk I/O and better performance:

    SELECT CustomerID, CustomerFirstName, City FROM Customer

If you need to write a SELECT statement to retrieve data from a single table, don’t SELECT the data from a view that points to multiple tables. Instead, SELECT the data from the table directly, or from a view that only contains the table you are interested in. If you SELECT the data from the multi-table view, the query will experience unnecessary overhead, and performance will be hindered.

4 - Cursors

Try to avoid server-side cursors as much as possible. Always stick to a ‘set-based approach’ instead of a ‘procedural approach’ for accessing and manipulating data. Cursors can often be avoided by using SELECT statements instead.

If row-by-row processing is unavoidable, use a WHILE loop instead of a cursor. A WHILE loop is usually faster than a cursor. But for a WHILE loop to replace a cursor, you need a column (primary key or unique key) that identifies each row uniquely.

5 - Wildcard Characters

Try to avoid wildcard characters at the beginning of a word when searching with the LIKE keyword, as that results in an index scan, which defeats the purpose of an index. The first statement below results in an index scan, while the second statement results in an index seek:

    SELECT EmployeeID FROM Locations WHERE FirstName LIKE '%li'
    SELECT EmployeeID FROM Locations WHERE FirstName LIKE 'a%i'

6 - Not Equal Operators

Avoid searching with not-equal operators (<> and NOT), as they often result in table and index scans.

7 - Derived Tables

Use ‘derived tables’ wherever possible, as they often perform better.
Consider the following query to find the second highest salary from the Employees table:

    SELECT MIN(Salary)
    FROM Employees
    WHERE EmpID IN (SELECT TOP 2 EmpID FROM Employees ORDER BY Salary DESC)

The same query can be rewritten using a derived table, as shown below, and it performs twice as fast as the above query. Note that T-SQL requires an alias for a derived table:

    SELECT MIN(Salary)
    FROM (SELECT TOP 2 Salary FROM Employees ORDER BY Salary DESC) AS TopSalary

This is just an example, and your results might differ in different scenarios depending on the database design, indexes, volume of data, etc. So, test all the possible ways a query could be written and go with the most efficient one.

8 - SQL Batches

Use SET NOCOUNT ON at the beginning of your SQL batches, stored procedures and triggers in production environments.
This suppresses messages like ‘(1 row(s) affected)’ after executing INSERT, UPDATE, DELETE and SELECT statements. This improves the performance of stored procedures by reducing network traffic.

9 - ANSI-Standard Join Clauses

Use the more readable ANSI-standard join clauses instead of the old-style joins. With ANSI joins, the WHERE clause is used only for filtering data, whereas with old-style joins the WHERE clause handles both the join condition and the filtering. The first of the following two queries shows the old-style join, while the second one shows the ANSI join syntax:

    SELECT a.au_id, t.title
    FROM titles t, authors a, titleauthor ta
    WHERE a.au_id = ta.au_id
        AND ta.title_id = t.title_id
        AND t.title LIKE '%Computer%'
    ----------------------------------------------
    SELECT a.au_id, t.title
    FROM authors a
    INNER JOIN titleauthor ta ON a.au_id = ta.au_id
    INNER JOIN titles t ON ta.title_id = t.title_id
    WHERE t.title LIKE '%Computer%'

10 - Stored Procedures Naming Convention

Do not prefix your stored procedure names with “sp_”. The prefix sp_ is reserved for system stored procedures that ship with SQL Server. Whenever SQL Server encounters a procedure name starting with sp_, it first tries to locate the procedure in the master database, then it looks for any qualifiers (database, owner) provided, then it tries dbo as the owner.

So you can save time in locating the stored procedure by avoiding the “sp_” prefix.

11 - Using Views

Views are generally used to show specific data to specific users based on their interest. Views are also used to restrict access to the base tables by granting permission only on views. Yet another significant use of views is that they simplify your queries.

Incorporate your frequently required, complicated joins and calculations into a view so that you don’t have to repeat those joins/calculations in all your queries.
Instead, just select from the view.

12 - Text Data Types

Try not to use the TEXT or NTEXT data types for storing large textual data.

The TEXT data type has some inherent problems associated with it and will be removed from a future version of Microsoft SQL Server.

For example, you cannot directly write or update text data using the INSERT or UPDATE statements. Instead, you have to use special statements like READTEXT, WRITETEXT and UPDATETEXT.

There are also a lot of bugs associated with replicating tables containing text columns.

So, if you don’t have to store more than 8KB of text, use the CHAR(8000) or VARCHAR(8000) data types instead.

In SQL 2005 and 2008, you can use VARCHAR(MAX) for storing large amounts of textual data (up to 2 GB per value).
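
As a rough illustration of moving away from TEXT, the sketch below converts a column to VARCHAR(MAX) and then modifies it with ordinary DML. It assumes SQL Server 2005 or later; the table PatientNote and its columns ID and NoteBody are hypothetical names, not part of this standard.

```sql
-- Hypothetical table: PatientNote(ID INT, NoteBody TEXT).
-- Convert the legacy TEXT column to VARCHAR(MAX) in place.
ALTER TABLE PatientNote
    ALTER COLUMN NoteBody VARCHAR(MAX) NULL;

-- After the conversion the column works with ordinary string operations,
-- so READTEXT, WRITETEXT and UPDATETEXT are no longer needed.
UPDATE PatientNote
SET NoteBody = NoteBody + ' Follow-up scheduled.'
WHERE ID = 1;
```

On a large table the conversion may take time and hold locks, so it is worth trying on a copy of the data first.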
13 - Insert Statements

Always use a column list in your INSERT statements. This helps in avoiding problems when the table structure changes (like adding or dropping a column).

14 - Accessing Tables

Always access tables in the same order in all your stored procedures and triggers. This helps in avoiding deadlocks. Other things to keep in mind to avoid deadlocks are:

1. Keep your transactions as short as possible. Touch as little data as possible during a transaction.
2. Never, ever wait for user input in the middle of a transaction.
3. Do not use higher-level locking hints or restrictive isolation levels unless they are absolutely needed.
4. Make your front-end applications deadlock-intelligent, that is, these applications should be able to resubmit the transaction in case the previous transaction fails with error 1205.
5. In your applications, process all the results returned by SQL Server immediately so that the locks on the processed rows are released, hence no blocking.

15 - Stored Procedure Returning Values

Make sure your stored procedures always return a value indicating their status. Standardize on the return values of stored procedures for success and failures.

The RETURN statement is meant for returning the execution status only, not data. If you need to return data, use OUTPUT parameters.

If your stored procedure always returns a single-row result set, consider returning the result using OUTPUT parameters instead of a SELECT statement, as ADO handles output parameters faster than result sets returned by SELECT statements.

16 - Object Case

Always be consistent with the usage of case in your code.
On a case-insensitive server, your code might work fine, but it will fail on a case-sensitive SQL Server if your code is not consistent in case.

For example, if you create a table in SQL Server or a database that has a case-sensitive or binary sort order, all references to the table must use the same case that was specified in the CREATE TABLE statement.

If you name the table ‘MyTable’ in the CREATE TABLE statement and use ‘mytable’ in the SELECT statement, you get an ‘object not found’ error.

17 - T-SQL Variables

Though T-SQL has no concept of constants (like the ones in the C language), variables can serve the same purpose. Using variables instead of constant values within your queries improves the readability and maintainability of your code. Consider the following example:

    SELECT OrderID, OrderDate FROM Orders WHERE OrderStatus IN (5, 6)

The same query can be rewritten in a more readable form as shown below:

    DECLARE @ORDER_DELIVERED INT, @ORDER_PENDING INT
    SELECT @ORDER_DELIVERED = 5, @ORDER_PENDING = 6
    SELECT OrderID, OrderDate FROM Orders
    WHERE OrderStatus IN (@ORDER_DELIVERED, @ORDER_PENDING)

18 - Offload Tasks

Offload tasks like string manipulations, concatenations, row numbering, case conversions, type conversions etc. to the front-end applications if these operations are going to consume more CPU cycles on the database server.
Also try to do basic validations in the front-end itself during data entry. This saves unnecessary network roundtrips.

19 - Check for Record Existence

If you need to verify the existence of a record in a table, don’t use SELECT COUNT(*) in your Transact-SQL code to identify it, which is inefficient and wastes server resources. Instead, use the Transact-SQL IF EXISTS to determine if the record in question exists, which is much more efficient. For example:

Here’s how you might use COUNT(*):

    IF (SELECT COUNT(*) FROM table_name WHERE column_name = 'xxx') > 0

Here’s a faster way, using IF EXISTS:

    IF EXISTS (SELECT * FROM table_name WHERE column_name = 'xxx')

The reason IF EXISTS is faster than COUNT(*) is that the query can end immediately when the test is proven true, while COUNT(*) must count every matching record, whether there is only one or thousands, before the condition can be evaluated.

20 - Object Owner

For best performance, all objects that are called from within the same stored procedure should be owned by the same owner, preferably dbo. If they are not, then SQL Server must perform name resolution on the objects if the object names are the same but the owners are different. When this happens, SQL Server cannot reuse the stored procedure’s in-memory plan; instead, it must recompile the stored procedure, which hinders performance.

Also call stored procedures using their fully qualified names. There are a couple of reasons for this, one of which relates to performance. First, using fully qualified names helps to eliminate any potential confusion about which stored procedure you want to run, helping to prevent bugs and other potential problems. But more importantly, doing so allows SQL Server to access the stored procedure’s execution plan more directly, which in turn speeds up the stored procedure.
Yes, the performance boost is very small, but if your server is running tens of thousands or more stored procedures every hour, these little time savings can add up.

21 - Upsert Statements

SQL Server 2008 introduces upsert statements, which combine insert, update, and delete statements in one ‘MERGE’ statement.

Use the MERGE statement to synchronize two tables by inserting, updating, or deleting rows in one table based on differences found in the other table:

    MERGE table1 AS target
    USING (SELECT ID, Name FROM table2) AS source (ID, Name)
    ON (target.Table2ID = source.ID)
    WHEN NOT MATCHED BY SOURCE AND target.Name IS NULL THEN
        DELETE
    WHEN NOT MATCHED THEN
        INSERT (Name, Table2ID)
        VALUES (source.Name + ' not matched', source.ID)
    WHEN MATCHED THEN
        UPDATE SET target.Name = source.Name + ' matched'
    OUTPUT $action, inserted.ID, deleted.ID;

Note that a DELETE action is only allowed in a WHEN MATCHED or WHEN NOT MATCHED BY SOURCE clause; a plain WHEN NOT MATCHED clause can only INSERT.

22 - DateTime Columns

Always use the ‘datetime2’ data type in SQL 2008 instead of the classic ‘datetime’. datetime2 can save storage: it takes 6 to 8 bytes depending on the chosen precision, compared with a fixed 8 bytes for datetime. It also has a larger date range, a larger default fractional precision, and optional user-specified precision.

If your column is supposed to store only the date portion, use the ‘date’ data type, and if you want to store only the time portion, use the ‘time’ data type. Below are examples of what values of these data types look like:

    time              12:35:29.1234567
    date              2007-05-08
    smalldatetime     2007-05-08 12:35:00
    datetime          2007-05-08 12:35:29.123
    datetime2         2007-05-08 12:35:29.1234567
    datetimeoffset    2007-05-08 12:35:29.1234567 +12:15

23 - Measure Query Performance

Always use the statistics time feature to measure the performance of your important queries and stored procedures, and use its output to optimize them. Take a look at this example:

    SET STATISTICS TIME ON
    EXEC GetMedicalProcedures 1, 10
    SET STATISTICS TIME OFF

The information below will be displayed in the Messages tab:

    SQL Server parse and compile time:
       CPU time = 6 ms, elapsed time = 6 ms.
    SQL Server Execution Times:
       CPU time = 24 ms, elapsed time = 768 ms.
    (10 row(s) affected)
    SQL Server Execution Times:
       CPU time = 0 ms, elapsed time = 125 ms.
    SQL Server Execution Times:
       CPU time = 16 ms, elapsed time = 131 ms.

This provides a good estimate of how long the query took to execute, showing the CPU time (processing time) and the elapsed time (total wall-clock time, including waits such as I/O).

24 - Indexes

Create indexes on tables that are queried heavily with SELECT statements. Be careful about creating indexes on tables that are subject to frequent real-time changes through CRUD operations.

An index speeds up a SELECT statement if the indexed column is included in the query, especially if it is in the WHERE clause.
However, the same index slows down an INSERT statement whether or not the indexed column is included in the query. This downside occurs because the index must be maintained every time the data in the table changes. So use indexes wisely, for optimizing tables that have a high retrieval rate and a low change rate.
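
The trade-off can be illustrated with the Customer table used in section 3. This is only a sketch: the index name IX_Customer_City, the choice of City as the filter column and the literal 'Amman' are assumptions made for the example, and INCLUDE requires SQL Server 2005 or later.

```sql
-- Assumed scenario: Customer is read often and filtered by City,
-- but changes rarely, so an index on City is a reasonable candidate.
CREATE NONCLUSTERED INDEX IX_Customer_City
ON Customer (City)
INCLUDE (CustomerFirstName);

-- A query that filters on the indexed column can seek on the index
-- instead of scanning the whole table.
SELECT CustomerID, CustomerFirstName, City
FROM Customer
WHERE City = 'Amman';
```

Every INSERT, UPDATE or DELETE on Customer now also has to maintain IX_Customer_City, which is why the same index would be a poor fit for a table with a high change rate. Section 23 describes how to measure whether the index actually helps.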