WITH FOCUS ON SQL SERVER
BY ARNO HUETTER
About the Author
Arno wrote his first lines of code on a Sinclair ZX80 in
Over the years, he has been programming in C/C++,
Java and C#, and also did quite some database development.
Today he is Development Lead at Dynatrace (APM).
Background (Note: I am not a DBA. I only did some DB development)
Phoenix DB (Atari ST, storage: 3.5” floppy)
Learning (1992 - 1996):
University (80% ER modelling, 20% SQL, 0% DB internals *sighs*), Contract work
Oracle 5 (DOS), MS Access, 4th Dimension
Professional Phase 1 (1997 - 2001, still learning):
Internet Banking, Business Banking
Oracle 7 (DEC Alpha), Sybase
Professional Phase 2 (2002 - today, still learning):
Hospital Information Systems, Finance/Accounting Software, APM
Oracle 8/9 (Linux), SQL Server 2000/2005/2008/2012, Postgres
Most concepts presented here are vendor-independent, but with "SQL Server flavour"
1970: Edgar F. Codd (IBM) publishes paper "A Relational Model of Data for Large
Shared Data Banks".
1974: Raymond Boyce and Donald Chamberlin (IBM) write "SEQUEL: A Structured English Query Language".
1974 - 1977: IBM implements System/R, UC Berkeley creates Ingres (later: Postgres), the first two relational DBMS prototypes.
1976: Larry Ellison founds Oracle. Oracle's approach is based on Codd's IBM papers.
1977: Oracle 1 runs on PDP-11, using 128k memory (never officially released).
1978: IBM adds SQL to System/R. System/R eventually morphs into DB2.
1979: Oracle releases the first commercially available SQL database.
And Big Data?
Which database systems are in use at your company?
How many rows can you insert per sec?
Specification: SQL Server, row data on local client, 256 bytes per row, choose your table
design, provider, API. Now guess!
On a highly-tuned setup (SSIS, split load / parallelization, special hw):
1,000,000s of rows / sec
On your off-the shelf notebook (bulk insert, heap table or suited clustered index):
10,000s of rows / sec
Worst case I ever encountered on a production system (thousands of roundtrips for thousands
of rows within one transaction, poor clustered index choice and table design):
15 rows / sec
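For reference, a minimal T-SQL sketch of the "off-the-shelf" bulk path; table name, file path and options are hypothetical:

-- staging target: a heap table (no clustered index), so rows are simply appended
CREATE TABLE dbo.RowStage (Id INT, Payload VARCHAR(256));

-- bulk-load a local flat file in large batches; TABLOCK enables minimal logging
BULK INSERT dbo.RowStage
FROM 'C:\data\rows.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', TABLOCK, BATCHSIZE = 100000);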
Another real-life example
Problem: Query takes 18 min to execute. Table design given (no major flaws)
Joined every table that appears in the where clause, which led to a cartesian product (lots of
duplicates on to-N associations); applied "distinct" to get rid of the duplicates again in the resultset
Datatype conversions (e.g. datetime => varchar) prevented index usage
Invoked a non-deterministic user-defined function on every row (results can't be cached)
Did not take advantage of existing indices (although possible)
Fix: Replaced join duplicates / distinct by subqueries, ensured index seeks, fixed the non-deterministic UDF calls and datatype conversions
Query now finishes in 200 ms, speedup 5,400-fold
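A schematic before/after of that kind of rewrite (table and column names are made up for illustration):

-- before: join produces duplicates on the to-N association, removed again via DISTINCT
SELECT DISTINCT o.OrderId, o.OrderDate
FROM Orders o
JOIN OrderItems i ON i.OrderId = o.OrderId
WHERE i.Status = 1;

-- after: EXISTS subquery avoids the cartesian blow-up and can use an index seek on OrderItems
SELECT o.OrderId, o.OrderDate
FROM Orders o
WHERE EXISTS (SELECT 1 FROM OrderItems i WHERE i.OrderId = o.OrderId AND i.Status = 1);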
Slow Queries and Indices
Are indices the silver bullet? In many (trivial) cases yes, but they can backfire on write-heavy workloads.
Indices speed up data retrieval (no need to scan every row) at the cost of additional
writes and storage space. Also provide ordering, and can help to prevent locking.
Implemented as B-Trees (self-balanced, logarithmic access time), nodes usually match
operating system I/O page size (e.g. 8k)
Consider creating indices on columns used for narrowing where clauses and applied in
group-by, order-by and join expressions, which contain selective data (e.g. there is no
sense in indexing a "gender" column with two possible values), or which are used for
referential integrity checks.
Consider creating composite indices for columns queried together. The index column order
is decisive for what can be looked up, e.g. phonebook: idx(lastname, firstname) will allow
seeking by "lastname = ... AND firstname = ...", by "lastname = ...", but not by "firstname = ...".
Multiple single-column indices in contrast require multiple separate lookups and merging the results (see the DDL sketch below).
Make your index unique if that fits your data model. This helps the query optimizer further.
Indices should be kept small. Indexing a larger varchar column is probably not a good idea.
Indices have fill factors (used for leaving space in nodes to avoid frequent node splits),
typically between 70% (high insert rate) and 90% (low insert rate). Fill factors are applied on
index rebuilds. Index rebuilds must be scheduled by the DBA.
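A minimal DDL sketch of the guidance above (phonebook-style composite index, unique where possible, explicit fill factor); table and column names are hypothetical:

-- composite index: supports seeks on (lastname) and (lastname, firstname), but not on firstname alone
CREATE INDEX IX_Person_Name ON dbo.Person (LastName, FirstName);

-- unique index with a fill factor suited to a high insert rate
CREATE UNIQUE INDEX UX_Person_Email ON dbo.Person (Email) WITH (FILLFACTOR = 70);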
Each table has zero or one clustered index definition (by default: the primary key). The
clustered index is a b-tree that contains the actual row data in its leaves. If there is no clustered
index, we talk about a heap table where rows are simply appended at the end.
If the query optimizer would have to seek on an index over and over during a query, it may
decide to do one index scan instead of many index seeks.
Index seeks can not be applied on
type <> 3 -- negative search
lastname like '%...' -- '%' prepended
lastname + ' ' + firstname = '...' -- concatenation
-- (an index on a computed / expression column would help here)
CAST(FLOOR(CAST(date AS FLOAT)) AS DATETIME) > ... -- function / cast applied to the column
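Hedged examples of seek-friendly rewrites (the date-range pattern replaces the FLOOR/CAST day truncation; the IN list only works where the value domain is known):

-- positive / range search instead of negation, where the domain allows it
WHERE type IN (1, 2, 4)

-- half-open date range instead of casting the column to strip the time part
WHERE date >= '20240101' AND date < '20240102'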
An index contains the clustered index columns for quick lookup of actual data in clustered
index. So this is one indirection, except for...
... if an index contains all columns the query needs, the clustered index is not required for the lookup at all (covering index).
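A sketch of such a covering index using INCLUDE columns (hypothetical names):

-- the query SELECT OrderDate, Total FROM Orders WHERE CustomerId = @id
-- is fully answered from this index, no clustered index lookup needed
CREATE INDEX IX_Orders_Customer ON dbo.Orders (CustomerId) INCLUDE (OrderDate, Total);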
Indices - The Drawback
Over-indexing is a problem. Indices must be written on inserts, updates, deletes, this
can cost dearly.
The choice of the clustered index is an essential factor for performance, as too many
node splits should be prevented, esp. on huge bulk inserts and updates.
Autoinc values or a growing date are good choices for clustered indices as they only
fill up the final leaf. Guids are bad as they spread all over the index.
SQL Server introduced newsequentialid() for creating sequential Guids and preventing
excessive node splitting.
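A minimal sketch; note that NEWSEQUENTIALID() is only allowed as a column default:

CREATE TABLE dbo.Document (
    Id UNIQUEIDENTIFIER NOT NULL DEFAULT NEWSEQUENTIALID(), -- sequential, appends at the end of the clustered index
    Title NVARCHAR(200),
    CONSTRAINT PK_Document PRIMARY KEY CLUSTERED (Id)
);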
Each single row insert leads to a clustered index insert and N non-clustered index inserts.
Only create indices that are absolutely necessary for query performance. Prefer one
composite index to multiple single-column indices where applicable.
Superfast insert approach: Insert into a temporary heap table first (no indices, not even
clustered => always appended at the end), then issue an "insert-into-select" from the heap
table into the target table, ordering by target table clustered index.
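A sketch of that staging approach, assuming a target table clustered on an Id column:

-- 1. load into an index-less heap first (rows are simply appended, no node splits)
CREATE TABLE #Stage (Id INT, Payload VARCHAR(256));
-- ... bulk insert / batched inserts into #Stage ...

-- 2. move the data ordered by the target's clustered index key
INSERT INTO dbo.Target (Id, Payload)
SELECT Id, Payload FROM #Stage ORDER BY Id;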
Avoid join duplicates / cartesian products on to-N associations where not required for
the resultset. Often joins can be replaced by subqueries, e.g.:
where exists (select 1 from ...)
Prevent the N+1 query problem on to-N associations. Typically caused by applying O/R-mappers
the wrong way, but sometimes even implemented explicitly. Never run a query
within a loop.
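What the N+1 pattern looks like at the SQL level, versus a single set-based query (hypothetical tables):

-- N+1: one query for the parents, then one query per parent inside a client-side loop
SELECT CustomerId FROM Customers WHERE Country = 'AT';
SELECT * FROM Orders WHERE CustomerId = 17; -- repeated N times with different ids
-- ...

-- better: one roundtrip fetching all children at once
SELECT o.* FROM Orders o
JOIN Customers c ON c.CustomerId = o.CustomerId
WHERE c.Country = 'AT';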
Keep queries simple. If a query is overly complicated, chances are its execution is
complicated too. Sometimes it's advisable to not pack everything into one single query, but
issue two or three consecutive queries. One possibility to pass data between queries is by
using temp tables.
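A sketch of splitting one over-complicated query into consecutive steps via a temp table (names are illustrative):

-- step 1: materialize the small, pre-filtered intermediate result
SELECT CustomerId, SUM(Total) AS Revenue
INTO #TopCustomers
FROM Orders
WHERE OrderDate >= '20240101'
GROUP BY CustomerId
HAVING SUM(Total) > 10000;

-- step 2: a now much simpler follow-up query against the temp table
SELECT c.Name, t.Revenue
FROM #TopCustomers t
JOIN Customers c ON c.CustomerId = t.CustomerId;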
Have a look at the execution plan and verify it looks as expected, e.g. how indices are
applied. Hint: an "index scan" is not the same as an "index seek".
Execution plans are cached per statement. But: on an expression like this
(where the selectivity of a parameter varies heavily), reusing the same plan can kill performance:
where (lastname = @lastname or @lastname is null)
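One common mitigation for such "catch-all" predicates is to request a fresh plan per execution; a sketch:

SELECT * FROM Person
WHERE (LastName = @lastname OR @lastname IS NULL)
OPTION (RECOMPILE); -- plan is compiled for the actual parameter values, at the cost of a compile per call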
Query optimizer uses table statistics to choose an execution plan. Table statistics contain
metadata on column value distribution, etc. Not every column has statistic data by default,
but indices do. Statistic updates usually happen during index rebuild, or can be scheduled
by the DBA. Make sure table statistics are up to date.
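Refreshing statistics manually, e.g.:

-- refresh statistics for one table (optionally with a full scan)
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;

-- or refresh all statistics in the database
EXEC sp_updatestats;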
Transactions, ACID and Locking
A transaction symbolizes a unit of work performed against a database,
and treated in a coherent and reliable way, independent of other transactions.
There is always a transaction running. Statements without having an
explicit transaction are executed within a "single-statement" transaction.
ACID is a set of properties that guarantee that database transactions are processed reliably.
Locks are a means to implement ACID. Different operations require
different kinds of locks (simplified: shared (read), update (potential write),
exclusive (write)). They are acquired and released depending on the
isolation level (serializable, repeatable read, read committed, read
uncommitted), and only granted if the current lock state allows for it.
Otherwise the execution blocks until the lock can be obtained. Locks are
applied on a row-, page- or table-level, and on indices.
Transactions and Lock Tuning
Keep transactions as short as possible, as this reduces lock contention. Always commit or
rollback transactions immediately. Never wait for external input (worst case: waiting for user input) inside an open transaction.
Ensure that indices are being used. An index seek is more likely to prevent locking (row
locks can be bypassed, and index locks have much less contention).
Statements can provide specific lock hints (e.g. "with nolock") in case the default locking
behaviour needs to be relaxed or overridden.
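A sketch of such a hint; NOLOCK reads uncommitted data, so only use it where dirty reads are acceptable:

-- reporting query that tolerates dirty reads and therefore takes no shared locks
SELECT COUNT(*) FROM dbo.Orders WITH (NOLOCK) WHERE OrderDate >= '20240101';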
As far as possible, put queries at the beginning and inserts/updates/deletes at the end.
Start with the least congested tables, and end with the most congested ones.
Deadlock prevention: Try to access resources in the same order. DBs can detect deadlocks,
and will choose one deadlock victim transaction for rollback.
The DB keeps a transaction log for rollbacks, handling ungraceful shutdowns and incremental
backups. The transaction log should be on a dedicated physical disk (separate from data
files), with an optimized setup.
Design your schema for normalization, then de-normalize for speed, e.g. for complex
join constructs on huge tables and/or a lot of aggregated data.
Radical? But what if the DB could guarantee data consistency on such de-normalized data?
Actually that functionality exists: Indexed Views (Materialized Views) to the rescue!
By creating a unique clustered index on a view, the view gets "materialized", having its flat
data redundantly stored to the DB. One can then add more indices to the view.
Modifications made to base tables trigger modifications in the indexed view. This leads to a
similar drawback as with indices: Indexed views are fast for queries, but come at a
performance penalty for write operations, and require additional storage space. Hint: Put
an index on the base tables' primary key columns on the indexed view for quick lookup on
updates and deletes.
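A minimal indexed-view sketch (SQL Server requires SCHEMABINDING and COUNT_BIG for aggregating views; names are hypothetical, Total is assumed NOT NULL):

CREATE VIEW dbo.vOrderTotals WITH SCHEMABINDING AS
SELECT CustomerId, SUM(Total) AS Revenue, COUNT_BIG(*) AS RowCnt
FROM dbo.Orders
GROUP BY CustomerId;
GO
-- the unique clustered index materializes the view's data
CREATE UNIQUE CLUSTERED INDEX IX_vOrderTotals ON dbo.vOrderTotals (CustomerId);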
Data is divided into units that can be spread across multiple nodes / filegroups /
disks. This allows more parallel processing and improves I/O performance.
The partitioned table is treated as a single logical entity when queries or updates are executed.
A common approach is to use an autoinc primary key or a growing date column as
partition criteria. This often helps to have read and write operations occur on different data
ranges, hence different partitions.
Maintenance operations like index rebuilds or purging old data are also faster when running
on a per-partition basis.
Only makes sense for really large tables with certain data growth, and where queries are of
a kind to benefit from partitioning.
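A sketch of range partitioning by a growing date column (boundary values and filegroup choice are illustrative):

CREATE PARTITION FUNCTION pfByMonth (datetime2)
AS RANGE RIGHT FOR VALUES ('2024-01-01', '2024-02-01', '2024-03-01');

CREATE PARTITION SCHEME psByMonth
AS PARTITION pfByMonth ALL TO ([PRIMARY]); -- in practice: spread across filegroups / disks

CREATE TABLE dbo.Events (
    EventId BIGINT IDENTITY,
    CreatedAt DATETIME2 NOT NULL,
    Payload NVARCHAR(400)
) ON psByMonth(CreatedAt);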
Use bulk / batch SQL statements in order to avoid unnecessary server roundtrips.
Prefer to move data within the database (e.g. temp tables, insert-into-select) instead of
back and forth from the client.
Implement and invoke stored procedures (sometimes questionable from a design perspective).
Use Activity Monitor, Profiler, Tuning Advisor, dynamic management views / dynamic
performance views and other monitoring tools.
Put data files, tempdb files and transaction logs on separate physical disks, if necessary
even single heavily-used tables.
Historically most RDBMSs provided clustering mainly for failover via mirroring / data
replication. Several cluster solutions have since been extended to improve scalability as
well, e.g. Oracle RAC. On these scaling cluster systems nodes still share the same storage
(node sync requires fast cluster interconnect).
O/R-Mappers: Hibernate Tuning
Avoid join duplicates (AKA cartesian products) due to joins along two or more parallel to-
many associations; use Exists-subqueries, multiple queries or fetch="subselect"
instead - whatever is most appropriate in the specific situation. Join duplicates are already
pretty bad in plain SQL, but things get even worse when they occur within Hibernate,
because of unnecessary mapping workload and child collections containing duplicates.
Define lazy loading as the default association loading strategy, and consider applying
fetch="subselect" rather than "select" resp. "batch-size". Configure eager loading only for
special associations, but join-fetch selectively on a per-query basis.
In case of read-only services with huge query resultsets, use projections and fetch into
flat DTOs (e.g. via AliasToBean-ResultTransformer), instead of loading thousands of
mapped objects into the Session.
O/R-Mappers: Hibernate Tuning
Set ReadOnly to "true" on Queries and Criteria, when objects will never be modified.
Consider clearing the whole Session after flushing, or evict on a per-object basis, once
objects are no longer needed.
Define a suitable value for jdbc.batch_size (resp. adonet.batch_size).
Use Hibernate Query-Cache and Second Level Caching where appropriate (but make sure
you are aware of the consequences).
Set hibernate.show_sql to "false" and ensure that Hibernate logging is running at the
lowest possible loglevel (also check log4j/log4net root logger configuration).
Rules of thumb for server hardware are difficult; it depends heavily on how much "hot data"
is moved around, and on query load. Do your math and plan, measure KPIs (e.g. via SQL
Server Perfcounters) and adjust accordingly.
RAM: it's cheap, get as much as you can. I/O often is a bottleneck, e.g. misconfigured
SANs can kill performance. Use HW RAID. CPU: Enterprise editions can take advantage of
as many CPU cores as the OS supports.
Let's have a look at a real life example - stackoverflow.com:
SQL Server failover cluster, 2 nodes (plus one identical setup at another data center for even more redundancy)
Dell R730xd server
768GB RAM (the complete data can be held in memory)
6TB PCIe SSD