Introduction to SQL Server Partitioning

This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License
Kendra Little

Index
1. A sample case.
2. What is partitioning?
3. When is partitioning helpful?
4. What’s the fine print?
5. Revisiting our sample case.
You are here

Should this client use
partitioning?

All tables have at least one partition.
“In SQL Server, all tables and
indexes in a database are
considered partitioned, even if
they are made up of only one
partition. Essentially, partitions
form the basic unit of organization
in the physical architecture of
tables and indexes. This means
that the logical and physical
architecture of tables and indexes
comprised of multiple partitions
mirrors that of single-partition
tables and indexes.”
…Partitioned Table and Index
Concepts (msdn)

“Partitioning” actually means
“horizontal partitioning”
Horizontal partitioning takes groups of rows in a single table
and allocates them to semi-independent physical sections.
SQL Server’s horizontal partitioning is RANGE based.

Horizontal ranges are based on a
partition key.
• A single column in the table. Just one!
• Use a computed column if you must (it has to be PERSISTED), but make sure it performs well as a criterion and works for joins.
• Typically a date or integer value.
• Consider:
  • A column you will join on
  • A column you can always use as a criterion
I must choose wisely.

Ranges of data are defined by a
partition function which uses the key.
The partition function defines your boundary points and can use either RANGE LEFT or RANGE RIGHT.
• LEFT: each boundary value is the UPPER bound of the partition on its left, so the first value is the highest value in partition #1.
• RIGHT: each boundary value is the LOWER bound of the partition on its right, so the first value is the lowest value in partition #2.
Keep to the right. It’s easier.

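RIGHT based partition function for
Doll Orders keyed on OrderDate
Boundary points: 1/1/2008, 1/1/2009, 1/1/2010, 1/1/2011
• Partition 1: OrderDate < 1/1/2008
• Partition 2: 1/1/2008 <= OrderDate < 1/1/2009
• Partition 3: 1/1/2009 <= OrderDate < 1/1/2010
• Partition 4: 1/1/2010 <= OrderDate < 1/1/2011
• Partition 5: OrderDate >= 1/1/2011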
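A minimal T-SQL sketch of such a function, using the boundary dates from the slide. The function name pf_DollOrdersByYear and the datetime type are assumptions made for illustration. $PARTITION is a built-in that reports which partition a value would land in, which makes it easy to check the RIGHT semantics.

```sql
-- Hypothetical name: pf_DollOrdersByYear. Boundary dates are the ones on the slide.
-- Four boundary points define five partitions.
CREATE PARTITION FUNCTION pf_DollOrdersByYear (datetime)
AS RANGE RIGHT FOR VALUES ('20080101', '20090101', '20100101', '20110101');
GO

-- With RANGE RIGHT, a boundary value belongs to the partition on its right.
SELECT
    $PARTITION.pf_DollOrdersByYear('20071231') AS before_first_boundary, -- partition 1
    $PARTITION.pf_DollOrdersByYear('20080101') AS on_first_boundary,     -- partition 2
    $PARTITION.pf_DollOrdersByYear('20101115') AS during_2010,           -- partition 4
    $PARTITION.pf_DollOrdersByYear('20110101') AS on_last_boundary;      -- partition 5
```

Had the function been declared RANGE LEFT, 1/1/2008 itself would fall in partition 1, so midnight on New Year's Day would be filed with the previous year. That is the usual reason to prefer RIGHT for date keys.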

RIGHT based partition function
keyed on PartName (effectively LIST)
• Boundary Point 1: BODY
• Boundary Point 2: SHOE
Two boundary points define three partitions: Partition 1, Partition 2, and Partition 3.
Question: how do we get rows into Partition 1?
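One way to explore the question is to ask $PARTITION directly. This is a sketch only: the function name pf_PartName, the varchar(20) type, and the sample values ARM and HAT are assumptions, and string comparisons follow the collation in effect, so results may differ under unusual collations.

```sql
-- Hypothetical name: pf_PartName. Boundary values are the ones on the slide.
CREATE PARTITION FUNCTION pf_PartName (varchar(20))
AS RANGE RIGHT FOR VALUES ('BODY', 'SHOE');
GO

SELECT
    $PARTITION.pf_PartName('ARM')  AS arm_partition,  -- sorts before 'BODY'
    $PARTITION.pf_PartName('BODY') AS body_partition, -- on boundary 1
    $PARTITION.pf_PartName('HAT')  AS hat_partition,  -- sorts between 'BODY' and 'SHOE'
    $PARTITION.pf_PartName('SHOE') AS shoe_partition; -- on boundary 2
```

Under a typical collation, only values that sort before BODY reach Partition 1, and a value such as HAT shares a partition with BODY. The function behaves like a LIST only as long as the table holds nothing but the listed values.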

Filegroups are mapped to the
partition function using a partition
scheme.
[Diagram: the five partitions bounded by 1/1/2008, 1/1/2009, 1/1/2010, and 1/1/2011 are placed on filegroups FG_A, FG_B, FG_C, and FG_D. Partitions 1 and 2 are compressed, and one filegroup is labeled slow and read-only.]

Objects are created on the partition
scheme.
• Table (and indexes): created on the partition scheme.
• Partition scheme: maps partitions defined by the partition function to physical filegroups.
• Partition function: sets the boundary points, defines the ranges, and defines the algorithm the engine will use to know where to put rows.
Indexes can be created on the
partition scheme. Or not.
Aligned indexes
• Located on your partitioning scheme (or an identical partitioning scheme).
• Must contain the partitioning key.
• If the partitioning key is not specified, it will be added for you. Note: this affects your primary key for the table!
• Are the default: indexes are aligned unless otherwise specified at creation time.
• Perform better for aggregations and when partition elimination can be used.
Non-aligned indexes
• Physically located elsewhere: either non-partitioned or on a non-identical partitioning scheme.
• May perform better for single-record lookups.
• Allow unique indexes (because they do not have to contain the partitioning key).
• However, their presence precludes partition switching!
Switching
• Requires all indexes to be aligned.
• Is compatible with filtered indexes.
• Data may be switched in or out only within the same filegroup.
• Is a metadata-only operation requiring a schema modification lock. This can be blocked by DML operations, which hold a schema stability lock.
• Is an exceptionally fast way to load or remove a large amount of data from a table!

Creating the partition function
Our hero.

Creating filegroups
We left the Primary FG default on purpose!
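The slide's script is not part of this transcript, so here is a rough sketch of what creating filegroups can look like. The database name PartitionDemo, the logical file names, and the C:\SQLData paths are all hypothetical. Only dailyFG4 is named later in the deck; dailyFG1 through dailyFG3 are inferred from it.

```sql
-- Hypothetical database, file names, and paths. Use a test environment.
USE PartitionDemo;
GO
ALTER DATABASE PartitionDemo ADD FILEGROUP dailyFG1;
ALTER DATABASE PartitionDemo ADD FILEGROUP dailyFG2;
ALTER DATABASE PartitionDemo ADD FILEGROUP dailyFG3;
ALTER DATABASE PartitionDemo ADD FILEGROUP dailyFG4;
GO

-- Each filegroup needs at least one file before it can hold data.
ALTER DATABASE PartitionDemo ADD FILE
    (NAME = N'dailyF1', FILENAME = N'C:\SQLData\dailyF1.ndf', SIZE = 64MB)
    TO FILEGROUP dailyFG1;
ALTER DATABASE PartitionDemo ADD FILE
    (NAME = N'dailyF2', FILENAME = N'C:\SQLData\dailyF2.ndf', SIZE = 64MB)
    TO FILEGROUP dailyFG2;
ALTER DATABASE PartitionDemo ADD FILE
    (NAME = N'dailyF3', FILENAME = N'C:\SQLData\dailyF3.ndf', SIZE = 64MB)
    TO FILEGROUP dailyFG3;
ALTER DATABASE PartitionDemo ADD FILE
    (NAME = N'dailyF4', FILENAME = N'C:\SQLData\dailyF4.ndf', SIZE = 64MB)
    TO FILEGROUP dailyFG4;
GO

-- PRIMARY should still show is_default = 1: the new filegroups are not the default.
SELECT fg.name AS filegroup_name, fg.is_default, df.name AS logical_file_name
FROM sys.filegroups AS fg
LEFT JOIN sys.database_files AS df ON df.data_space_id = fg.data_space_id
ORDER BY fg.name;
```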

Creating the partition scheme
The partition scheme can map each partition to a specific filegroup, or all partitions to the PRIMARY filegroup.
Where the rubber meets the road.
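Both options can be sketched as follows. The names pf_ordersDaily, ps_ordersDaily, and ps_ordersDailyPrimary, the boundary dates, and the date type (SQL Server 2008 and later) are assumptions for illustration, and the filegroups dailyFG1 through dailyFG3 are assumed to exist already.

```sql
-- Hypothetical daily function: two boundary points define three partitions.
CREATE PARTITION FUNCTION pf_ordersDaily (date)
AS RANGE RIGHT FOR VALUES ('20101228', '20101229');
GO

-- Option 1: one filegroup per partition, listed in partition order.
CREATE PARTITION SCHEME ps_ordersDaily
AS PARTITION pf_ordersDaily TO (dailyFG1, dailyFG2, dailyFG3);
GO

-- Option 2: every partition on the PRIMARY filegroup.
CREATE PARTITION SCHEME ps_ordersDailyPrimary
AS PARTITION pf_ordersDaily ALL TO ([PRIMARY]);
GO
```

One function can back several schemes, which is why the mapping to storage lives in the scheme rather than in the function.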

Querying the filegroups mapped to the partition
function via the partition scheme
This gets a little complicated.
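The author's query is not reproduced in this transcript. A sketch of one way to write such a query against the catalog views follows; the table name dbo.ordersDaily is hypothetical, and the views and columns are standard SQL Server catalog objects.

```sql
-- One row per partition of each index (or heap) on the hypothetical table dbo.ordersDaily.
SELECT
    OBJECT_NAME(p.object_id) AS table_name,
    i.index_id,
    i.name                   AS index_name,      -- NULL for a heap
    p.partition_number,
    fg.name                  AS filegroup_name,
    prv.value                AS boundary_value,  -- lower bound for RIGHT, upper bound for LEFT
    p.rows                   AS row_count
FROM sys.partitions AS p
JOIN sys.indexes AS i
    ON i.object_id = p.object_id AND i.index_id = p.index_id
JOIN sys.partition_schemes AS ps
    ON ps.data_space_id = i.data_space_id
JOIN sys.partition_functions AS pf
    ON pf.function_id = ps.function_id
JOIN sys.destination_data_spaces AS dds
    ON dds.partition_scheme_id = ps.data_space_id
   AND dds.destination_id = p.partition_number
JOIN sys.filegroups AS fg
    ON fg.data_space_id = dds.data_space_id
LEFT JOIN sys.partition_range_values AS prv
    ON prv.function_id = pf.function_id
   AND prv.boundary_id = p.partition_number - CAST(pf.boundary_value_on_right AS int)
WHERE p.object_id = OBJECT_ID(N'dbo.ordersDaily')
ORDER BY i.index_id, p.partition_number;
```

Non-aligned indexes stored on a plain filegroup drop out of this result, because their data space is not a partition scheme.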

Creating a table on the partition
scheme and adding some rows.
A partitioned heap: you can totally do that.
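A sketch of a partitioned heap, assuming a partition scheme ps_ordersDaily built on a RANGE RIGHT date function pf_ordersDaily with boundaries 12/28/2010 and 12/29/2010. The table, column names, and sample rows are hypothetical.

```sql
-- No clustered index: the table is a heap, partitioned on OrderDate.
CREATE TABLE dbo.ordersDaily (
    OrderId   int         NOT NULL,
    OrderDate date        NOT NULL,
    OrderName varchar(50) NOT NULL
) ON ps_ordersDaily (OrderDate);
GO

-- Multi-row VALUES requires SQL Server 2008 or later.
INSERT dbo.ordersDaily (OrderId, OrderDate, OrderName)
VALUES (1, '20101227', 'before the first boundary'),
       (2, '20101228', 'on the first boundary'),
       (3, '20101229', 'on the second boundary'),
       (4, '20101229', 'same day as order 3');
GO

-- Count rows per partition using the partition function itself.
SELECT $PARTITION.pf_ordersDaily(OrderDate) AS partition_number,
       COUNT(*)                             AS row_count
FROM dbo.ordersDaily
GROUP BY $PARTITION.pf_ordersDaily(OrderDate)
ORDER BY partition_number;
```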

Let’s have a look at that heap.
We’ll use this query again, but not show it on every slide for obvious reasons.

Adding indexes
Someone’s not
in line.
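To illustrate one index that is in line and one that is not, here is a sketch assuming a partitioned table dbo.ordersDaily with columns OrderId and OrderName, where OrderId values are unique. All object names are hypothetical.

```sql
-- Aligned: no ON clause, so the index is built on the table's partition scheme
-- and the partitioning column is added to it automatically.
CREATE NONCLUSTERED INDEX nc_ordersDaily_OrderName
    ON dbo.ordersDaily (OrderName);
GO

-- Non-aligned: explicitly placed on a single filegroup. This permits a unique
-- index that does not contain the partitioning column.
CREATE UNIQUE NONCLUSTERED INDEX ncu_ordersDaily_OrderId
    ON dbo.ordersDaily (OrderId)
    ON [PRIMARY];
GO

-- Aligned indexes report PARTITION_SCHEME; the non-aligned one reports ROWS_FILEGROUP.
SELECT i.name AS index_name, ds.name AS data_space_name, ds.type_desc
FROM sys.indexes AS i
JOIN sys.data_spaces AS ds ON ds.data_space_id = i.data_space_id
WHERE i.object_id = OBJECT_ID(N'dbo.ordersDaily');
```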

Notice that aligned indexes always
have the clustering key
That’s not
usually there!

Adding another partition
We now have a full staging table and an empty partition on dailyFG4.
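A sketch of how that state might be reached. The names dailyFG4 and ordersDaily20101230 come from the slides; pf_ordersDaily, ps_ordersDaily, the column definitions, the index, and the sample rows are assumptions. It also assumes the partitioned table holds no rows dated 12/30/2010 or later, so the split moves no data.

```sql
-- Tell the scheme which filegroup the next new partition should use.
ALTER PARTITION SCHEME ps_ordersDaily NEXT USED dailyFG4;
GO
-- Add a boundary point. With RANGE RIGHT, the new partition starting at
-- 12/30/2010 is the one placed on the NEXT USED filegroup.
ALTER PARTITION FUNCTION pf_ordersDaily() SPLIT RANGE ('20101230');
GO

-- Staging table: same columns as the partitioned table, on the same filegroup as
-- the target partition, with a CHECK constraint proving the rows fit that partition.
CREATE TABLE dbo.ordersDaily20101230 (
    OrderId   int         NOT NULL,
    OrderDate date        NOT NULL,
    OrderName varchar(50) NOT NULL,
    CONSTRAINT ck_ordersDaily20101230_OrderDate CHECK (OrderDate >= '20101230')
) ON dailyFG4;
GO
-- Mirror the aligned indexes of the partitioned table (hypothetical index shown).
CREATE NONCLUSTERED INDEX nc_ordersDaily20101230_OrderName
    ON dbo.ordersDaily20101230 (OrderName)
    ON dailyFG4;
GO

INSERT dbo.ordersDaily20101230 (OrderId, OrderDate, OrderName)
VALUES (5, '20101230', 'staged order A'),
       (6, '20101230', 'staged order B');
GO
```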

Switching in!
Don’t forget to drop ordersDaily20101230: your staging table is still there, it’s just empty now.
And you’re gonna have to rebuild that non-aligned NC if you want it back.
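The switch itself might look like the sketch below. It assumes a staging table dbo.ordersDaily20101230 that matches the partitioned table dbo.ordersDaily and sits on the filegroup of target partition 4, plus a non-aligned unique index named ncu_ordersDaily_OrderId. Everything except the staging table name is hypothetical, and this sketch drops and recreates the non-aligned index, which may differ from what the original demo did.

```sql
-- The non-aligned index has to go first; it would block the switch.
DROP INDEX ncu_ordersDaily_OrderId ON dbo.ordersDaily;
GO

-- Check which partition the staged date belongs to before switching.
SELECT $PARTITION.pf_ordersDaily('20101230') AS target_partition;
GO

-- Metadata-only: the staging table's rows become partition 4 of dbo.ordersDaily.
ALTER TABLE dbo.ordersDaily20101230
    SWITCH TO dbo.ordersDaily PARTITION 4;
GO

-- The staging table still exists but is now empty.
SELECT COUNT(*) AS staging_rows FROM dbo.ordersDaily20101230;
DROP TABLE dbo.ordersDaily20101230;
GO

-- Recreate the non-aligned index if it is still wanted.
CREATE UNIQUE NONCLUSTERED INDEX ncu_ordersDaily_OrderId
    ON dbo.ordersDaily (OrderId)
    ON [PRIMARY];
GO
```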

Is maintenance a significant problem
for availability?
YES
• Partitioning may be what you are looking for.
• Keep checking other factors.
NO
• You may have other reasons to partition, but one of its big benefits is to help with this.
Maintenance includes index rebuilds, loading data, and deleting data.

Are query patterns defined by
regions?
YES
• Finding regions of data which are queried together and have a good partitioning key is important to good query performance.
• This is the basis of partition elimination.
NO
• You may not have a good partitioning key.
• Keep looking at the query patterns for your workload and evaluating different partitioning keys.
Data regions may be dates, integers, or codes.

Can applications and queries be
optimized for partitioning?
YES
• This means you will be able to rewrite some queries and procedures as needed to take advantage of partition elimination.
NO
• If you do not have the ability to tune user and application queries, some will likely perform very poorly.
Some assembly required.

Do you have resources to support
the partitioned system?
• Can your disk configuration be optimized?
• Is enough buffer pool available for what
will need to be read into memory
concurrently?
• Will you be able to tune and configure
parallelism appropriately for the workload?
• Do you have a system you can test with a
production-like workload, or a suitable
rollback plan?

Editions with partitioning
Enterprise, Datacenter, Developer, Evaluation

Support for HOW MANY partitions?
• 15,000 partitions are available in SQL Server 2008 with SP2 applied.
• Otherwise, SQL Server 2005, 2008, and 2008 R2 (for now) are limited to 1,000 partitions. With daily partitioning, that covers less than 3 years (1,000 / 365 ≈ 2.7).
What problems could happen with lots of partitions?

Parallelism
• In 2005, a query touching more than one partition typically had only one thread per partition.
• In 2008, the Partitioned Table Parallelism improvement allows multiple threads to be used on each partition for parallel plans.
[Diagram: threads calling out the partition they are working on, two each for Partition 1, Partition 2, and Partition 3.]

Lock escalation AUTO
• Lock escalation can be set to AUTO for a table. If the table is partitioned, locks will escalate to the partition level rather than the table level.
• What’s awesome: greater concurrency!
Partition level deadlocks are not awesome. Test your workload (like with any feature).
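The setting is a table option in SQL Server 2008 and later. A short sketch, with the table name dbo.ordersDaily as a hypothetical stand-in:

```sql
-- Allow lock escalation to stop at the partition level for this table.
ALTER TABLE dbo.ordersDaily SET (LOCK_ESCALATION = AUTO);
GO

-- Confirm the current setting (TABLE is the default; AUTO and DISABLE are the alternatives).
SELECT name, lock_escalation_desc
FROM sys.tables
WHERE object_id = OBJECT_ID(N'dbo.ordersDaily');
```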

Partition aware seeks
• In SQL 2008, the optimizer has been made more clever and has a greater chance of achieving partition elimination. This has been done by:
  • Changing the internal representation of a partitioned table to be more optimized for seeking on the PartitionID (even when the table’s clustered index (CX) is on another column).
  • Adding a “skip scan” operation to allow the optimizer greater flexibility.
More optimized optimizin’.

Be careful with your statistics
• Statistics are not maintained per partition; they are maintained for the entire index or column. Since the histogram is limited to 200 steps, the statistics can describe a very large table poorly, and on very large tables they may take a long time to update.
• Filtered statistics can be used to help with this in 2008: you can create new filtered statistics for your new partition.
This sounds like work.
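A sketch of filtered statistics scoped to the newest partition. The table dbo.ordersDaily, its columns, the statistics name, and the boundary date are hypothetical; the filter simply repeats the range of the partition the statistics should describe.

```sql
-- Statistics on OrderName, built only from rows in the newest partition's range.
CREATE STATISTICS st_ordersDaily_OrderName_20101230
    ON dbo.ordersDaily (OrderName)
    WHERE OrderDate >= '20101230'
    WITH FULLSCAN;
GO

-- Filtered statistics are flagged in the catalog.
SELECT name, has_filter, filter_definition
FROM sys.stats
WHERE object_id = OBJECT_ID(N'dbo.ordersDaily');
```

Creating these would need to become part of whatever process adds each new partition, which is presumably the work the slide is grumbling about.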

Index rebuilds and compression
• Individual partitions cannot be rebuilt online.
• The entirety of a partitioned index can be rebuilt online.
• Individual partitions can be compressed.
For fact tables with archive data, older partitions can be rebuilt once with compression. Their filegroups can then be made read-only.
I’d better check my maintenance jobs.
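A sketch of that archive pattern, assuming a partitioned heap dbo.ordersDaily with an aligned index nc_ordersDaily_OrderName, whose oldest partition (number 1) lives on filegroup dailyFG1 in a database named PartitionDemo. All of these names are hypothetical.

```sql
-- Rebuild only the oldest partition of the heap with page compression (offline).
ALTER TABLE dbo.ordersDaily
    REBUILD PARTITION = 1 WITH (DATA_COMPRESSION = PAGE);
GO
-- Do the same for each aligned index.
ALTER INDEX nc_ordersDaily_OrderName ON dbo.ordersDaily
    REBUILD PARTITION = 1 WITH (DATA_COMPRESSION = PAGE);
GO

-- Check compression per partition.
SELECT index_id, partition_number, data_compression_desc
FROM sys.partitions
WHERE object_id = OBJECT_ID(N'dbo.ordersDaily')
ORDER BY index_id, partition_number;
GO

-- Once the old data no longer changes, mark its filegroup read-only.
-- This step needs exclusive access to the database.
ALTER DATABASE PartitionDemo MODIFY FILEGROUP dailyFG1 READ_ONLY;
GO
```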

Switching Feature Compatibility
• Works with replication in 2008 and later.
  • Some subscribers can have the partitioning scheme; others don’t have to.
  • This means you can have some subscribers on Standard.
• Works with Change Data Capture (with some special steps).
• Does not work with Change Tracking.
@SQLFool replicates her partitioned tables; check out her blog.

Index
1. A sample case.
2. What is partitioning?
3. When is partitioning helpful?
4. What’s the fine print?
5. Revisiting our sample case. You are here

So, should this client use
partitioning?

Resources/ Contact
There is a very large amount of documentation online
for horizontal table partitioning. Get my
recommendations here:
http://littlekendra.com/resources/partition/
This presentation would not have been possible
without whitepapers and blogs by Kimberly Tripp,
Michelle Ufford, and Ron Talmage.
• Twitter: @kendra_little
• Email: littlekendra@gmail.com
• LinkedIn: http://www.linkedin.com/in/kendralittle
