This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License
Kendra Little
Introduction to SQL Server Partitioning
About Kendra
Index
1. A sample case.
2. What is partitioning?
3. When is partitioning helpful?
4. What’s the fine print?
5. Revisiting our sample case.
You are here
Should this client use partitioning?
2. What is partitioning?
All tables have at least one partition.
“In SQL Server, all tables and indexes in a database are considered partitioned, even if they are made up of only one partition. Essentially, partitions form the basic unit of organization in the physical architecture of tables and indexes. This means that the logical and physical architecture of tables and indexes comprised of multiple partitions mirrors that of single-partition tables and indexes.”
Partitioned Table and Index Concepts (MSDN)
“Partitioning” actually means “horizontal partitioning”
Horizontal partitioning takes groups of rows in a single table and allocates them in semi-independent physical sections.
SQL Server’s horizontal partitioning is RANGE based.
Horizontal ranges are based on a partition key.
• A single column in the table.
• Just one!
• Use a computed column if you must, but make sure it performs well as a criterion and works for joins.
• Typically a date or integer value.
• Consider:
  • A column you will join on
  • A column you can always use as a criterion
I must choose wisely.
Ranges of data are defined by a partition function which uses the key.
The partition function defines your boundary points and can use either RANGE LEFT or RANGE RIGHT.
• LEFT: the first value is an UPPER boundary point in partition #1
• RIGHT: the first value is a LOWER boundary point in partition #2
Keep to the right. It’s easier.
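RIGHT based partition function for Doll Orders keyed on OrderDate
Boundary points: 1/1/2008, 1/1/2009, 1/1/2010, 1/1/2011
• Partition 1: OrderDate before 1/1/2008
• Partition 2: 1/1/2008 up to, but not including, 1/1/2009
• Partition 3: 1/1/2009 up to, but not including, 1/1/2010
• Partition 4: 1/1/2010 up to, but not including, 1/1/2011
• Partition 5: 1/1/2011 and later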
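The boundary rule for the Doll Orders example can be sketched outside the database. The following is a minimal Python model, not SQL Server code: the function name `partition_number` and its `range_right` parameter are made up for illustration. It returns the same kind of 1-based partition number that SQL Server's `$PARTITION` function reports for a value.

```python
from bisect import bisect_left, bisect_right
from datetime import date

# Boundary points from the Doll Orders example, sorted ascending.
BOUNDARIES = [date(2008, 1, 1), date(2009, 1, 1), date(2010, 1, 1), date(2011, 1, 1)]

def partition_number(value, boundaries, range_right=True):
    """Return the 1-based partition number for value (illustrative model).

    RANGE RIGHT: a boundary value belongs to the partition on its right,
    so it is the lowest value in that partition.
    RANGE LEFT: a boundary value belongs to the partition on its left,
    so it is the highest value in that partition.
    """
    if range_right:
        return bisect_right(boundaries, value) + 1
    return bisect_left(boundaries, value) + 1

print(partition_number(date(2007, 6, 15), BOUNDARIES))                    # 1
print(partition_number(date(2008, 1, 1), BOUNDARIES))                     # 2
print(partition_number(date(2008, 1, 1), BOUNDARIES, range_right=False))  # 1
print(partition_number(date(2011, 3, 1), BOUNDARIES))                     # 5
```

The only difference between the two modes is where a row that lands exactly on a boundary point goes. With date boundaries on the first of the year, RANGE RIGHT keeps each whole year in one partition, which is one reason it tends to be easier to reason about.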
RIGHT based partition function keyed on PartName (effectively LIST)
• Boundary Point 1: BODY
• Boundary Point 2: SHOE
• Partitions: 1, 2, 3
Question: how do we get rows into Partition 1?
Filegroups are mapped to the partition function using a partition scheme.
[Diagram: the boundary points 1/1/2008, 1/1/2009, 1/1/2010, and 1/1/2011 define Partitions 1 through 5. Partitions 1 and 2 are compressed. The partitions are placed on filegroups FG_A, FG_B, FG_C, and FG_D, with slow, read-only storage used for older data.]
Objects are created on the partition scheme.
Table (and indexes)
• Created on the partition scheme.
Partition Scheme
• Maps partitions defined by the partition function to physical filegroups.
Partition Function
• Boundary points
• Defines ranges
• Defines an algorithm the engine will use to know where to put rows
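The three layers can be modeled in a few lines of Python. This is a sketch under stated assumptions rather than anything SQL Server exposes: the filegroup assignment in `SCHEME` is hypothetical (the slides name FG_A through FG_D but do not spell out which partition goes where), and `partition_for`, `filegroup_for`, and `insert` are illustrative names.

```python
from bisect import bisect_right
from datetime import date

# Partition function: boundary points, RANGE RIGHT (values from the slides).
BOUNDARIES = [date(2008, 1, 1), date(2009, 1, 1), date(2010, 1, 1), date(2011, 1, 1)]

# Partition scheme: one filegroup per partition, in partition order.
# ASSUMPTION: this particular assignment is made up for illustration.
SCHEME = ["FG_A", "FG_A", "FG_B", "FG_C", "FG_D"]

def partition_for(order_date):
    """Partition function: which 1-based partition holds this key value?"""
    return bisect_right(BOUNDARIES, order_date) + 1

def filegroup_for(order_date):
    """Partition scheme: which filegroup stores that partition?"""
    return SCHEME[partition_for(order_date) - 1]

# A table "created on the scheme": rows are stored per partition.
table = {number: [] for number in range(1, len(SCHEME) + 1)}

def insert(row):
    table[partition_for(row["OrderDate"])].append(row)

insert({"OrderId": 1, "OrderDate": date(2007, 12, 31)})
insert({"OrderId": 2, "OrderDate": date(2010, 7, 4)})

print(filegroup_for(date(2007, 12, 31)))  # FG_A
print(filegroup_for(date(2010, 7, 4)))    # FG_C
```

A function with four boundary points always yields five partitions, so the scheme needs five filegroup entries, even when several of them name the same filegroup.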
Indexes can be created on the partition scheme. Or not.
Aligned indexes
• Located on your partitioning scheme (or an identical partitioning scheme).
• Must contain the partitioning key.
• If the partitioning key is not specified, it will be added for you. Note: this affects your primary key for the table!
• Indexes are aligned by default unless otherwise specified at creation time.
• Perform better for aggregations and when partition elimination can be used.
Non-aligned indexes
• Physically located elsewhere: either non-partitioned or on a non-identical partitioning scheme.
• May perform better with single-record lookups.
• Allow unique indexes (because they do not have to contain the partitioning key).
• However, the presence of these precludes partition switching!
Switching
• Requires all indexes to be aligned.
• Compatible with filtered indexes.
• Data may be switched in or out only within the same filegroup.
• Is a metadata-only operation requiring a schema modification lock. This can be blocked by DML operations, which require a schema stability lock.
• Is an exceptionally fast way to load or remove a large amount of data from a table!
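Why a metadata-only switch is fast can be illustrated with a toy model. The Python sketch below is an analogy, not how the storage engine is implemented: `PartitionedTable` and `switch_in` are hypothetical names, the filegroup names other than dailyFG4 are made up, and only two preconditions are modeled (matching filegroup and an empty target partition). The aligned-index requirement is left out.

```python
class PartitionedTable:
    """Toy model: each partition is a separate list of rows."""

    def __init__(self, filegroups):
        self.filegroups = list(filegroups)           # filegroup of each partition
        self.partitions = [[] for _ in filegroups]   # rows of each partition

def switch_in(staging_rows, staging_filegroup, target, partition_number):
    """Model switching a staging table into an empty partition.

    No row is copied: the target partition takes over the existing list
    object. Returns a new empty list to stand for the emptied staging table.
    """
    index = partition_number - 1
    if target.partitions[index]:
        raise ValueError("target partition must be empty")
    if target.filegroups[index] != staging_filegroup:
        raise ValueError("staging table and partition must share a filegroup")
    target.partitions[index] = staging_rows
    return []

orders = PartitionedTable(["dailyFG1", "dailyFG2", "dailyFG3", "dailyFG4"])
staging = [{"OrderId": n} for n in range(100_000)]
loaded = staging  # keep a second reference so identity can be checked below

staging = switch_in(staging, "dailyFG4", orders, 4)
print(orders.partitions[3] is loaded)  # True: same object, nothing was copied
print(len(staging))                    # 0: the staging table is empty but still exists
```

The cost of the switch in this model does not depend on the number of rows, which is the property that makes switching attractive for loading or removing large amounts of data.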
Creating the partition function
Our hero.
Creating filegroups
We left the Primary FG default on purpose!
Creating the partition scheme
The partition scheme can map each partition to a specific filegroup, or all partitions to a single filegroup such as PRIMARY.
Where the rubber meets the road.
Querying the filegroups mapped to the partition function via the partition scheme
This gets a little complicated.
Creating a table on the partition scheme and adding some rows
A partitioned heap: you can totally do that.
Let’s have a look at that heap.
We’ll use this query again, but not show it on every slide for obvious reasons.
Adding indexes
Someone’s not in line.
Notice that aligned indexes always have the clustering key
That’s not usually there!
Adding another partition
We now have a full staging table and an empty partition on dailyFG4.
Switching in!
Don’t forget to drop ordersDaily20101230: your staging table is still there, it’s just empty now.
And you’re gonna have to rebuild that non-aligned NC if you want it back.
3. When is partitioning helpful?
Is maintenance a significant problem for availability?
YES
• Partitioning may be what you are looking for.
• Keep checking other factors.
NO
• You may have other reasons to partition, but one of its big benefits is to help with this.
Maintenance includes index rebuilds, loading data, and deleting data.
Are query patterns defined by regions?
YES
• Finding regions of data which are queried together and have a good partitioning key is important to good query performance.
• This is the basis of partition elimination.
NO
• You may not have a good partitioning key.
• Keep looking at the query patterns for your workload and evaluating different partitioning keys.
Data regions may be dates, integers, or codes.
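Partition elimination can be pictured as follows: when a query filters on the partitioning key, only the partitions whose ranges overlap the filter need to be read. The Python sketch below is a simplified model of that idea, reusing the Doll Orders boundary points with RANGE RIGHT; `partitions_touched` is an illustrative name, not an engine function.

```python
from bisect import bisect_right
from datetime import date

# Boundary points from the Doll Orders example (RANGE RIGHT).
BOUNDARIES = [date(2008, 1, 1), date(2009, 1, 1), date(2010, 1, 1), date(2011, 1, 1)]

def partitions_touched(low, high, boundaries):
    """1-based partitions that can hold key values in the closed range [low, high]."""
    first = bisect_right(boundaries, low) + 1
    last = bisect_right(boundaries, high) + 1
    return list(range(first, last + 1))

# A filter on one month of 2010 needs a single partition.
print(partitions_touched(date(2010, 3, 1), date(2010, 3, 31), BOUNDARIES))   # [4]

# A filter that crosses a boundary point needs two.
print(partitions_touched(date(2009, 12, 1), date(2010, 1, 31), BOUNDARIES))  # [3, 4]
```

A query with no filter on OrderDate gives the engine nothing to eliminate, so all five partitions are candidates. That is why a key that appears in most query criteria matters so much.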
Can applications and queries be optimized for partitioning?
YES
• This means you will be able to rewrite some queries and procedures as needed to take advantage of partition elimination.
NO
• If you do not have the ability to tune user and application queries, some will likely perform very poorly.
Some assembly required.
Do you have resources to support the partitioned system?
• Can your disk configuration be optimized?
• Is enough buffer pool available for what will need to be read into memory concurrently?
• Will you be able to tune and configure parallelism appropriately for the workload?
• Do you have a system you can test with a production-like workload, or a suitable rollback plan?
4. What’s the fine print?
Editions with partitioning
Enterprise, Datacenter, Developer, Evaluation
Support for HOW MANY partitions?
• 15,000 partitions are available in SQL Server 2008 with SP2 applied.
• SQL Server 2005, 2008 (before SP2), and 2008 R2 (for now) are limited to 1,000 partitions. With daily partitioning, that is less than 3 years of data.
What problems could happen with lots of partitions?
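The "less than 3 years" figure is simple arithmetic, shown here as a short Python sketch. It assumes one partition per calendar day and nothing else; `years_of_daily_partitions` is an illustrative helper.

```python
DAYS_PER_YEAR = 365.25

def years_of_daily_partitions(partition_limit):
    """Years of history that fit if every day gets its own partition."""
    return partition_limit / DAYS_PER_YEAR

for limit in (1_000, 15_000):
    years = years_of_daily_partitions(limit)
    print(f"{limit} partitions cover about {years:.1f} years of daily data")
```

With the 1,000-partition limit this works out to roughly 2.7 years; with 15,000 partitions it is roughly 41 years.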
Parallelism
• In SQL Server 2005, a query touching more than one partition typically had only one thread per partition.
• In SQL Server 2008, the Partitioned Table Parallelism improvement allows multiple threads to be used on each partition for parallel plans.
Lock escalation AUTO
• Lock escalation can be set to AUTO for a table. If the table is partitioned, locks will escalate to the partition level rather than the table level.
• What’s awesome: greater concurrency!
Partition level deadlocks are not awesome. Test your workload (like with any feature).
Partition aware seeks
• In SQL Server 2008, the optimizer has been made more clever and has a greater chance of achieving partition elimination. This has been done by:
  • Changing the internal representation of a partitioned table to be more optimized for seeking on the PartitionID (even when the table’s CX is on another column)
  • Adding a “skip scan” operation to allow the optimizer greater flexibility
More optimized optimizin.
Be careful with your statistics
• Statistics are not maintained per partition; they are maintained for the entire index or column. Since there is a limit to the number of steps in the histogram, the statistics can become invalid, and on very large tables they may take a long time to update.
• Filtered statistics can be used to help with this in SQL Server 2008: you can create new filtered statistics for your new partition.
This sounds like work.
Index rebuilds and compression
• Individual partitions cannot be rebuilt online.
• The entirety of a partitioned index can be rebuilt online.
• Individual partitions can be compressed.
For fact tables with archive data, older partitions can be rebuilt once with compression. Their filegroups can then be made read-only.
I’d better check my maintenance jobs.
Switching Feature Compatibility
• Works with replication in SQL Server 2008 and later
  • Some subscribers can have the partitioning scheme, others don’t have to
  • This means you can have some subscribers on Standard.
• Works with Change Data Capture (with some special steps)
• Does not work with Change Tracking
@SQLFool replicates her partitioned tables; check out her blog.
5. Revisiting our sample case.
So, should this client use partitioning?
Resources / Contact
There is a very large amount of documentation online for horizontal table partitioning. Get my recommendations here: http://littlekendra.com/resources/partition/
This presentation would not have been possible without whitepapers and blogs by Kimberly Tripp, Michelle Ufford, and Ron Talmage.
• Twitter: @kendra_little
• Email: littlekendra@gmail.com
• LinkedIn: http://www.linkedin.com/in/kendralittle

Introduction to SQL Server Partitioning

  • 1. This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License Kendra Little Introduction to SQL Server Partitioning
  • 3. Index 1. A sample case. 2. What is partitioning? 3. When is partitioning helpful? 4. What’s the fine print? 5. Revisiting our sample case. You are here
  • 4. Should this client use partitioning?
  • 5. Index 1. A sample case. 2. What is partitioning? 3. When is partitioning helpful? 4. What’s the fine print? 5. Revisiting our sample case. You are here
  • 6. All tables have at least one partition. “In SQL Server, all tables and indexes in a database are considered partitioned, even if they are made up of only one partition. Essentially, partitions form the basic unit of organization in the physical architecture of tables and indexes. This means that the logical and physical architecture of tables and indexes comprised of multiple partitions mirrors that of single-partition tables and indexes.” …Partitioned Table and Index Concepts (msdn) OnePartition
  • 7. “Partitioning” actually means “horizontal partitioning” Horizontal partitioning takes groups of rows in a single table and allocates them in semi- independent physical sections. SQL Server’s horizontal partitioning is RANGE based.
  • 8. Horizontal ranges are based on a partition key.  A single column in the table.  Just one!  Use a computed column if you must, but make sure it performs well as a criterion and works for joins.  Typically a date or integer value  Consider:  A column you will join on  A column you can always use as a criterion I must choose wisely.
  • 9. Ranges of data are defined by a partition function which uses the key. The partition function defines your boundary points and can use either RANGE LEFT or RIGHT.  LEFT: the first value is an UPPER boundary point in partition #1  RIGHT: the first value is a LOWER boundary point in partition #2 Keep to the right. It’s easier.
  • 10. RIGHT based partition function for Doll Orders keyed on OrderDate 1/1/2008 1/1/2009 1/1/2010 1/1/2011 Partition 1 Partition 2 Partition 3 Partition 4 Partition 5
  • 11. RIGHT based partition function keyed on PartName (effectively LIST) Boundary Point 1: BODY Boundary Point 2: SHOE Partition 1 Partition 2 Partition 3 Question: how do we get rows into Partition 1?
  • 12. Filegroups are mapped to the partition function using a partition scheme. 1/1/2008 1/1/2009 1/1/2010 1/1/2011 Partition 1: Compressed Partition 2: Compressed Partition 3 Partition 4 Partition 5 Slow, Read-only FG_A FG_B FG_C FG_D
  • 13. Objects are created on the partition scheme. Table (and indexes) • Created on partition scheme. Partition Scheme • Maps partitions defined by the partition function to physical filegroups Partition Function • Boundary points • Defines ranges • Define an algorithm the engine will use to know where to put rows
  • 14. Indexes can be created on the partition scheme. Or not. •Located on your partitioning scheme (or an identical partitioning scheme) •Must contain the partitioning key. •If the partitioning key is not specified, it will be added for you. Note: this affects your primary key for the table! •Indexes are aligned by default unless it is otherwise specified at creation time. •Perform better for aggregations and when partition elimination can be used. Aligned Indexes •Physically located elsewhere- either non partitioned or on a non-identical partitioning scheme •May perform better with single-record lookup •Allow unique indexes (because they do not have to contain the partitioning key) •However, the presence of these preclude partition-switching! Non- aligned indexes
  • 15. Switching  Requires all indexes to be aligned.  Compatible with filtered indexes  Data may be switched in or out only within the same filegroup.  Is a metadata-only operation requiring a schema modification lock. This can be blocked by DML operations, which require a schema stability lock.  Is an exceptionally fast way to load or remove a large amount of data from a table!
  • 16. Creating the partition function Our hero.
  • 17. Creating filegroups We left the Primary FG default on purpose!
  • 18. Creating the partition scheme The partition scheme can map each partition to a specific filegroup, or all partitions to the PRIMARY filegroup. Where the rubber meets the road.
  • 19. Query FGs mapped to the partition function via the partition scheme This gets a little complicated.
  • 20. Creating a table on the partition scheme and add some rows. A partitioned heap: you can totally do that.
  • 21. Let’s have a look at that heap. We’ll use this query again, but not show it on every slide for obvious reasons.
  • 23. Notice that aligned indexes always have the clustering key That’s not usually there!
  • 24. Adding another partition We now have a full staging table and empty partition on dailyFG4
  • 25. Switching in! Don’t forget to drop ordersDaily20101230: your staging table is still there, it’s just empty now. And you’re gonna have to rebuild that non-aligned NC if you want it back.
  • 26. Index 1. A sample case. 2. What is partitioning? 3. When is partitioning helpful? 4. What’s the fine print? 5. Revisiting our sample case. You are here
  • 27. Is maintenance a significant problem for availability? YES • Partitioning may be what you are looking for. • Keep checking other factors. NO • You may have other reasons to partition, but one of its big benefits is to help with this. Maintenance includes index rebuilds, loading data, and deleting data.
  • 28. Are query patterns defined by regions? YES • Finding regions of data which are queried together and have a good partitioning key is important to good query performance. • This is the basis of partition elimination. NO • You may not have a good partitioning key. • Keep looking at the query patterns for your workload and evaluating different partitioning keys. Data regions may be dates, integers, codes
  • 29. Can applications and queries be optimized for partitioning? YES • This means you will be able to rewrite some queries and procedures as needed to take advantage of partition elimination. NO • If you do not have the ability to tune user and application queries, some will likely perform very poorly. Some assembly required.
  • 30. Do you have resources to support the partitioned system? • Can your disk configuration be optimized? • Is enough buffer pool available for what will need to be read into memory concurrently? • Will you be able to tune and configure parallelism appropriately for the workload? • Do you have a system you can test with a production-like workload, or a suitable rollback plan?
  • 31. Index 1. A sample case. 2. What is partitioning? 3. When is partitioning helpful? 4. What’s the fine print? 5. Revisiting our sample case. You are here: 4. What’s the fine print?
  • 32. Editions with partitioning: Enterprise, Datacenter, Developer, Evaluation.
  • 33. Support for HOW MANY partitions? • 15,000 partitions are available in SQL Server 2008 with SP2 applied. • SQL Server 2005, SQL Server 2008 before SP2, and SQL Server 2008 R2 (for now) are limited to 1,000 partitions. With daily partitioning, that is less than 3 years of data. What problems could happen with lots of partitions?
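The "less than 3 years" figure is simple arithmetic, worked here in Python as a quick check. The function name is made up for the example, and it assumes one partition per calendar day with no spare partitions held in reserve.

```python
DAYS_PER_YEAR = 365.25  # average, including leap years


def years_of_daily_partitions(partition_limit: int) -> float:
    """Years of history that fit if every day gets its own partition."""
    return partition_limit / DAYS_PER_YEAR


for limit in (1_000, 15_000):
    print(f"{limit} partitions: {years_of_daily_partitions(limit):.1f} years")
```

With the 1,000-partition limit the result is about 2.7 years; with 15,000 partitions it is roughly 41 years.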
  • 34. Parallelism • In 2005, a query touching more than one partition typically had only one thread per partition. • In 2008, the Partitioned Table Parallelism improvement allows multiple threads to be used on each partition for parallel plans. [Figure: two threads each calling out “Partition 1!”, “Partition 2!”, and “Partition 3!”]
  • 35. Lock escalation AUTO • Lock escalation can be set to AUTO for a table. If the table is partitioned, locks will escalate to the partition level rather than the table level. • What’s awesome: greater concurrency! Partition-level deadlocks are not awesome. Test your workload (like with any feature).
  • 36. Partition aware seeks • In SQL 2008, the optimizer has been made more clever and has a greater chance of achieving partition elimination. This has been done by: • Changing the internal representation of a partitioned table to be more optimized for seeking on the PartitionID (even when the table’s clustered index (CX) is on another column). • Adding a “skip scan” operation to allow the optimizer greater flexibility. More optimized optimizin’.
  • 37. Be careful with your statistics • Statistics are not maintained per partition; they are maintained for the entire index or column. Because the histogram has a limited number of steps, the statistics can become inaccurate, and on very large tables they may take a long time to update. • Filtered statistics can be used to help with this in 2008: you can create new filtered statistics for your new partition. This sounds like work.
  • 38. Index rebuilds and compression • Individual partitions cannot be rebuilt online. • The entirety of a partitioned index can be rebuilt online. • Individual partitions can be compressed. For fact tables with archive data, older partitions can be rebuilt once with compression. Their filegroups can then be made read-only. I’d better check my maintenance jobs.
  • 39. Switching Feature Compatibility • Works with replication in 2008 and later. Some subscribers can have the partitioning scheme; others don’t have to. This means you can have some subscribers on Standard. • Works with Change Data Capture (with some special steps). • Does not work with Change Tracking. @SQLFool replicates her partitioned tables; check out her blog.
  • 40. Index 1. A sample case. 2. What is partitioning? 3. When is partitioning helpful? 4. What’s the fine print? 5. Revisiting our sample case. You are here: 5. Revisiting our sample case.
  • 41. So, should this client use partitioning?
  • 42. This work is by Kendra Little and is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. Resources / Contact: There is a very large amount of documentation online for horizontal table partitioning. Get my recommendations here: http://littlekendra.com/resources/partition/ This presentation would not have been possible without whitepapers and blogs by Kimberly Tripp, Michelle Ufford, and Ron Talmage. • Twitter: @kendra_little • Email: littlekendra@gmail.com • LinkedIn: http://www.linkedin.com/in/kendralittle

Editor's Notes

  1. http://www.flickr.com/photos/lara604/3163793777/sizes/l/
  2. 10 years working with SQL Server. Started with T-SQL queries for reporting and analytics, then lab administration, then database administration. Some systems engineering expertise, but mostly specialize in the SQL Server layer and above. Major interests: performance tuning and reporting. Minor interests: process, communication patterns.
  3. http://www.flickr.com/photos/lara604/3163790771/sizes/l/ Sample client has a reporting system containing both fact and dimension tables. Fact tables are up to 300GB in size (including all indexes) with up to 1 billion rows. Dimension tables are up to 200GB in size (including all indexes) with up to 200 million rows. A middle tier application has been designed to dynamically create and execute queries to run reports custom-designed by clients.
  4. As of SQL 2005, everything is partitioned! “The key to determining whether a table is partitioned is the table (or index) data_space_id in the sys.indexes catalog view, and whether it has an associated partition scheme in the sys.data_spaces catalog view. All tables that are placed on a partition scheme will have 'PS ' (for partition scheme) as the type for their data_space_id in sys.data_spaces.” from Ron Talmage: “Partitioned Table and Index Strategies Using SQL Server 2008” http://msdn.microsoft.com/en-us/library/dd578580.aspx
  5. Horizontal partitioning takes groups of rows in a single table and allocates them in semi-independent physical sections. Forms of partitioning in other products include: • List partitioning: an explicit list of key values is specified for each partition (Postgres, MySQL, Oracle). Note: this can be effectively done in SQL Server with RANGE. • Hash partitioning: a function is defined with an expression that evaluates values in rows to be inserted in the table (MySQL, Oracle). • Interval partitioning: similar to range, but new partitions are automatically created (Oracle). • Composite partitioning: combinations of the above (Oracle).
  6. From “Partitioned Tables and Indexes in SQL Server 2005” (http://msdn.microsoft.com/en-us/library/ms345146(SQL.90).aspx): Note: Using the datetime data type does add a bit of complexity here, but you need to make sure you set up the correct boundary cases. Notice the simplicity with RIGHT because the default time is 12:00:00.000 A.M. For LEFT, the added complexity is due to the precision of the datetime data type. The reason that 23:59:59.997 MUST be chosen is that datetime data does not guarantee precision to the millisecond. Instead, datetime data is precise within 3.33 milliseconds. In the case of 23:59:59.999, this exact time tick is not available and instead the value is rounded to the nearest time tick that is 12:00:00.000 A.M. of the following day. With this rounding, the boundaries will not be defined properly. For datetime data, you must use caution with specifically supplied millisecond values. Note: Partitioning functions also allow functions as part of the partition function definition. You may use DATEADD(ms,-3,'20010101') instead of explicitly defining the time using '20001231 23:59:59.997'.
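The rounding described in this note can be reproduced with a small Python model. It assumes that datetime values are stored in ticks of 1/300 of a second, which is what "precise within 3.33 milliseconds" implies; the function name `round_to_datetime_ms` is made up for the sketch, and this is an illustration of the arithmetic rather than SQL Server's own code.

```python
from datetime import datetime, timedelta


def round_to_datetime_ms(ms: int) -> int:
    """Round a millisecond value (0-999) to the nearest 1/300-second tick.

    Returns the tick expressed in whole milliseconds. A result of 1000
    means the value rolled over into the next second.
    """
    ticks = (ms * 3 + 5) // 10      # nearest tick, halves rounded up
    return round(ticks * 10 / 3)    # tick converted back to milliseconds


print(round_to_datetime_ms(997))  # stays on the last tick of the second
print(round_to_datetime_ms(999))  # rolls over to the next second

# Python equivalent of DATEADD(ms, -3, '20010101'): the last available
# tick before midnight, usable as an upper boundary with RANGE LEFT.
last_tick = datetime(2001, 1, 1) - timedelta(milliseconds=3)
print(last_tick)
```

A supplied boundary of 23:59:59.999 therefore lands on midnight of the following day, which is why the note insists on .997 for RANGE LEFT boundaries.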
  7. Each doll represents 100 million rows, each recording an order
  8. http://www.flickr.com/photos/lara604/3163790401/sizes/l/in/photostream/ From “Partitioned Tables and Indexes in SQL Server 2005” (http://msdn.microsoft.com/en-us/library/ms345146(SQL.90).aspx): the same note on datetime boundary values quoted in note 6 applies here.
  9. http://msdn.microsoft.com/en-us/library/ms191160.aspx
  10. Query largely from Ron Talmage: “Partitioned Table and Index Strategies Using SQL Server 2008” http://msdn.microsoft.com/en-us/library/dd578580.aspx
  11. Note that the partition scheme isn’t specified– it defaults
  12. http://msdn.microsoft.com/en-us/library/ms177411.aspx “If you frequently run queries that involve an equi-join between two or more partitioned tables, their partitioning columns should be the same as the columns on which the tables are joined. Additionally, the tables, or their indexes, should be collocated. This means that they either use the same named partition function, or they use different ones that are essentially the same, in that they: • Have the same number of parameters that are used for partitioning, and the corresponding parameters are the same data types. • Define the same number of partitions. • Define the same boundary values for partitions. In this way, the SQL Server query optimizer can process the join faster, because the partitions themselves can be joined. If a query joins two tables that are not collocated or are not partitioned on the join field, the presence of partitions may actually slow down query processing instead of accelerate it.”
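The three "essentially the same" conditions in this quote can be written down as a small check. This is a Python sketch with hypothetical names (`PartitionFunction`, `essentially_the_same`, and the example function names); it encodes only the criteria quoted above and does not query any catalog view.

```python
from dataclasses import dataclass
from datetime import date
from typing import Tuple


@dataclass(frozen=True)
class PartitionFunction:
    """Hypothetical description of a partition function."""
    name: str
    parameter_types: Tuple[str, ...]
    boundaries: tuple  # boundary values, sorted ascending

    @property
    def partition_count(self) -> int:
        # N boundary values always define N + 1 partitions.
        return len(self.boundaries) + 1


def essentially_the_same(a: PartitionFunction, b: PartitionFunction) -> bool:
    """Apply the three collocation conditions from the quoted passage."""
    return (
        a.parameter_types == b.parameter_types        # same count and types
        and a.partition_count == b.partition_count    # same number of partitions
        and a.boundaries == b.boundaries              # same boundary values
    )


days = (date(2010, 12, 28), date(2010, 12, 29))
orders_pf = PartitionFunction("ordersDailyPF", ("datetime",), days)
details_pf = PartitionFunction("orderDetailsDailyPF", ("datetime",), days)
monthly_pf = PartitionFunction("monthlyPF", ("datetime",), (date(2010, 12, 1),))

print(essentially_the_same(orders_pf, details_pf))
print(essentially_the_same(orders_pf, monthly_pf))
```

Two tables whose functions pass this check, and which are joined on their partitioning columns, are the case where the optimizer can join partition to partition.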
  13. Why so many? If you are using daily partitioning for a fact table, 1K partitions limits you to less than three years. Warning: a large number of filegroups can affect recovery time. See http://blogs.msdn.com/b/sqlserverstorageengine/archive/2007/04/22/how-having-too-many-filegroups-can-affect-recovery-time.aspx?wa=wsignin1.0
  14. http://sqlskills.com/BLOGS/PAUL/post/SQL-Server-2008-Partition-level-lock-escalation-details-and-examples.aspx
  15. http://msdn.microsoft.com/en-us/library/ms345599.aspx “In SQL Server 2008, the internal representation of a partitioned table is changed so that the table appears to the query processor to be a multicolumn index with PartitionID as the leading column. PartitionID is a hidden computed column used internally to represent the ID of the partition containing a specific row. For example, assume the table T, defined as T(a, b, c), is partitioned on column a, and has a clustered index on column b. In SQL Server 2008, this partitioned table is treated internally as a nonpartitioned table with the schema T(PartitionID, a, b, c) and a clustered index on the composite key (PartitionID, b). This allows the query optimizer to perform seek operations based on PartitionID on any partitioned table or index. Partition elimination is now done in this seek operation. In addition, the query optimizer is extended so that a seek or scan operation with one condition can be done on PartitionID (as the logical leading column) and possibly other index key columns, and then a second-level seek, with a different condition, can be done on one or more additional columns, for each distinct value that meets the qualification for the first-level seek operation. That is, this operation, called a skip scan, allows the query optimizer to perform a seek or scan operation based on one condition to determine the partitions to be accessed and a second-level index seek operation within that operator to return rows from these partitions that meet a different condition.”
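The two-level seek in this quote is easier to follow with a toy model. The Python sketch below treats the table T(a, b, c) as a dictionary keyed by PartitionID, with each partition's rows kept sorted by b, mirroring the composite key (PartitionID, b). The data, the `skip_scan` function, and the row layout are all assumptions made for illustration; a real index would seek in a B-tree instead of building a key list.

```python
from bisect import bisect_left, bisect_right


def skip_scan(index, partition_ids, b_low, b_high):
    """Return rows with b_low <= b <= b_high from the chosen partitions.

    index maps PartitionID -> list of (b, a, c) rows sorted by b.
    """
    results = []
    for pid in partition_ids:             # first level: qualifying partitions
        rows = index.get(pid, [])
        keys = [row[0] for row in rows]   # toy shortcut, not how a B-tree seeks
        start = bisect_left(keys, b_low)  # second level: seek on b
        stop = bisect_right(keys, b_high)
        results.extend(rows[start:stop])
    return results


# Hypothetical data: partitioned on a (1-100, 101-200, 201-300), clustered on b.
index = {
    1: [(5, 10, "x"), (20, 12, "y")],
    2: [(7, 110, "z"), (8, 115, "w")],
    3: [(6, 210, "v")],
}

# A predicate on a has eliminated partition 1; then seek on b within 2 and 3.
print(skip_scan(index, [2, 3], 6, 7))
```

Partition 1 is never touched, which is the partition elimination the slide refers to, and within each remaining partition only the matching range of b is read.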
  16. http://sqlcat.com/msdnmirror/archive/2009/10/20/using-filtered-statistics-with-partitioned-tables.aspx
  17. http://blogs.msdn.com/b/sqlserverstorageengine/archive/2010/02/03/performance-improvement-by-orders-of-magnitude-when-merging-partitions-in-sql-server-2008r2.aspx