IBM® DB2® for Linux®, UNIX®, and Windows®
Physical Database Design
Program Director and Senior Technical
Information Management Software
Executive IT Specialist
Information Management Technical Sales
DB2 Information Development
Information Management Technical Sales
Physical Database Design Page 2
Physical Database Design
Executive summary
Introduction to physical database design
  Assumptions about the reader
Goals of physical database design
Datatype selection best practices
  Example of virtual views that represent a lookup table for each column
Table normalization and denormalization best practices
  Third normal form (3NF)
  1NF, 2NF, and 3NF of database design
  Star schema and snowflake models
  IBM Layered Data Architecture
Index design best practices
  Clustering indexes
Data clustering and multidimensional clustering (MDC) best practices
  Block indexes for MDC tables
  Maintaining clustering automatically during INSERT operations
  Benefits of using MDC
  MDC storage scenario
  MDC run time overhead and benefit considerations
  Determining when to use MDC versus a clustering index
Database partitioning (shared-nothing hash partitioning) best practices
  Balanced Warehouse and Balanced Configuration Units (BCU)
Table (range) partitioning best practices
UNION All View (UAV) partitioning best practices
  Migrating UAVs to table partitioning
Database partitioning, table partitioning, and MDC in the same database design best practices
Roll-in and roll-out of data with table partitioning and MDC best practices
Rolling-in large data volumes using table partitioning best practices
Materialized query table (MQT) best practices
Post-design tools for improving designs for existing databases
  Explain facility best practices
  DB2 Design Advisor best practices
  MDC selection capability of the DB2 Design Advisor
Best Practices
Conclusion
Further reading
Notices
  Trademarks
Executive summary
Physical database design is the single most important factor that impacts database
performance. Physical database design covers all of the design features that relate to the
physical structure of the database such as datatype selection, table normalization and
denormalization, indexes, materialized views, data clustering, multidimensional data
clustering, table (range) partitioning, and database (hash) partitioning.
Good physical database design reduces hardware resource utilization (I/O, CPU, and
network) and improves your administrative efficiency. This, in turn, can help you
achieve the following potential benefits to your business:
• Increased performance of applications that use the database, resulting in better
response times and higher end-user satisfaction
• Reduced IT administrative costs, giving you the ability to manage a wider scope
of databases and respond quicker to changes in application requirements
• Reduced IT hardware costs
• Improved backup and recovery elapsed time
Figure 1 shows an illustration of a physical database system. The three heavy dark-boxed
vertical rectangles indicate three distinct database instances. All other square or
rectangular boxes represent storage blocks on disk. All symbols represent data values
within the table (such as geography or month).
In this example, a table has been hash-partitioned across three instances called P1, P2,
and P3. The table has been range-partitioned by month, allowing data to be easily added
and deleted by month. Indirectly, this also helps with queries that have predicates by
month. Data within each table has been clustered using multidimensional clustering
(MDC), and this serves as a further clustering within each range partition. The rows
within the table are also indexed using regular row-based (RID-based) indexes. A
materialized query table (MQT) is created on the table, which includes aggregated data
(such as average sales by geography), which itself has indexing and MDC.
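The layout described above can be sketched in DB2 DDL. This is only an illustration under stated assumptions: the table names SALES and SALES_BY_GEO, the columns, the date range, and the choice of CUST_ID as the distribution key are all hypothetical and are not taken from Figure 1.

```sql
-- Hypothetical fact table that combines the three techniques described above.
CREATE TABLE SALES (
    SALE_DATE  DATE          NOT NULL,
    GEOGRAPHY  VARCHAR(20)   NOT NULL,
    CUST_ID    INTEGER       NOT NULL,
    AMOUNT     DECIMAL(12,2) NOT NULL
)
-- Hash-distribute rows across the database partitions (P1, P2, and P3).
DISTRIBUTE BY HASH (CUST_ID)
-- One range partition per month, so a month can be added or removed as a unit.
PARTITION BY RANGE (SALE_DATE)
    (STARTING FROM ('2008-01-01') ENDING AT ('2008-12-31') EVERY (1 MONTH))
-- Cluster rows within each range partition by geography (MDC).
ORGANIZE BY DIMENSIONS (GEOGRAPHY);

-- Hypothetical MQT that holds aggregated data (average sales by geography).
CREATE TABLE SALES_BY_GEO AS (
    SELECT GEOGRAPHY, AVG(AMOUNT) AS AVG_AMOUNT
    FROM   SALES
    GROUP  BY GEOGRAPHY
)
DATA INITIALLY DEFERRED REFRESH DEFERRED;

-- Populate (and later refresh) the MQT.
REFRESH TABLE SALES_BY_GEO;
```

Regular row-based (RID-based) indexes would then be added to SALES with ordinary CREATE INDEX statements.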
Figure 1 Illustration of a physical database system
Introduction to physical database design
Database design is performed in three stages:
1. Logical database design: includes gathering of business requirements, and entity relationship modeling.
2. Conversion of the logical design into table definitions (often performed by an
application developer): includes pre-deployment design, table definitions,
normalization, PK and FK relationships, and basic indexing.
3. Post deployment physical database design (often performed by a database
administrator): includes improving performance, reducing I/O, and streamlining
Physical database design covers those aspects of database design that impact the actual
structure of the database on disk, items 2 (see footnote 1) and 3 in the list above. Although you can
perform logical design independently of the platform that the database will eventually
use, many physical database attributes depend on the specifics and semantics of the
target DBMS. Physical database design includes the following attributes:
• Datatype selection
• Table normalization
• Table denormalization
• Database partitioning
• Range partitioning
• UAV partitioning
• Memory allocation
• Database storage topology
• Database storage object allocation
This paper covers all but “Database storage topology” and “Database storage object
allocation,” which are covered in the “Best Practices: Database Storage” white paper. This
white paper and others mentioned throughout this paper are available at the DB2 Best
Practices website at http://www.ibm.com/developerworks/db2/bestpractices/.
Footnote 1: This phase is variably referred to in the industry as logical database design or physical database design. It is known as logical
database design in the sense that it can be designed independently of the data server or the particular DBMS used. It is also often
performed by the same people who perform the early requirements building and entity relationship modeling. Conversely, it is also
called physical database design in the sense that it affects the physical structure of the database and its implementation. For the
sake of this document we use the latter assumption, and therefore include it as part of physical database design.
Physical database design is as old as databases themselves (see footnote 2). The first relational databases
were prototypes (in the early 1970s). As relational database systems advanced, new
techniques were introduced to help improve operational efficiency. The most elementary
problems of database design are table normalization and index selection, both of which
are discussed below.
Today, we can achieve I/O reductions by properly partitioning data, distributing data,
and improving the indexing of data. All of these innovations (which improve database
capabilities, expand the scope of physical database design, and increase the number of
design choices) have resulted in the increased complexity of optimizing database
structures. Although the 1980s and 1990s were dominated by the introduction of new
physical database design capabilities, the years since have been dominated by efforts to
simplify the process through automation and best practices.
The vast majority of physical database design features and attributes have the primary
goal of reducing I/O use at run time. However, to a lesser degree, there are “physical
design aspects” that help improve administrative efficiency and reduce CPU or network
use. In addition, in the DB2 partitioned environment, the database design influences the
degree of parallel processing, for example, parallel query processing.
The best practices presented in this document have been developed with the reality of
today’s database systems in mind and specifically address the features and facilities
available in DB2 9.5.
Assumptions about the reader
It is assumed that you are familiar with the physical database design features described.
Therefore, only a very brief description of each one is provided. The focus of this paper is
on the best practices for applying these features. For details on each respective feature,
refer to the DB2 product documentation.
Footnote 2: The relational model for databases was first proposed in 1970 by E. F. Codd at IBM. The first relational database systems to be
implemented, using SQL and B+ tree, were IBM’s System R, in 1976, and Ingres at the University of California, Berkeley. The B+
tree, the most commonly used indexing storage structure for user-designed indexes, was first described in the paper “Organization
and Maintenance of Large Ordered Indices” by Rudolf Bayer and Edward M. McCreight, 1972.
Goals of physical database design
A high-quality physical database design is one that meets the following goals:
• Minimizes I/O
• Balances design features that optimize query performance concurrently with
transaction performance and maintenance operations
• Improves the efficiency of database management, such as roll-in and roll-out of data
• Improves the performance of administration tasks, such as index creation or
backup and recovery processing
• Minimizes backup and recovery elapsed time
Datatype selection best practices
When designing a physical database, the selection of appropriate datatypes is an
important consideration that should not be overlooked.
Often, abbreviated or intuitive codes are used to represent a longer value in columns, or
to easily identify what the code represents; for example, an account status column whose
codes are OPN, CLS, and INA (representing an account that can be open, closed, or inactive).
From a query processing perspective, numeric values can be processed more efficiently
than character values, especially when joining values. Therefore, using a numeric
datatype can provide a slight benefit.
While using numeric datatypes might mean that interpreting the values that are being
stored in a column is more difficult, there are appropriate places where the definitions of
numeric values can be stored for retrieval by end users, such as:
o Storing the definitions as a domain value in a data modeling tool such as
Rational Data Architect, where the values can be published to a larger team
using metadata reporting
o Storing the definition of the values in a table in a database, where the definitions
can be joined to the values to provide context, such as text name or description
(tables that store values of columns and their descriptions are often referred to as
reference tables or lookup tables)
Another concern that is often raised is that, for large databases, storing these
definitions could lead to the proliferation of reference tables. While this is true if an
organization chooses to use a reference table for each column that is used to store a code
value, it is possible to consolidate these reference tables into either a single or a few
reference tables. From these consolidated reference tables, virtual views can be created to
represent the lookup table for each column.
Example of virtual views that represent a lookup table for each column
In the following diagram, the TCUSTOMER table has two columns that use code values:
CUST_TYPE and CUST_MKT_SEG. In this scenario, a reference table is created for each
column that uses a code, resulting in two reference tables, TCUST_TYPE_REF and
This approach is not flexible because any time a new column is added that employs the
use of a code value, a new reference table must be created.
A possible solution is to consolidate the reference table into a single reference table
(TREF_MASTER), as shown in the following diagram:
In this diagram, two virtual views, VCUST_TYPE_REF and VCUST_MKT_SEG_REF,
were created from the TREF_MASTER table to represent the reference tables in the
The benefit to this approach is that end users can still use the reference table (without
having to write complex SQL) by simply accessing the reference views for each column.
In addition, the DBA will only maintain a single table for all of the reference data, and the
proliferation of reference tables is limited.
To understand how the VCUST_TYPE_REF view was created, here is the SQL:
CREATE VIEW VCUST_TYPE_REF AS
SELECT
VALUE as CUST_TYPE,
VALUE_NME as CUST_TYPE_NME,
VALUE_DESC as CUST_TYPE_DESC
FROM TREF_MASTER
WHERE
TBL_SCHEMA = 'REFTB'
AND TABLE = 'REF_MASTER'
AND COLUMN = 'CUST_TYPE'
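As a usage sketch, an end user could resolve a customer's type code to its name and description by joining TCUSTOMER to the reference view. The key column CUST_ID is a hypothetical name introduced only for this illustration; the other names come from the text above.

```sql
-- TCUSTOMER, CUST_TYPE, and VCUST_TYPE_REF come from the text above;
-- CUST_ID is a hypothetical key column.
SELECT c.CUST_ID,
       c.CUST_TYPE,
       r.CUST_TYPE_NME,
       r.CUST_TYPE_DESC
FROM   TCUSTOMER c
JOIN   VCUST_TYPE_REF r
       ON r.CUST_TYPE = c.CUST_TYPE;
```

The query reads the same as it would against a dedicated reference table, which is the point of placing views over the consolidated table.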
Use the following best practices when selecting datatypes:
• Always try to use a numeric datatype over a character datatype, taking the
following considerations into account:
o When creating a column that will hold a Boolean value (“YES” or “NO”),
use a decimal (1,0) or similar datatype. Use 0 and 1 as values for the
column rather than “N” or “Y”.
o Use integers to represent codes.
o If there will be fewer than 10 code values for a given column, the decimal (1,0)
datatype is appropriate. If there will be 10 or more code values
stored in a given column, use smallint.
• Store the definitions as a domain value in a data modeling tool, such as Rational
Data Architect, where the values can be published to a larger team using
metadata reporting.
• Store the definition of the values in a table in a database, where the definitions
can be joined to the value to provide context, such as “text name” or
“description”.
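The datatype guidelines above might translate into a table definition along the following lines. This is a sketch under assumptions: the columns CUST_ID and ACTIVE_FLAG, the constraint name, and the number of code values per column are hypothetical and chosen only to show each guideline once.

```sql
-- Hypothetical definition; only TCUSTOMER, CUST_TYPE, and CUST_MKT_SEG
-- are names taken from the text.
CREATE TABLE TCUSTOMER (
    CUST_ID       INTEGER      NOT NULL PRIMARY KEY,
    CUST_TYPE     DECIMAL(1,0) NOT NULL,  -- assumed to have fewer than 10 code values
    CUST_MKT_SEG  SMALLINT     NOT NULL,  -- assumed to have 10 or more code values
    ACTIVE_FLAG   DECIMAL(1,0) NOT NULL   -- Boolean stored as 0 (no) or 1 (yes)
        CONSTRAINT CK_ACTIVE_FLAG CHECK (ACTIVE_FLAG IN (0, 1))
);
```

The check constraint is optional; it is included here only to keep the flag column restricted to 0 and 1.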
Table normalization and denormalization best practices
Table normalization is the restructuring of a data model by reducing its relations to their
simplest forms. It is a key step in the task of building a logical relational database design.
Normalization helps avoid redundancies and inconsistencies in data; it is typically a
logical data modeling exercise, whose outcome might be implemented in the physical
There are a few goals for deploying a normalized design:
• Eliminate redundant data, for example, storing the same data in more than one table.
• Enforce valid data dependencies by only storing related data in a table, and
dividing relational data into multiple related tables.
• Maximize the flexibility of the system for future growth in data structures.
The two or three dominant strategies for normalization are:
• Third normal form (3NF), which is used in online transaction processing (OLTP)
and many general-purpose databases, including enterprise data warehouses
(also called atomic warehouses).
• Star schema and snowflake, which are dimensional model forms for normalization,
and are used heavily in data warehousing and OLAP.
Specify non-enforced RI on FK columns to reduce table access for STAR JOINs without
incurring the overhead of RI.
Third normal form (3NF)
3NF is a combination of the rules from first normal form and second normal form. The
following rules are specific to 3NF:
• Eliminate repeating groups. Make a separate table for each set of related
attributes, and give each table a PK.
• Eliminate duplicate columns and redundant data in each table.
• Move subsets of columnar data that apply to multiple rows of a table into
• Create relationships between the tables by using FKs.
• Eliminate columns not dependent on keys. If attributes do not contribute to a
description of a key, move them into a separate table.
• Remove columns not dependent upon the PK.
1NF, 2NF, and 3NF of database design
The following diagrams demonstrate the first, second, and third normal forms of
First normal form (1NF):
To make the denormalized model comply with 1NF, the repeating group of data
elements, the customer address lines, and the customer names were normalized into
Second normal form (2NF):
For the model to comply with 2NF, it must comply with 1NF, and every non-key attribute must be
fully dependent on the entire key, not on only a part of a composite key.
Third normal form (3NF):
For the model to comply with 3NF, any transitive dependencies must be eliminated.
Transitive dependencies occur when a value in a non-key field is determined by the
value of another non-key field that is not part of a candidate key.
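A transitive dependency and its removal can be sketched as follows. The tables and columns are hypothetical and are not the ones shown in the paper's diagrams; the sketch assumes that a postal code determines a single city.

```sql
-- Before (violates 3NF): CITY is determined by POSTAL_CODE, a non-key column,
-- so the city name is repeated for every customer in that postal code.
--   CUSTOMER (CUST_ID, CUST_NAME, POSTAL_CODE, CITY)

-- After: the transitively dependent column moves to its own table.
CREATE TABLE POSTAL_AREA (
    POSTAL_CODE VARCHAR(10) NOT NULL PRIMARY KEY,
    CITY        VARCHAR(40) NOT NULL
);

CREATE TABLE CUSTOMER (
    CUST_ID     INTEGER     NOT NULL PRIMARY KEY,
    CUST_NAME   VARCHAR(60) NOT NULL,
    POSTAL_CODE VARCHAR(10) NOT NULL,
    CONSTRAINT FK_CUST_POSTAL FOREIGN KEY (POSTAL_CODE)
        REFERENCES POSTAL_AREA (POSTAL_CODE)
);
```

After the split, every non-key column of CUSTOMER depends only on CUST_ID, and a city name is stored once per postal code.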
Star schema and snowflake models
The star schema and snowflake models have become quite popular for data warehousing and BI
systems. The basis of star schema is the separation of the facts of a system from its
dimensions. Dimensions are defined as attributes of the data, such as the location, or
customer name, or part description, and the facts refer to the time-specific events related
to the data.
For example, a part description does not typically change over time, so it can be designed
as a dimension. Conversely the number of parts sold daily varies over time and is
therefore a fact. A star schema is called that because it is typically characterized by a
large central fact table that holds information about events that vary over time,
surrounded (conceptually) by a set of dimension tables holding the meta attributes of
items that are referenced within the fact events.
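A minimal star schema fragment, using the part description and daily sales example above, might look like the following sketch. All table and column names are hypothetical. The foreign key is declared as non-enforced, in line with the recommendation earlier in this section to specify non-enforced RI on FK columns, so the optimizer can use the relationship without the database manager checking it on every change.

```sql
-- Dimension: attributes that do not typically change over time.
CREATE TABLE DIM_PRODUCT (
    PRODKEY   INTEGER      NOT NULL PRIMARY KEY,
    PART_DESC VARCHAR(100) NOT NULL
);

-- Fact: time-specific events that reference the dimension.
CREATE TABLE FACT_DAILY_SALES (
    SALE_DATE DATE    NOT NULL,
    PRODKEY   INTEGER NOT NULL,
    QTY_SOLD  INTEGER NOT NULL,
    -- Informational constraint: not checked by the database manager,
    -- but available to the optimizer.
    CONSTRAINT FK_SALES_PRODUCT FOREIGN KEY (PRODKEY)
        REFERENCES DIM_PRODUCT (PRODKEY)
        NOT ENFORCED ENABLE QUERY OPTIMIZATION
);
```

With a non-enforced constraint, the load process is responsible for ensuring that every PRODKEY in the fact table really exists in the dimension.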
A snowflake is basically an extension of a star schema. In a snowflake design, the low
cardinality attributes are often moved from a dimension table in a star schema into
another dimension table and then a relationship is created between the two dimension tables.
In contrast to normalization, denormalization is the process of collapsing tables and,
therefore possibly increasing the redundancy of data within a database. Denormalization
can be useful in reducing the complexity or number of joins, and reducing the complexity
of a database by reducing the number of tables. The primary goal of denormalization is
to maximize performance of a system and reduce the complexity in administering the
IBM Layered Data Architecture
IBM Layered Data Architecture offers multiple levels of granularity. Each layer provides
a different level of detail and data summarization appropriate to user needs, which users
(analysts and executives) can access. As data ages, it rolls up through the layers (with
more tables and less data per table). This architecture is designed specifically for mixed
workloads, query performance, rapid incorporation of new data sources, and
deployment of new applications.
The layered architecture enables concurrent loading, query, archive and maintenance
without compromising query performance. The multiple levels of data granularity are
available for multiple types of analytics.
Figure 2 shows the five layers (or floors) of the IBM Layered Data Architecture.
Figure 2 IBM Layered Data Architecture
With this model, warehouse administrators can:
1. Use visual modeling tools to optimize the design of multilayered warehouse
2. Use their preferred extract, transform, and load (ETL) software to bulk-load the
staging layer of the warehouse—with scale, speed and rich transformations from
myriad enterprise data sources.
3. Use SQL Warehousing Tools (SQWs) to maintain analytic structures in the
performance and business access layers—or to replace hand-coded SQL flows
anywhere inside the warehouse.
This layered architecture is a powerful paradigm that is too detailed to describe at length
here. Refer to “Best Practices for Creating Scalable High Quality Data Warehouses with
DB2” in the “Further reading” section for detailed information on this layered architecture.
Use the following normalization and denormalization best practices:
• Use 3NF whenever possible for most OLTP and general-purpose database
designs to maintain flexibility in the design of the system. It is a tried-and-true
• For data warehouses and data marts that require very high performance, a star
schema or snowflake model is typically optimal for dimensional query
processing. However, verify that the star schema or snowflake model conforms
to the relationships that you designed in the normalized logical data model.
More information about logical modeling for users of Rational Data Architect is
available in “Best Practices: Data Life Cycle Management” white paper.
• For broad-based data warehousing that is used for several purposes, such as
operational data stores, reporting, OLAP and cubing, use the IBM Layered Data
Architecture illustrated in Figure 2.
• Consider denormalizing very narrow tables, ones with a row length of 30 or
fewer bytes. Extra tables in a database increase query complexity and complicate
Index design best practices
Indexes are critical for performance. They are used by a database for the following
purposes:
• To apply predicates that provide rapid lookup of the location of data in a database,
reducing the number of rows navigated
• To avoid sorts for ORDER BY and GROUP BY clauses
• To induce order for joins
• To provide index-only access, which avoids the cost of accessing data pages
• To enforce uniqueness (an index is the only way to enforce uniqueness in a relational database)
However, indexes consume additional hardware resources:
• They add extra CPU and I/O cost to UPDATE, INSERT, DELETE, and LOAD operations
• They add to prepare time because they provide more choices for the optimizer
• They can use a significant amount of disk storage
In DB2 database systems, a B+ tree structure is used as the underlying implementation
for indexes. All data is stored in the leaf nodes, and the keys are optionally chained in a
bidirectional manner to allow both forward and backward index scanning. If DISALLOW
REVERSE SCANS is specified then the index cannot be scanned in reverse order.
Clustering indexes
Clustering indexes (also called special indexes) indicate to the database manager that data in
the table object should be clustered in a specific order, on disk, according to the definition
of the index. For example, if the clustering index is defined on a date key, then the DB2
database manager will attempt to store, in the table object, rows with similar dates in
ascending date sequence.
The table in Figure 3 has two row-based indexes defined on it:
• A clustering index on Region
• Another index on Year
Figure 3. A regular table with a clustering index
The value of this clustering is that subsequent queries that have predicates on the
clustering attribute need to perform dramatically reduced I/O. For example, a query on
sales by date will perform far less I/O if the rows for the selected dates are stored next to
each other on disk.
However, clustering indexes are merely an indicator to the database, and as new rows
are inserted into the database the DB2 kernel attempts to place these rows near rows with
the same or similar attributes. If space is unavailable, the incoming or changed row might
be redirected to another location that is unclustered (that is, not near the related rows).
When an INSERT occurs (or an UPDATE to the clustering keys), the DB2 kernel
navigates the clustering index from the top down to determine an appropriate location
for the row. Therefore, INSERT operations, and some UPDATE operations, on a table with a
clustering index incur the overhead of index access that an unclustered table would not.
Techniques like “append on” (APPEND ON option on the CREATE and ALTER TABLE
statements) can minimize this overhead by placing all new rows at the end of the table.
Therefore, clustering indexes provide approximate clustering, and data often becomes
unclustered over time. The REORG utility can be used to reorganize the data rows back
into perfect cluster order, although, for online REORGs, this can be a time-consuming
and log-intensive operation.
To create clustering indexes, simply add the CLUSTER keyword on the create index
statement as shown in the following example, where a clustering index MyIndex will be
created on column C1 of table T1. There can be only one clustering index per table.
CREATE INDEX MyIndex on T1 (C1) CLUSTER
Because data clustering can deteriorate over time when using a clustering index,
clustering with MDC is preferred as a best practice because it guarantees clustering at all times
and provides the option to cluster along multiple dimensions concurrently. See the
discussion on MDC for help on determining which method to use.
Utilize the following index design best practices:
• Index every PK and most FKs in a database. Most joins occur between PKs and
FKs, so it is important to build indexes on all PKs and FKs whenever possible.
Indexes on FKs also improve the performance of RI checking.
• Explicitly provide an index for the PK. The DB2 database manager indexes the
PK automatically with a system-generated name if one is not specified. The
system-generated name for an automatically-generated index is difficult to
• Columns frequently referenced in WHERE clauses are good candidates for an
index. An exception to this rule is when the predicate provides minimal filtering.
An example is an inequality such as WHERE cost <> 4. Indexes are seldom
useful for inequalities because of the limited filtering provided.
• Specify indexes on columns used for equality and range queries.
• Create an index for each set of fact table columns that join to a dimension. These
columns do not have to be part of an explicit FK. Creating the index allows STAR
JOIN access plans that use dynamic bitmap index ANDing. Consider creating
indexes on combinations of fact-table columns.
For example, if PRODKEY and STOREKEY join to the product and store
dimensions, respectively, consider creating an index on (PRODKEY, STOREKEY).
This facilitates a hub or cartesian STAR JOIN access plan.
• Use the db2pd command, which indicates the number of times that indexes were
used in order from highest to lowest. This can be helpful in detecting which
indexes are commonly used. For example:
db2pd -db MY_DATABASE -tcbstats index
The indexes are referenced using the IID, which can be linked with
SYSIBM.SYSINDEXES's IID for the index. At the end of the output (shown below
in two sections) is a list of index statistics. “Scans” indicates read access on
each index, while the other indicators in the output provide insight on write and
update activity to the index.
Left side of report:
Right side of report:
• Use the DB2 Design Advisor to indicate which indexes are never accessed for a
specified workload and can therefore be dropped.
• Add indexes only when absolutely necessary. Remember that indexes
significantly impact INSERT, UPDATE, and DELETE performance, and they also
• To reduce the need for frequent reorganization, when using a clustering index
specify an appropriate PCTFREE at index creation time to leave a percentage of
free space on each index leaf page as it is created. During future activity, rows
can be inserted into the index with less likelihood of causing index page splits.
Page splits cause index pages not to be contiguous or sequential, which in turn
results in decreased efficiency of index page prefetching.
Note: The PCTFREE specified when you create the relational index is retained
when the index is reorganized.
Dropping and recreating, or reorganizing, the relational index also creates a new
set of pages that are roughly contiguous and sequential and improves index page
prefetch. Although more costly in time and resources, the REORG TABLE utility
also ensures clustering of the data pages. Clustering has greater benefit for index
scans that access a significant number of data pages.
• Examine queries with range or with ORDER BY clauses to identify clustering
• Clustering indexes incur additional overhead for INSERT and some UPDATE
operations. If your workload performs a large amount of updates, you will need
to weigh the benefits of clustering for queries against the additional cost to
INSERTS and UPDATES. In many cases, the benefit far outweighs the cost, but
• Avoid or remove redundant indexes. An example of a redundant index is one
that contains only an account number column when there is another index that
contains the same account number column as its first column. Indexes that use
the same or similar columns make query optimization more complicated, use
storage, seriously impact INSERT, UPDATE, and DELETE performance, and
often have very marginal benefits.
Although the DB2 database system provides dynamic bitmap indexing, index
ANDing, and index ORing, it is good practice to specify composite indexes,
referred to as multiple column indexes, if these columns are frequently specified in
• Choose the leading columns of a composite index to facilitate matching index
scans. The leading columns should reflect columns frequently used in WHERE
clauses. The DB2 database system navigates only top down through a B-tree
index for the leading columns used in a WHERE clause, referred to as a matching
index scan. If the leading column of an index is not in a WHERE clause, the
optimizer might still use the index, but the optimizer is forced to use a non-
matching index scan across the entire index.
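Several of the index practices above can be illustrated together. The sketch below uses hypothetical table, column, and index names (only PRODKEY and STOREKEY come from the text), and the PCTFREE value of 20 is an arbitrary example rather than a recommendation.

```sql
-- Composite index on the fact-table columns that join to two dimensions.
CREATE INDEX IX_SALES_PROD_STORE
    ON FACT_SALES (PRODKEY, STOREKEY);

-- Clustering index that leaves 20 percent free space on each index leaf page.
CREATE INDEX IX_SALES_DATE
    ON FACT_SALES (SALE_DATE)
    CLUSTER PCTFREE 20;

-- Composite index whose leading column is the one most often used in WHERE clauses.
CREATE INDEX IX_ACCT_NO_DATE
    ON ACCOUNT_TXN (ACCOUNT_NO, TXN_DATE);

-- Redundant: ACCOUNT_NO is already the leading column of IX_ACCT_NO_DATE.
-- CREATE INDEX IX_ACCT_NO ON ACCOUNT_TXN (ACCOUNT_NO);

-- The predicate is on the leading column, so a matching index scan is possible.
SELECT TXN_DATE FROM ACCOUNT_TXN WHERE ACCOUNT_NO = 1001;

-- The leading column is absent, so at best a non-matching scan
-- across the entire index can be used.
SELECT ACCOUNT_NO FROM ACCOUNT_TXN WHERE TXN_DATE = '2008-06-30';
```

Whether the optimizer actually chooses these indexes depends on statistics and the rest of the workload, so the Explain facility remains the way to confirm the access plan.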
Data clustering and multidimensional clustering (MDC) best practices
MDC is a technique for clustering data along more than one dimension at the same time.
However, you can also use MDC for single-dimensional clustering, just as you can use a
clustering index. An advantage of an MDC table is that it is designed to always be
clustered. A reorganization is never required to re-establish a high cluster ratio.
To understand MDC, you must first understand some basic terminology: Cells are the
portion of the table containing data having a unique set of dimension values—the
intersection formed by taking a slice from each dimension. Blocks are the unit of storage
equal to an extent size (one or more pages) that is used to store a cell. Your extent size
specification determines the size of the block (or cell).
Block indexes for MDC tables
Unlike traditional indexes created by the CREATE INDEX syntax, which index each row
in a table, MDC indexes the table by block, using what are called block indexes. MDC block
indexes are typically 1/1000th of the size of row-based indexes, and provide not only
huge savings in storage for the index, but massive efficiencies on all block index
operations (such as index scan, index ANDing, and index ORing). INSERT and UPDATE
operations are also enhanced because the block index is only updated if a new cell is created.
As shown in Figure 4, block indexes provide a significant reduction in disk usage and
significantly faster data access:
Figure 4. How row indexes differ from block indexes
The MDC table shown in Figure 5 is physically organized such that rows having the
same Region and Year values are grouped together into separate blocks, or extents.
MDC block indexes are created for each dimension as well as the composite dimension.
For example, if the dimensions for a table are Region,Year then a block index is built
for Region, for Year, and for the composite dimension Region,Year.
Figure 5. A multidimensional clustering (MDC) table
An MDC table defined with even just a single dimension can benefit from these MDC
attributes, and can be a viable alternative to a regular table with a clustering index. This
decision should be based on many factors, including the queries that make up the
workload, and the nature and distribution of the data in the table. A high cardinality
column is not a good choice for a single-dimension MDC because you will get a cell for
each unique value.
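One way to sanity-check a single-dimension candidate is to compare its number of distinct values with the row count. The following is a sketch only; MY_FAV_TABLE and the REGION column are placeholder names.

```sql
-- Placeholder table and column names.
-- NULLIF avoids a divide-by-zero error on an empty table.
SELECT COUNT(DISTINCT region)                        AS num_cells,
       COUNT(*)                                      AS num_rows,
       COUNT(*) / NULLIF(COUNT(DISTINCT region), 0)  AS avg_rows_per_cell
FROM my_fav_table;
```

A result of only a few rows per cell suggests that the column has too high a cardinality to serve as a dimension without coarsification.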
Maintaining clustering automatically during INSERT
Automatic maintenance of data clustering in MDC tables is ensured using composite block
indexes [3]. These indexes are used to dynamically manage and maintain the physical
clustering of data along the dimensions of the table over the course of INSERT
operations. When an insert occurs, the composite block index is probed for the logical cell
corresponding to the dimension values of the row to be inserted. The block index is not
updated unless a new cell is created.
[3] A composite block index is automatically created and contains all columns across all dimensions. It is used to maintain the
clustering of data over insert and update activity, and might also be selected by the optimizer to efficiently access data that satisfies
values from a subset, or from all, of the column dimensions.
As shown in Figure 6, if the key of the logical cell is found in the index, its list of block IDs
(BIDs) gives the complete list of blocks in the table having the dimension values of the
logical cell. This limits the number of extents of the table to search for space to insert the
row.
Figure 6. Composite block index on YearAndMonth, Region
Because clustering is automatically maintained, reorganization of an MDC table is never
needed to re-cluster data. Also, MDC can reuse empty cells that result from the mass
deletion of rows without a REORG. However, reorganization can still be used in rare
situations to reclaim space. For example, if cells have many sparse blocks where data
could fit on fewer blocks, or if the table has many pointer-overflow pairs, a
reorganization of the table would compact rows belonging to each logical cell into the
minimum number of blocks needed, as well as remove pointer-overflow pairs.
Benefits of using MDC
The value of MDC is profound. It improves complex query performance by 10 times in
some cases and you can use it for roll-in and roll-out of data. Other benefits include the
following:
• MDCs are multi-dimensional. For example, data can be perfectly clustered along
DATE and LOCATION dimensions; cells and ranges are created automatically as
new data arrives.
• MDCs can be used in conjunction with normal RID-based indexes, range
partitioning, and MQTs. Index ANDing or ORing of block-based and RID-based
indexes is a possible access path that can be chosen by the DB2 Optimizer.
• MDCs are used with intra-query parallelism, DPF (shared nothing) parallelism,
and LOAD, BACKUP, and REORG operations.
• MDC dimensions, unlike range-partitioned tables, are dynamic; new cells get
created within the table automatically as unique new data representing new cells
arrives in the table either through SQL operations (including JDBC, CLI, and so
forth), or through utility operations such as LOAD and IMPORT. Empty cells can
also be reused during these operations.
• MDCs maintain clustering, and, as such, do not need REORGs to maintain
clustering.
The following example shows how to define an MDC table:
CREATE TABLE T1
( c1 DATE, c2 INT, c3 INT, c4 DOUBLE, -- types of c1 through c4 are assumed
c5 INT generated always as (INT(C1)/100) )
ORGANIZE BY DIMENSIONS (c5, c3)
The ORGANIZE BY clause defines the clustering dimensions. The table is clustered by
C5 and C3 at the same time. C1 is coarsified [4] to C5, which contains fewer distinct values
(days are reduced to months).
NOTE: The coarsified generated column(s) are used in the MDC block indexes to
perform cell-level elimination of data. Calculated columns are fully supported by MDC
and the DB2 Optimizer.
The key design challenge of MDC is the careful selection of the clustering dimensions. If
you choose clustering dimensions that result in too many cells, storage costs can increase
substantially. The reason for this is important to understand. In an MDC table, every cell
is allocated as many storage blocks on disk as required. Storage blocks are by design
equal to the extent size of the table space that holds a table. The number of storage blocks
is 0 if a cell has no data.
However, in a typical table a cell stores several rows, resulting in one or more storage
blocks being allocated to the cell. For every cell that has data, there is a chain of blocks,
which typically contains a partially filled block. Therefore, there could be wasted storage
for each cell (not each block), proportional to the size of the storage block. New blocks are
created only when the previous block is full (or nearly full). If rows are deleted and the
cell is empty, the database manager can reuse the space and avoid the need for a
reorganization (for space reclamation).
If the number of cells in the table is very large, the storage waste is large. If the MDC design is poor
and results in a huge number of cells, the table storage requirement expands
dramatically, and MDC can also be a performance detriment. However, when designed
well, MDC tables are only slightly larger than non-MDC tables, and offer profound
benefits for clustering and roll-in and roll-out of data (as discussed in the paragraphs that
follow). The key is to use low-cardinality columns for the dimensions of an MDC table.
[4] The term coarsification refers to a mathematical expression that reduces the cardinality (the number of distinct values) of a clustering
dimension. A common example is a date column, which can be coarsified by week, by month, or by quarter of the year.
Figure 7 shows storage block and cell allocation. As shown, each cell contains a set of
storage blocks. Most of the blocks are filled with data, but for each cell there is a block at
the end of the chain which is partially filled to a lesser or greater degree.
Figure 7 MDC storage by cell
If you have sample or actual data, using SQL, you can measure the number of expected
MDC cells for any given potential MDC design, as follows:
SELECT COUNT(*) FROM (SELECT DISTINCT COL1, COL2, COL3 FROM
MY_FAV_TABLE) AS NUM_DISTINCT;
COL1, COL2, and COL3 represent the MDC dimensions for a 3-dimensional MDC table.
The resulting number multiplied by the extent size of the table will give you an upper
bound on the extent growth (not size) of the table when converted to MDC.
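The multiplication described above can be folded into the same query by reading the extent size from the catalog. This is a sketch under assumptions: USERSPACE1 stands in for whichever table space holds the table, and COL1, COL2, and COL3 are the placeholder dimensions from the query above. In SYSCAT.TABLESPACES, EXTENTSIZE is expressed in pages and PAGESIZE in bytes.

```sql
-- Upper bound on growth = number of cells x extent size in bytes.
-- USERSPACE1 is an assumed table space name; substitute your own.
SELECT c.num_cells,
       c.num_cells * BIGINT(ts.extentsize) * ts.pagesize AS max_growth_bytes
FROM (SELECT COUNT(*) AS num_cells
      FROM (SELECT DISTINCT col1, col2, col3
            FROM my_fav_table) AS d) AS c,
     syscat.tablespaces AS ts
WHERE ts.tbspace = 'USERSPACE1';
```

Comparing MAX_GROWTH_BYTES with the current size of the table gives a rough percentage to hold against your growth goal for the MDC design.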
As described in the previous section, another key value of MDC is that the DB2 database
manager automatically creates indexes for MDC tables over the MDC dimensions of the
table. These special indexes (called block indexes) index data by block instead of by row. This
results in associated run time performance benefits for queries and minimal overhead for
INSERT, UPDATE and DELETE operations.
MDC provides features that facilitate the roll-in and roll-out of data:
o MDC has much less block index I/O during the roll-in process because the block
index is only updated once when the block is full (not for every row inserted).
o Inserts are also faster because MDC reuses existing empty blocks without the
need for index page splitting.
o Locking is reduced for inserts because they occur at a block level rather than at a
row level.
o There is no need to REORG data after roll-in and roll-out.
MDC storage scenario
You want to create an MDC for a Transaction Fact on Date, Product Name, and Region.
Here are some variables to consider for the MDC creation:
• There are 365 days in a year
• There are 100,000 products for company XYZ
• There are 10 regions for company XYZ
Initial MDC creation
If the MDC table was created strictly on the Date, Product, and Region columns, there could be
up to 1,000,000 new cells created daily (1 x 100,000 x 10) and 365 million cells per year
(previous x 365).
In regions where transactions are low, there will be many sparse pages, and even empty
pages. Because each cell is allocated at least one block, this design could waste a lot of
space on blocks that hold very little data.
Improving the creation of the MDC
Use functions to coarsify and limit MDC cardinality. For example:
• If you use the month function on the Date, you would have 12 results per year
• If you substring the Product name to pick the first character of the Product name,
you could have 26 potential results
• Leave Region as is with 10 results
Using the recommendation in this scenario, every year, the MDC would have 12*26*10 =
3,120 cells or about 8-9 cells per day. This would eliminate the sparsity of data on many of
the pages, and provide a reasonable cardinality for the MDC to be effective in providing
a performance benefit.
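The coarsified design from this scenario could be declared along the following lines. This is a sketch, not the definition used by the authors: the table name, column names, and column types are all assumptions. The two generated columns apply the MONTH and first-character functions described above, giving at most 12 x 26 x 10 = 3,120 cells if product names start with one of 26 letters.

```sql
-- Hypothetical fact table; names and types are assumptions.
CREATE TABLE transaction_fact (
  tx_date      DATE          NOT NULL,
  product_name VARCHAR(60)   NOT NULL,
  region       CHAR(10)      NOT NULL,
  amount       DECIMAL(12,2),
  -- Coarsified dimensions: month of the date, first character of the product name.
  tx_month     INTEGER GENERATED ALWAYS AS (MONTH(tx_date)),
  product_init GENERATED ALWAYS AS (SUBSTR(product_name, 1, 1))
)
ORGANIZE BY DIMENSIONS (tx_month, product_init, region);

-- After loading data, count the cells actually in use.
SELECT COUNT(*) AS num_cells
FROM (SELECT DISTINCT tx_month, product_init, region
      FROM transaction_fact) AS cells;
```

If the measured cell count is far above the expected figure, the product names probably start with more than 26 distinct characters, and a different coarsification is worth testing.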
MDC run time overhead and benefit considerations
MDC is designed to provide large performance benefits for queries and improvement for
many DELETE scenarios. Even so, MDC tables do incur overhead over non-clustered
tables, while offering significant performance benefits over tables that are clustered using
a clustering index. Consider first the overhead of MDC versus an unclustered table:
• INSERT operations on a non-clustered table access each index to add a reference
to the inserted row. In contrast, INSERT on an MDC table requires an initial read
to the MDC composite block index in order to determine to which cell and block
the row belongs, followed (after the insert on the table) by access to each index in
order to insert a reference to the row. (Clustering indexes incur a similar
overhead.)
• If the MDC table includes a generated column to coarsify one of the dimensions,
every INSERT will incur a small processing overhead to compute the generated
value for that column as all generated columns in DB2 are fully materialized, that
is, calculated and stored within the row.
However, when compared to a table clustered with the use of a clustering index, MDC
offers significant performance advantages:
• Index maintenance is dramatically reduced during INSERTs compared to the
processing required for a clustering index, as the DB2 database manager only
updates the block index when the first key is added to a block—unlike a RID-
index where every single inserted row to the table requires an update to all
indexes. That is, if there are 1000 rows per block, the rate of index updates is
1/1000th what it would be for a RID index.
• The index update is cheaper, because the index is smaller and therefore has
fewer levels in the tree. Fewer levels in the B+-tree means less processing to
determine the target leaf page for the index entry.
In both cases, whether clustered by a clustering index or by MDC, the DB2 database
manager will access the index (the clustering index or the block index) during INSERT to
determine the target location of the row. Again, the block index is much smaller, and the height
of the tree is usually shorter, resulting in a faster search.
Determining when to use MDC versus a clustering index
MDC provides huge value over a clustering index because the clustering is guaranteed
and automatic. In general you can achieve cluster ratios with MDC anywhere between
93%-100% depending on the coarsification needed. In contrast, clustering indexes can
cluster data close to 100% initially, but become declustered over time, and might require
a time-consuming REORG to recluster the data. In general, use MDC to create and
maintain data clustering in your database unless:
• MDC would require coarsification and you are unable to add a generated
column to your table.
• The MDC version of the table results in table growth you are unable or unwilling
to incur. Well-designed MDC tables are typically 2-15% larger than non-MDC
tables.
• You find that MDC clustering will give you a lower cluster ratio (for example,
93%) due to coarsification and you are willing to incur the periodic REORG
processing in order to get the improved clustering that can be achieved with a
clustering index.
Use the following MDC design best practices:
• Start your selection for MDC candidates by looking for columns that are used as
predicates for equality, inequality, range, and sorting. To improve roll-in of data,
your dimension should match your roll-in range.
• Strive for density! Remember, an extent is allocated for every existing cell—
regardless of the number of rows in that cell. To leverage MDC with optimal
space utilization, strive for densely filled blocks.
• Constrain the number of cells in an MDC design. Keep the number of cells
reasonably low to limit how much additional storage the table will require when
converted to MDC form. 5% to 10% growth for any single table is a reasonable
goal. (See the discussion on MDC cells in the “Benefits of using MDC” section.)
There are exceptions, where even double the amount of growth is useful, but
they are rare.
Note: Block indexes are usually so small as a percentage of the corresponding
table size that, in most cases, you can ignore the storage required for them.
• Coarsify some dimensions to improve data density. Use generated columns to
create coarsifications of a table column that have much lower column cardinality.
For example, create a column on the month-of-year part of a date column, or use
(INT(colname))/100 to convert a DATE column with the format Y-M-D to Y-M.
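CREATE TABLE Sales (
SALES_DATE DATE, REGION CHAR(12), PRODUCT CHAR(30), -- column types are assumed
MONTH GENERATED ALWAYS AS (INT(SALES_DATE)/100) )
ORGANIZE BY (MONTH, REGION, PRODUCT)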
For the query:
select * from sales where sales_date > '2006/03/03' and …
The compiler generates the additional predicates:
month>=200603 and month<=200701
To reduce wasted space, specify a small table space extent size, which reduces
your MDC Block Size.
• Don’t select too many dimensions. It is very rare to find useful designs that have
more than three MDC dimensions without unreasonable storage requirements.
The more dimensions you have, the faster the number of cells grows, because it is the
product of the dimension cardinalities. This makes it extremely hard to constrain the expansion of the
MDC table to the design goal of approximately 10% (versus a non-MDC table). If
the table expands unreasonably (for example, more than two times its non-MDC
size) not only will you require more storage, but the gains of clustering might be
lost due to the increase in doing I/O on partially filled blocks.
A simple example: Consider a table with three dimensions worth clustering on,
each with 10,000 unique values. If these columns have no correlation between
them, then clustering on all three dimensions without coarsification would result
in 10,000 x 10,000 x 10,000 cells (10^12), with a partially filled block per cell. If each block
is 1MB and the partially filled block is half empty on average, the overhead from this careless
design would be around 500,000 TB!
• Consider single-dimensional MDC. Single-dimensional MDC can still provide
massive benefits compared to those of traditional single dimensional clustered
indexes. The reasons are that:
o Clustering is guaranteed.
o MDC tables are indexed by block and not by row, resulting in indexes that
are roughly 1/1000 the size of traditional row-based indexes.
o DELETE performance using MDC roll-out is improved. RID indexes on
MDC are updated asynchronously with DB2 9.5.
o MDC facilitates roll-in of data.
o Use single-dimensional MDC (with coarsification if needed) to enforce
clustering instead of using a clustering index. Clustering indexes cluster
data on a best effort basis (there are no guarantees of how well they
cluster), and over time, they tend to become unclustered. In contrast
MDC guarantees clustering, avoiding the need to reorganize data. (See
the coarsification example in the “MDC storage scenario” section.)
• Be prepared to tinker (on a test database). It might take trial and error to find an
MDC design that works really well. Use the DB2 Design Advisor with the –m C
option (C for clustering search). You can also use the db2mdcsizer utility, which
determines space requirements and simplifies administration of MDC tables.
This utility is available on AlphaWorks for certain versions of DB2 products.
MDC modifications will not impact your application programs.
• Use the MDC selection capability of the DB2 Design Advisor with a
representative workload to find suitable MDC dimensions for an existing table.
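As an illustration of the single-dimension and coarsification practices above, the following sketch defines a table clustered on one coarsified dimension and then checks how densely its cells are filled. The table and column names are hypothetical; INT(order_date)/100 follows the Y-M-D to Y-M conversion described earlier.

```sql
-- Hypothetical table; ORDER_MONTH coarsifies ORDER_DATE from days to YYYYMM.
CREATE TABLE orders_mdc (
  order_id    BIGINT        NOT NULL,
  order_date  DATE          NOT NULL,
  amount      DECIMAL(12,2),
  order_month INTEGER GENERATED ALWAYS AS (INT(order_date) / 100)
)
ORGANIZE BY DIMENSIONS (order_month);

-- Density check: rows per cell, sparsest cells first.
SELECT order_month, COUNT(*) AS rows_in_cell
FROM orders_mdc
GROUP BY order_month
ORDER BY rows_in_cell;
```

Cells that hold far fewer rows than fit in one extent are the ones that waste space; if many appear at the top of the result, consider a coarser dimension or a smaller extent size.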
Database partitioning (shared-nothing hash
partitioning) best practices
Database partitioning is a technique for horizontally distributing rows in the database
across many database instances that work together to form a single large database server.
These instances can be located within a single server, across several physical machines, or
a combination. In DB2 products, this is called the Database Partitioning Facility (DPF).
Database partitioning allows the DB2 database manager to scale to hundreds of instances
that participate in the larger database system. The scalability of this design can approach
near linear scaleout for many complex query workloads. As such, database partitioning
has become extremely popular for data warehousing and BI workloads due to its near
linear scaleout characteristics and its ability to scale to hundreds of terabytes of data and
hundreds of CPUs. The architecture is less popular for OLTP processing due to the inter-
instance communication incurred on each transaction, which though small, can still be
very significant for short running transactions typically found in OLTP workloads. DPF
might be used for OLTP applications that require a cluster of computers for throughput.
Shared-nothing hash partitioning hashes rows to logical data partitions. The primary
design goal of hash distribution is to ensure the even distribution of data across all
logical nodes (as range partitioning tends to skew data). These partitions might reside
within a single server or be distributed across a set of physical machines, as shown in
Figure 9.
Figure 9 Table hash-partitioning
The scalability of shared-nothing databases has proven to be nearly linear for a wide
range of complex query workloads. Also, the modular nature of the design lends itself to
linear scaleout as storage pressures, workload pressures, or both grow. As a result,
shared-nothing architectures have dominated data warehousing for the past decade.
Database partitioning is implemented without impact on existing application code, and is
completely transparent. Partitioning strategies can be modified online with the
redistribution utility without affecting application code.
The primary design choice is determining which columns to use to hash partition each
table. These columns make up the database-partitioning key. The goals are twofold:
1. Distribute data evenly across database partitions. This requires choosing
partitioning columns that have a high cardinality of values to ensure an even
distribution of rows across the logical partitions.
2. Minimize shipping of data across database partitions during join processing.
Collocation of rows being joined will occur (avoiding movement) if the
partitioning key is included in the join predicates of the WHERE clause.
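The two goals can be sketched as follows. The CUSTOMER and ORDERS tables are hypothetical, and the example assumes both tables are created in the same database partition group; DISTRIBUTE BY HASH is the DB2 9 clause for naming the partitioning key, and DBPARTITIONNUM returns the database partition on which a row is stored.

```sql
-- Hypothetical tables distributed on the join column so that joined rows are collocated.
CREATE TABLE customer (
  cust_id INTEGER NOT NULL,
  name    VARCHAR(60)
) DISTRIBUTE BY HASH (cust_id);

CREATE TABLE orders (
  order_id BIGINT  NOT NULL,
  cust_id  INTEGER NOT NULL,
  amount   DECIMAL(12,2)
) DISTRIBUTE BY HASH (cust_id);

-- Goal 1: check for skew by counting rows on each database partition.
SELECT DBPARTITIONNUM(cust_id) AS db_partition, COUNT(*) AS num_rows
FROM orders
GROUP BY DBPARTITIONNUM(cust_id)
ORDER BY db_partition;

-- Goal 2: the join predicate uses the partitioning key of both tables.
SELECT c.name, SUM(o.amount) AS total
FROM customer c JOIN orders o ON o.cust_id = c.cust_id
GROUP BY c.name;
```

Row counts that differ widely between partitions in the first query indicate that the chosen key does not distribute the data evenly.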
Another central problem in designing shared-nothing data warehouses is determining
the best combinations of memory, CPUs, buses, storage capacity, storage bandwidth, and
networks. How much or how many do you need of each of these?
To help solve this problem, IBM provides the IBM Balanced Warehouse™, which is
based on DB2 database system’s shared nothing architecture. It was developed through
IBM best practices used for successful client implementations.
Balanced Warehouse and Balanced Configuration Units (BCU)
The Balanced Warehouse combines building blocks known as Balanced Configuration
Units (BCU). These building blocks are preconfigured, pre-tested, and tuned for
performance to provide an ideal volume and ratio of system resources. The BCU
combines the best practices for database configuration and hardware components to
greatly simplify warehouse setup and deployment. Scores of best practices for resource
ratios and database configuration have been incorporated into the Balanced Warehouse.
Figure 10 shows the various Balanced Warehouse offerings for 2007 and 2008 [5]. You can
see that the Balanced Warehouse currently offers three classes of offerings, C, D and E.
These three classes offer increasing power and scalability to the solution. The C class is an
entry level offering, intended for SMB markets or systems integrators, that can be
contained in a single server. D and E class offerings scale out to much larger
configurations using DB2 database partitioning capabilities.
[5] For an up-to-date version of the Balanced Warehouse offerings, refer to the Balanced Warehouse web pages online at:
Figure 10 Balanced Warehouse offerings [6], 2007-2008
Use the following database partitioning best practices:
• Select partitioning keys that have a large number of values (high cardinality) to
ensure even distribution of rows across partitions. Unique keys are good
candidates. If you are having a difficult time finding a key that can distribute
data evenly across partitions, you might want to consider using a function on a
• Avoid choosing a partitioning key with a column that is updated frequently; this
could incur additional overhead on the update to repartition the row to another
partition.
• If possible, as your partitioning key, try to choose a column that has a simple
datatype, such as fixed-length character or integer. The hashing performance can
benefit from doing this versus selecting a complex datatype.
• To increase collocation, consider using the join column as the partitioning key for
a table that is frequently joined (provided that the columns have high cardinality
to satisfy the even distribution of rows). Select the minimum number of columns
required to achieve high cardinality and even distribution of rows in the
partitioning key. Reducing the number of columns in the partitioning key
improves the likelihood that the column will be in the join predicates (improving
the odds of collocation).
[6] Prices reflected in the “Estimated Cost” in this table are current as of May 2008, exclude applicable taxes, and are subject to
change by IBM without notice.
• Ensure that unique indexes are a superset of the partitioning key.
• Use replicated MQTs for small tables (tables that are less than 3% of the total
database size, or less than 5% of the largest table size are a reasonable rule of
thumb) or infrequently updated tables in order to:
o Improve collocation and reduce movement over the network
o Assist in the collocation of joins
o Improve performance of frequently executed joins in a partitioned
database environment by allowing the database to manage precomputed
values of the table data.
CREATE TABLE R_EMPLOYEE AS (
SELECT EMPNO, FIRSTNME, MIDINIT, LASTNAME, …
FROM EMPLOYEE ) -- base table name is assumed
DATA INITIALLY DEFERRED REFRESH IMMEDIATE
REPLICATED;
To update the content of the replicated MQT, run the following statement:
REFRESH TABLE R_EMPLOYEE;
Note: After using the REFRESH statement, you should run RUNSTATS on the
replicated table as you would on any other table.
• Collocate the largest dimension-table’s key as the partition key for the fact table,
considering the number of distinct values and skew within the corresponding
• Replicate small (less volatile) dimension tables, where “small” is relative and
depends on the installation’s available storage.
• Replicate a horizontal or vertical subset of dimensions that don’t match the
partitioning key, as follows:
o Partition any remaining dimensions on their PK.
o After creating a replicated table to improve collocation, remember to
collect table and index statistics (or use the DB2 automatic statistics
collection feature). Remember to implement the same indexes on the
replicated MQTs as you have defined on the base table(s).
o Define replicated MQTs as REFRESH IMMEDIATE if they are small and
rarely updated. Try to limit the number of parallel ETL jobs executing
when REFRESH IMMEDIATE is specified. A deferred refresh strategy
provides less overhead for updates of the base table.
• Distribute large tables on several partitions. Small tables with less than one
million rows should be located on one database partition only.
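The advice above about indexes and statistics on replicated MQTs might look as follows for the R_EMPLOYEE table. This is a sketch: the index name is illustrative, the index on EMPNO is assumed to mirror one on the base table, and MYSCHEMA is a placeholder schema. RUNSTATS is a DB2 command rather than an SQL statement, so it is issued through the command line processor.

```sql
-- Mirror an index assumed to exist on the base table (index name is illustrative).
CREATE INDEX r_employee_ix1 ON r_employee (empno);

-- Collect table and index statistics; RUNSTATS needs a schema-qualified table name.
RUNSTATS ON TABLE myschema.r_employee WITH DISTRIBUTION AND INDEXES ALL;
```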
Table (range) partitioning best practices
Table partitioning should be used predominantly to facilitate improved roll-in and roll-out
of data. It enables an administrator to add a large range of data (such as a new month of
data) to a table, en-masse, and perhaps more importantly it allows an administrator to
remove data from a table, or from the database, en-masse, almost in an instant (without
DB2 database systems' unique asynchronous index-cleanup technology means that even
while using global indexes that index data across several range partitions, a range can be
detached from the table, and the index keys associated with that range become
immediately invisible to incoming queries. The keys are subsequently deleted quietly in a
background process with negligible impact to the executing database workload.
Table partitioning also offers side benefits of increased query performance through an
internal process called partition elimination, which, in many cases, enables the query
compiler to select improved execution plans. This is a secondary benefit of table
partitioning.
Furthermore, table partitioning enables the division of a table into several ranges that are
stored in one or more physical objects within a database logical partition. The goal of
table partitioning is to logically organize data to facilitate optimal data access and the
roll-out of data. The division of the table into ranges is transparent to the application, and
can therefore be designed at any point in the application development cycle.
See “Best Practices: Data Life Cycle Management” white paper for more details on table
partitioning. Other attributes and features of table partitioning include the following:
• Each range can be in a different table space
• Ranges can be scanned independently
• Performance for certain BI-style queries is improved through partition
elimination
• New ALTER ATTACH/DETACH statements for easier roll-in and roll-out of
data:
o New ATTACH operation for roll-in
o New DETACH operation for roll-out
• SET INTEGRITY is now online (allowing read/write access to older data)
• For new ranges, ADD plus LOAD operations can be used over ATTACH plus
SET INTEGRITY operations
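A roll-out followed by a roll-in could be sketched as follows. The example assumes that the partitions of the SALES table were given names when the table was created and that SALES_Q1_2007_STAGE is a staging table, already loaded, with the same column definitions as SALES; all of these names are hypothetical.

```sql
-- Roll-out: detach the oldest range into its own table (partition name is assumed).
ALTER TABLE sales DETACH PARTITION q1_2006 INTO sales_q1_2006;

-- Roll-in: attach a loaded staging table as a new range, then validate it.
ALTER TABLE sales ATTACH PARTITION q1_2007
  STARTING ('1/1/2007') ENDING ('3/31/2007')
  FROM sales_q1_2007_stage;
SET INTEGRITY FOR sales IMMEDIATE CHECKED;
```

After the DETACH, the detached rows remain available in SALES_Q1_2006, which can then be archived or dropped independently of the SALES table.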
The following example shows how to define a partitioned table:
CREATE TABLE SALES(SALE_DATE DATE, CUSTOMER INT, …)
PARTITION BY RANGE(SALE_DATE)
(STARTING '1/1/2006' ENDING '3/31/2006',
STARTING '4/1/2006' ENDING '6/30/2006',
STARTING '7/1/2006' ENDING '9/30/2006',
STARTING '10/1/2006' ENDING '12/31/2006');
This statement results in the creation of four table objects, each one of which stores a
range of data, as shown in Figure 8:
Figure 8 Table partitioning by date range
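To illustrate how such a table is used, the following sketch shows a query whose date predicate falls inside a single quarter, which makes it a candidate for partition elimination, and a catalog query that lists the ranges defined for the table. The query itself is illustrative; SYSCAT.DATAPARTITIONS is the catalog view that describes data partitions.

```sql
-- The predicate covers one quarter, so only the matching range needs to be scanned.
SELECT customer, COUNT(*) AS num_sales
FROM sales
WHERE sale_date BETWEEN '4/1/2006' AND '6/30/2006'
GROUP BY customer;

-- List the ranges defined for the table, in order.
SELECT datapartitionname, lowvalue, highvalue
FROM syscat.datapartitions
WHERE tabname = 'SALES'
ORDER BY seqno;
```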
Use the following table partitioning best practices:
• Use table (range) partitioning to rapidly delete (roll-out) ranges of data. Match
range-partitioning periods to roll-in and roll-out ranges. For example, if you
need to roll-in and roll-out data by month, range partitioning by month is a
• Partition on DATE columns. Roll-in and roll-out scenarios are almost always
based on dates. Partition elimination [7] can also improve query execution plan (QEP)
selection, and a significant set of those opportunities are also based on dates.
• Limit the number of ranges. Remember that each range is a table object with a
minimum of two extents. Avoid designs with an excessive number of partitions.
A rule of thumb is at least 50MB of data in each range (several gigabytes of data
per range is best). Make the size of your ranges match the size you typically roll-
[7] Partition elimination improves your SQL workload performance. Partition elimination is a strategy used internally by the query
compiler. The query compiler automatically determines if it can exploit the table partitioning for this purpose. Typically dates can
satisfy the roll-out requirement and often provide partition elimination benefits to many queries.
• When adding new ranges, ADD table partition with a LOAD operation is often
faster than the ATTACH of a partition with subsequent SET INTEGRITY
o The LOAD utility has an option to maintain indexes incrementally, and to
write only a single log record for the event, regardless of how many rows are
inserted into the table. Although the LOAD utility supports concurrent read
access to older data, queries need to be drained.
• Consider separating table partitions in separate table spaces to facilitate backup
and recovery. Table partitions (ranges) can be backed up and restored by table
space.
• Place global indexes, which can be large, in their own individual table space.
Placing all the global indexes in a single table space can impact the elapsed time
of the BACKUP utility (because the index table space can become much larger
than the data table spaces).
• Ensure and maintain the clustering of data by making the range-partitioning key
the leading column in a clustered index (no MDC). Data will not be clustered
properly if your clustered index is not prefixed by your partition key. For
example:
PARTITION BY RANGE (Month, Region)
CREATE INDEX … (Month, Region, Department) CLUSTER
• Use page-level sampling to reduce RUNSTATS time. A sampling rate of 10% to
20% provides good quality statistics with a major performance improvement. For
details, see “Best Practices: Writing and Tuning Queries for Optimal
Performance” white paper.
• Place table partitions in different table spaces; this allows you to backup new
ranges as data is rolled in to the new range, without having to backup the other
partitions. This greatly improves the speed and reduces the size of backup
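Two of the practices above, placing ranges in separate table spaces and sampling at the page level during RUNSTATS, can be sketched as follows. The table, partition, and table space names are hypothetical, and the table spaces TS_Q1 and TS_Q2 are assumed to exist already. RUNSTATS is a DB2 command issued through the command line processor, and TABLESAMPLE SYSTEM requests page-level sampling.

```sql
-- Each range in its own table space (TS_Q1 and TS_Q2 are assumed to exist).
CREATE TABLE sales_by_ts (sale_date DATE, customer INT)
PARTITION BY RANGE (sale_date)
 (PARTITION q1_2006 STARTING ('1/1/2006') ENDING ('3/31/2006') IN ts_q1,
  PARTITION q2_2006 STARTING ('4/1/2006') ENDING ('6/30/2006') IN ts_q2);

-- Page-level sampling at 10%; MYSCHEMA is a placeholder schema.
RUNSTATS ON TABLE myschema.sales_by_ts
  WITH DISTRIBUTION
  TABLESAMPLE SYSTEM (10);
```

With this layout, a backup of TS_Q2 alone captures the newest range without touching the table space that holds the older data.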
UNION All View (UAV) partitioning best practices
Prior to the availability of DB2 9 table partitioning, applications often had a requirement
to partition data by ranges. By creating a table for each range with the appropriate
constraints, DBAs were able to provide a single system view by creating a UAV over
all the tables. For example:
Create Table TestQ1 (dt date);
Alter Table TestQ1
add constraint q1_chk
check (month(dt) in (1,2,3));
Repeat the table create/constraint for each quarter:
Create View Test as
Select * from TestQ1
union all
Select * from TestQ2
union all
Select * from TestQ3
union all
Select * from TestQ4;
Table partitioning provides a single view of the table to the compiler and optimizer. This
allows more aggressive predicate push-down to the different ranges than UAV, and a
more consistent model for partitioning data. Table partitioning is the recommended
method for implementing range-based partitioning for most application requirements.
NOTE: UAVs are not a parallel processing method for dividing work across CPUs. The
DB2 Database Partitioning Facility (DPF) should be used for that purpose (see the
discussion on “Database partitioning”).
As with table partitioning, you can use UAVs to store ranges of data in distinct table
spaces, providing granularity for BACKUP operations (see the discussion on “Table
partitioning”).
The advantages of the UAV design predominantly revolve around the ability to operate
on some ranges independent of others, or to design some ranges with unique attributes.
Conversely, table partitioning provides a homogeneous view of a range-partitioned table.
Although table partitioning is generally preferred, there are advantages to UAVs:
• For replication: Historical tables in UAVs can be compressed. (Use UAVs when
replication is needed on certain ranges of data, while other ranges that do not
require replication can benefit from compression.)
• UAVs allow utility operations (such as REORG and RUNSTATS) to run at a finer
granularity: a utility can operate on the single table that contains a given range.
NOTE: REORG is commonly the most important of these. This is valuable when
ranges change frequently and require reclustering or recompression of a
range. UAVs allow this operation to be performed on the subset of ranges that
require it. DB2 9.5 has automatic dictionary rebuild for table partitioning,
alleviating the need to REORG a new range for compression.
• Heavily used ranges can be isolated into separate tables containing additional
indexes or MQTs to optimize data access.
• UAVs provide end users with a single view of federated data (stored in multiple
IBM or non-IBM databases). A UAV can provide a single view of data across
Table partitioning provides the following advantages over UAV partitioning:
• Preparation time is faster (one table instead of multiple tables in a view)
• Simpler management (one table, not multiple tables)
• Less catalog locking for roll-in and roll-out of ranges
• Unique indexes across all ranges supported
• Better handling of complex queries
• Simpler EXPLAINs (using the explain facility)
Migrating UAVs to table partitioning
The migration of UAVs to table partitioning can be achieved without data movement by
following this procedure:
1. Create a partitioned table with a single dummy partition whose range does
not overlap any existing range. This requires the same page size and extent size.
2. Use ALTER TABLE … ATTACH PARTITION to attach each table in the UAV.
3. Drop the dummy partition.
4. Run SET INTEGRITY after all of the ATTACH operations. To speed up SET
INTEGRITY processing:
a) Drop all indexes.
b) Recreate the indexes after SET INTEGRITY completes.
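The procedure above might look roughly like the following sketch. All object names and date ranges are hypothetical; the member tables SALES_Q1 and SALES_Q2 are assumed to have exactly the same column definitions as the new partitioned table. The statements follow the order of the steps as listed; whether a DETACH is permitted before SET INTEGRITY has run, and where commits are needed, can depend on the DB2 release, so verify the sequence against your own system before relying on it.

```sql
-- 1. Partitioned table with one dummy range that cannot overlap real data.
CREATE TABLE myschema.sales_pt (dt DATE NOT NULL, amount DECIMAL(12,2))
  PARTITION BY RANGE (dt)
  (PARTITION dummy STARTING ('1900-01-01') ENDING ('1900-01-31'));

-- 2. Attach each existing UAV member table as a range; no rows are moved.
ALTER TABLE myschema.sales_pt
  ATTACH PARTITION q1 STARTING ('2008-01-01') ENDING ('2008-03-31')
  FROM myschema.sales_q1;
ALTER TABLE myschema.sales_pt
  ATTACH PARTITION q2 STARTING ('2008-04-01') ENDING ('2008-06-30')
  FROM myschema.sales_q2;

-- 3. Remove the dummy range by detaching it into a table and dropping that table.
ALTER TABLE myschema.sales_pt
  DETACH PARTITION dummy INTO myschema.sales_dummy;
DROP TABLE myschema.sales_dummy;

-- 4. Validate the attached ranges and bring the table out of check-pending state.
SET INTEGRITY FOR myschema.sales_pt IMMEDIATE CHECKED;
```

Once the member tables have been attached they no longer exist as separate tables, so the UAV defined over them has to be dropped or redefined as part of the migration.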
Use the following UAV partitioning best practices:
• Use database partitioning to achieve scalability, rather than UAVs.
• As with table partitioning, use UAVs in order to place ranges of data in distinct
table spaces, improving BACKUP granularity.
Recommendation: Migrate UAVs to table partitioning, taking the following
considerations into account:
• Newly developed applications with range-partition requirements should be
implemented with table partitioning rather than with UAVs, unless you have
strong requirements for one or more of the UAV advantages listed above.
• UNION ALL applications being migrated to table partitions utilizing deep
compression should be implemented with DB2 9.5 in order to benefit from
automatic dictionary compression.
Database partitioning, table partitioning, and MDC in
the same database design best practices
Database partitioning, table partitioning, and MDC can be implemented simultaneously
in the same design.
• Database partitioning can be implemented to help achieve scalability and to
ensure the even distribution of data across logical partitions.
• Table partitioning can be implemented to facilitate query partition elimination
and roll-out of data.
• MDC can be implemented to improve query performance and facilitate the
roll-in of data.
This is a best practice approach for deploying large scale applications.
CREATE TABLE TestTable
(A INT, B INT, C INT, D INT …)
IN Tablespace A, Tablespace B, Tablespace C …
INDEX IN Tablespace B
DISTRIBUTE BY HASH (A)
PARTITION BY RANGE (B) (STARTING FROM (100) ENDING (300))
ORGANIZE BY DIMENSIONS (C,D)
See “Best Practices: Data Life Cycle Management” white paper for more details.
To deploy large scale applications, implement database partitioning, table partitioning,
and MDC in the same database design.
Roll-in and roll-out of data with table partitioning
and MDC best practices
Design your partitioning strategy to use table partitioning for your roll-out strategy and
to use MDC on a single dimension for your roll-in strategy.
For example, if you roll-in daily and roll-out monthly, specify an MDC on day and a
Table Partition Key for month (calculated values are supported).
This approach reduces the number of table partitions and eases the DBA administrative
tasks. It takes advantage of the roll-in features of MDC: reduced index I/O with block
indexes and reduced logging.
See “Best Practices: Data Life Cycle Management” white paper for more details.
Use table partitioning for roll-out, and MDC on a single dimension for roll-in.
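A minimal sketch of the daily roll-in and monthly roll-out example, assuming hypothetical table and partition names. The month key is a calculated (generated) column derived from the date, and the single MDC dimension is the day itself.

```sql
CREATE TABLE myschema.daily_sales (
  sale_date  DATE NOT NULL,
  amount     DECIMAL(12,2),
  -- Calculated month key in yyyymm form: INTEGER(date) yields yyyymmdd.
  sale_month INT GENERATED ALWAYS AS (INTEGER(sale_date) / 100)
)
PARTITION BY RANGE (sale_month)              -- roll-out unit: one month
 (PARTITION m200801 STARTING (200801) ENDING (200801),
  PARTITION m200802 STARTING (200802) ENDING (200802))
ORGANIZE BY DIMENSIONS (sale_date);          -- roll-in unit: one day

-- Monthly roll-out: detach the oldest month into its own table.
ALTER TABLE myschema.daily_sales
  DETACH PARTITION m200801 INTO myschema.daily_sales_200801;
```

Daily roll-in then consists of ordinary inserts or loads of one day's rows, which land in blocks of their own because the day is the clustering dimension.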
Rolling-in large data volumes using table partitioning
Applications that need to roll-in very large data volumes can speed up the table
attachment process by ADDing rather than ATTACHing table partitions, which avoids
the need to execute SET INTEGRITY.
As an alternative to attaching a table as a new partition, you can use
ALTER TABLE … ADD PARTITION to add an empty partition to the partitioned table.
After the empty partition has been added, you can populate it using the LOAD utility
(with read access to older data) or using inserts (logged).
LOAD will help provide superior performance, and can load either from external files or
from a query definition using the “LOAD from cursor” capability.
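The add-then-load approach could be sketched as follows. The names are hypothetical: MYSCHEMA.SALES_PT is assumed to be range partitioned on the DATE column DT, and MYSCHEMA.STAGING_SALES is assumed to hold the new rows. DECLARE CURSOR and LOAD are shown as they would be issued from the command line processor.

```sql
-- Add an empty range for the incoming month (no SET INTEGRITY needed afterwards).
ALTER TABLE myschema.sales_pt
  ADD PARTITION p200901 STARTING ('2009-01-01') ENDING ('2009-01-31');

-- "LOAD from cursor": the source of the load is a query rather than a file.
DECLARE c1 CURSOR FOR
  SELECT dt, amount FROM myschema.staging_sales;

-- Queries keep read access to the previously existing data during the load.
LOAD FROM c1 OF CURSOR
  INSERT INTO myschema.sales_pt
  ALLOW READ ACCESS;
```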
For applications utilizing Deep Compression, DB2 9.5 facilitates this technique for
rolling-in data because it provides Automatic Dictionary Compression, avoiding the
need to REORG in order to compress data.
See “Best Practices: Data Life Cycle Management” white paper for more details.
Use the following roll-in and roll-out best practices:
• Use table partitioning to roll-out large volumes of data.
• Use ALTER TABLE … ADD PARTITION to add an empty partition and populate it
using the LOAD utility when using table partitioning for roll-in of data.
Materialized query table (MQT) best practices
An MQT is a table whose definition is based on the result of a query. The MQT
contains pre-computed results. MQTs are a powerful way to improve response times for
complex queries, especially queries that might require some of the following types of
data or operations:
• Aggregated data over one or more dimensions
• Joins and aggregated data between tables in a group
• Data from a commonly accessed subset of data—that is, from a hot horizontal or
vertical database partition
• Repartitioned data from a table, or part of a table, in a partitioned database
• Replicated MQTs can reduce network traffic for non-partitioned tables in a DPF environment
In addition to speeding up query performance, MQTs can be used on nicknames of
federated data sources to maintain frequently accessed data locally. MQTs can be
maintained with SQL or Q Replication (the system-maintained MQT option for
Federated Nicknames is not supported).
MQTs are completely transparent to applications. Knowledge of MQTs is integrated into
the SQL and XQuery compiler, which determines whether an MQT should be used to
answer all or part of a query. As a result, you can create and drop MQTs, without making
application code changes, much like you can create and drop indexes without making
application code changes.
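To make the transparency point concrete, here is a small sketch of a deferred-refresh MQT over a hypothetical MYSCHEMA.SALES table with REGION and AMOUNT columns. The application query at the end is unchanged; whether it is actually routed to the MQT is the compiler's decision and can be checked with the explain facility.

```sql
-- Pre-computed aggregate over one dimension.
CREATE TABLE myschema.sales_by_region AS (
  SELECT region, SUM(amount) AS total_amount, COUNT(*) AS row_count
  FROM myschema.sales
  GROUP BY region
)
DATA INITIALLY DEFERRED REFRESH DEFERRED;

-- Populate the MQT, then give the optimizer current statistics.
REFRESH TABLE myschema.sales_by_region;
RUNSTATS ON TABLE myschema.sales_by_region WITH DISTRIBUTION;

-- Deferred MQTs are considered for routing only when stale data is acceptable.
SET CURRENT REFRESH AGE ANY;

-- Unchanged application query; it may be answered from the MQT.
SELECT region, SUM(amount) FROM myschema.sales GROUP BY region;
```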
Figure 11 summarizes the characteristics of MQTs according to their refresh type. In the
table, “Optimization” indicates that the DB2 database manager will exploit the deferred
MQT where possible, when it processes a query, whereas, “No optimization” indicates
that the MQT will not be looked at, since it could be arbitrarily stale; that is, the database
manager does not know when the last refresh occurred against the MQT.
Note that MQTs can decrease INSERT performance of the base table.
To assist in problem determination, the DB2 9 explain facility indicates why an MQT was
not chosen for an access path.
Figure 11 Summary of MQT characteristics by refresh type
Use the following MQT design best practices:
• Create an MQT by using the same or higher isolation level that is used by the
queries for which you intend to use the MQT. The isolation levels, in order of
descending restrictiveness, are RR, RS, CS, and UR.
• Focus on frequently-used queries that use a lot of resources. These queries
provide the greatest opportunities for performance gains through MQTs.
• Set a limit on the number of MQTs that you are willing to maintain. There are
two reasons for this:
o Each MQT uses storage space on disk and additional UPDATE
o Each MQT adds complexity to the search for the optimal QEP, increasing
query compilation time.
• Decide on a limit for the amount of disk space available for MQTs. Generally, do
not allocate more than 10% to 20% of the total system storage of a data
warehouse for MQTs.
• Consider indexing the MQTs and execute RUNSTATS after index creation. Try to
create an MQT that is generally useful to multiple queries. Often such an MQT is
not a perfect match for a query and might require indexing. Replicated MQTs
should have the same indexing design as the base table.
• Help the query compiler find matching MQTs. (MQT routing is complex.) Give
the compiler as much information as possible by using the following techniques:
o Keep statistics on the MQTs up-to-date.
o Use RI on foreign key columns in the MQT. (To avoid system overhead,
specify non-enforced RI.) Make FK columns NOT NULL.
o Avoid problematic MQT designs that make routing difficult. Try to
avoid using EXISTS, NOT EXISTS, and SELECT DISTINCT. Unless the
MQT is an exact match for a query, these predicates can make it difficult
for the query compiler to make use of the MQT.
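The non-enforced RI recommendation above could be expressed as an informational constraint like the one below. The table and column names are hypothetical; MYSCHEMA.REGION is assumed to have a primary key on REGION_ID, and MYSCHEMA.SALES.REGION_ID is assumed to be defined NOT NULL.

```sql
-- Informational foreign key: not checked on INSERT/UPDATE/DELETE,
-- but visible to the query compiler for MQT matching.
ALTER TABLE myschema.sales
  ADD CONSTRAINT fk_sales_region
  FOREIGN KEY (region_id) REFERENCES myschema.region (region_id)
  NOT ENFORCED
  ENABLE QUERY OPTIMIZATION;
```

Because the database manager does not check such a constraint, the application or load process remains responsible for keeping the data consistent with it; otherwise queries that rely on it can return incorrect results.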
Post-design tools for improving designs for existing databases
Explain facility best practices
The explain facility can show you whether design features are being used. For example, it
can show you whether indexes are being accessed in a QEP, whether partition
elimination is being used, and whether queries are being routed to MQTs.
Consider the fragment of the QEP shown in Figure 12, produced by the explain facility
for query 20 of TPC-H.
Figure 12 Fragment of QEP for Query 20 of TPC-H
The QEP clearly shows that the information for PARTSUPP requires access to both the
index TPCD.UXPS_PK2KSC and the PARTSUPP table itself. How can you determine the
TPC-H: The TPC Benchmark™H (TPC-H) is a decision support benchmark. It consists of a suite of business oriented ad-hoc
queries and concurrent data modifications. The queries and the data populating the database have been chosen to have broad
industry-wide relevance. This benchmark illustrates decision support systems that examine large volumes of data, execute queries
with a high degree of complexity, and give answers to critical business questions.
Looking at operator (15), you can see that the FETCH operator requires access to the
PARTSUPP table because the index includes PS_PARTKEY and PS_SUPPKEY columns,
but does not include the PS_AVAILQTY column. This strongly suggests that by adding
the PS_AVAILQTY column to this index, you can avoid accessing the PARTSUPP table
in the subplan, thereby improving performance.
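One way to act on this observation is sketched below. It assumes that the existing index is a unique index on (PS_PARTKEY, PS_SUPPKEY), which the text does not state; the new index name UXPS_PK_INCL is hypothetical. For a non-unique index, PS_AVAILQTY would simply be appended to the key columns instead of using INCLUDE.

```sql
-- Carry PS_AVAILQTY in the index so the subplan can be satisfied
-- by index-only access, without a FETCH from PARTSUPP.
CREATE UNIQUE INDEX tpcd.uxps_pk_incl
  ON tpcd.partsupp (ps_partkey, ps_suppkey)
  INCLUDE (ps_availqty);

-- Refresh statistics so the optimizer can cost the new index.
RUNSTATS ON TABLE tpcd.partsupp AND INDEXES ALL;
```

Re-running the explain facility afterwards shows whether the FETCH operator has disappeared from the plan.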
The explain output shown in Figure 13 (from DB2 9.1) indicates which MQTs the
optimizer considered but did not choose for a QEP, and explains why. The reason might
be due to cost, or due to the fact that the MQT is not similar enough to be matched.
explain plan for select c1, count(*) from t1 where c2 >= 10 group by c1;
EXP0073W The following MQT or statistical view was not eligible because one or more data filtering
predicates from the query could not be matched with the MQT: "PKSCHO "."MQT2".
EXP0073W The following MQT or statistical view was not eligible because one or more data filtering
predicates from the query could not be matched with the MQT: "PKSCHO "."MQT3".
EXP0148W The following MQT or statistical view was considered in query matching: "PKSCHO "."MQT1".
EXP0148W The following MQT or statistical view was considered in query matching: "PKSCHO "."MQT2".
EXP0148W The following MQT or statistical view was considered in query matching: "PKSCHO "."MQT3".
EXP0149W The following MQT was used (from those considered) in query matching: "PKSCHO "."MQT1".
Figure 13 Using the explain facility to understand MQT selection
Utilize the explain facility to help understand your design choices.
DB2 Design Advisor best practices
The DB2 Design Advisor is a key feature of the DB2 autonomic computing initiative. It is
a push-button solution: given a workload (user provided or system detected) and,
optionally, a disk constraint, the Design Advisor recommends physical database design
options that are designed to optimize the execution of the workload provided. The
Design Advisor performs extensive “what-if” analysis, data sampling, and correlation
modeling to explore thousands of design permutations, far more than a person could
evaluate by hand.
The Design Advisor has the following capabilities:
o Index selection
o MQT selection
o MDC selection
o Partitioning selection (for database partitioning)
o Industry-leading workload compression
disk constraint: A limit on the amount of disk space the Design Advisor can consider available for adding new design features. For
example, the limit might be 100MB, and that would mean that the new design aspects recommended by the Design Advisor, such
as additional indexes or MQTs, should not consume more than an additional 100MB in total.
Many customers have reported using the Design Advisor to make dramatic
improvements in physical database design, leading to performance improvements of
over five times for individual queries or entire workloads. Of course, you should not
apply the results from the Design Advisor without due consideration.
Figure 14 highlights the benefit of the Design Advisor. In this example, a decision-
support database running the TPC-H workload and data set was created with a
reasonable set of indexes, meaning that a good database designer could have come up
with this set and considered it adequate. The Design Advisor was then used to provide
additional recommendations for the database, which, when applied, resulted in a
six-and-a-half-fold performance gain.
Figure 14 Benefits from DB2 Design Advisor
MDC selection capability of the DB2 Design Advisor
For improved workload performance, use the MDC selection capability of the Design
Advisor to obtain recommended clustering dimensions for use in an MDC table,
including coarsification on base columns. Only single-column dimensions, and not
composite-column dimensions, are considered, although single or multiple dimensions
can be recommended for the table.
The MDC selection capability is enabled using the -m <advise type> flag on the
db2advis utility. The advise types (“C” for MDC and clustering indexes, “I” for index,
“M” for MQT, and “P” for database partitioning) can be used in combination with each other.
The MDC recommendations provided by the Design Advisor are intended to provide
optimized density and to limit the amount of table expansion that will occur when the
table is converted to MDC. The analysis operations within the advisor include not only
the benefits of block-index access, but also the impact of MDC on INSERT, UPDATE, and
DELETE operations against the dimensions of the table.
The output includes generated-column expressions for each table for coarsified
dimensions that appear in the MDC solution, and an ORGANIZE BY clause
recommended for each table.
Use the following Design Advisor best practices:
• Provide a broad representation of your workload as input, and avoid running
the Design Advisor for one query at a time. This allows the Design Advisor to
make recommendations that apply to an entire workload rather than to a single
query, perhaps to the detriment of other parts of the workload.
• Include as input the INSERT, UPDATE, and DELETE operations that occur in
your workload so that the Design Advisor can model the drawbacks and benefits
(of adding new design features) to queries. For example, new indexes have
maintenance drawbacks in addition to their value in improving query execution
• Use the MDC selection capability of the DB2 Design Advisor (on tables that are
greater than 12 extents in size) to obtain recommended clustering dimensions for
use in MDC tables for improved workload performance.
• Use Query Patroller or the DB2 9.5 Workload Manager to automatically capture
your actual workload in a format that serves as input to the Design Advisor.
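As a sketch of the first two practices, a workload input file for the db2advis utility might mix queries with the INSERT and DELETE statements that run alongside them. The file name, table, and frequencies below are hypothetical. The `--#SET FREQUENCY` lines weight the statements that follow them, and the invocation in the trailing comment is illustrative only (database SAMPLE, a 100 MB disk constraint, a 10-minute time limit).

```sql
-- workload.sql: a representative mix, not a single query.
--#SET FREQUENCY 100
SELECT region, SUM(amount) FROM myschema.sales GROUP BY region;

--#SET FREQUENCY 20
INSERT INTO myschema.sales (sale_month, region, department, amount)
  VALUES (1, 10, 7, 99.50);

--#SET FREQUENCY 5
DELETE FROM myschema.sales WHERE sale_month = 1;

-- Possible invocation from the operating system shell, asking for
-- MQT, index, MDC/clustering, and database partitioning advice:
--   db2advis -d sample -i workload.sql -m MICP -l 100 -t 10 -o advice.sql
```

Including the data modification statements lets the advisor weigh index and MQT maintenance cost against query benefit.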
• Choose numeric datatypes over character datatypes whenever possible.
• Use data modeling tools, such as Rational Data Architect, to
publish to a larger team.
• Store value definitions in a table, where the definitions can be
joined to values to provide context.
Table normalization and denormalization
• Normalize your tables using Third Normal Form (3NF) for most
general-purpose databases, the star schema or snowflake model
for dimensional queries, and the IBM Layered Data Architecture
for broad-based data warehousing, online analytical processing
(OLAP), and business intelligence (BI).
• Design a basic set of indexes using workload predicates and
primary keys (PKs) and foreign keys (FKs). Indexes are the single
most important physical database design feature. (Remember
that indexes and Refresh Immediate MQTs incur a penalty for
INSERT, UPDATE, and DELETE operations.)
Data clustering and MDC
• Use MDC to improve query performance, and for roll-in and roll-
out of data.
Database partitioning (shared-nothing hash partitioning)
• Use database partitioning to improve scalability for large BI
• Focus on both high cardinality of the partitioning key and
improved collocation of joins when selecting the partitioning key.
• Use hash-partitioning (recommended primarily for data
warehousing, which benefits from shared-nothing databases).
Table (range) partitioning
• Use range-clustered tables (RCTs) to provide fast, direct access to
• Design table partitions based on roll-in and roll-out
characteristics. Partitioning by month or financial quarter is a
UNION ALL View (UAV) partitioning
• Use UAVs when replication is needed on certain ranges of data,
while other ranges that do not require replication can benefit
from compression. UAVs allow you to have different
characteristics on different objects that underlay the view. In
general, homogeneity provides a cleaner and more maintainable
architecture. However, there are exceptions where this ability to
mix and match is needed.
• Use database partitioning for scalability of decision support,
business intelligence, data warehousing, and reporting
workloads, rather than UAVs.
• Use table partitioning to improve recovery efficiency and roll-out
Roll-in and roll-out of data with table partitioning and MDC
• Use table partitioning for roll-out, and MDC on a single
dimension for roll-in.
Database partitioning, table (range) partitioning, and MDC in the same database design
• Implement database partitioning, table partitioning, and MDC in
the same database design to deploy large scale applications.
• Use replicated MQTs to improve collocation of joins for database
partitioning, and query access to aggregated data.
• Help the query compiler find MQTs by keeping MQT statistics
up-to-date, by defining functional dependencies, and by defining
referential integrity (RI) (including FK columns in the MQT,
defined as NOT NULL). Avoid problematic MQT designs that
make routing difficult by avoiding the use of EXISTS, NOT
EXISTS, and SELECT DISTINCT clauses, unless the MQT is an
exact match for the query.
Post-design tools for improving designs for existing databases
• Use the explain facility to help understand your design choices.
• Use the DB2 Design Advisor to generate ideas for physical
database design improvements (for indexes, MQTs, and
partitioning). When doing so, provide as input a set of queries,
not just one query at a time. This allows the Design Advisor to
make trade-offs across the workload.
• Utilize the DB2 9.5 Workload Manager (WLM), Query Patroller,
Snapshot Scripts, or Statement Event Monitoring to automatically
capture SQL statements for input to the explain facility and to the
DB2 Design Advisor.
Physical database design is the single most important quality of any database. It affects
the scalability, efficiency, maintainability and extensibility of a database like no other
aspect of database administration. Although database design can be complex, a good
design improves performance and reduces operational risk. Mastery of physical database
design is a cornerstone of professional database administration.
• DB2 Best Practices
• DB2 9 for Linux, UNIX and Windows manuals
• DB2 Data Warehouse Edition documentation
• IBM Balanced Warehouse documentation
• IBM Data Warehousing and Business Intelligence documentation
• IBM DB2 9.5 for Linux, UNIX, and Windows Information Center
• S. Lightstone, T. Teorey, T. Nadeau, “Physical Database Design: the database
professional's guide to exploiting indexes, views, storage, and more”, Morgan
Kaufmann Press, 2007. ISBN: 0123693896
• Sam S. Lightstone, “Best Practices for Creating Scalable High Quality Data
Warehouses with DB2”, IBM Information On Demand 2007 Global Conference,
October 14 - 19, 2007. Mandalay Bay Resort, Las Vegas, NV
• T. Teorey, S. Lightstone, T. Nadeau, “Database Modeling & Design: Logical
Design, 4th edition”, Morgan Kaufmann Press, 2005. ISBN: 0-12-685352-5
Kevin L. Beck
Information Management Software
DB2 Information Development
Senior IT Architect
Lead Architect for SAP/DB2 Solutions
DB2 Query Optimization Development
Enterprise Data Management
Consulting IT Specialist
Senior Managing Specialist
North American Lab Services
Chief Architect DB2 LUW
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in other
countries. Consult your local IBM representative for information on the products and services
currently available in your area. Any reference to an IBM product, program, or service is not
intended to state or imply that only that IBM product, program, or service may be used. Any
functionally equivalent product, program, or service that does not infringe any IBM
intellectual property right may be used instead. However, it is the user's responsibility to
evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in
this document. The furnishing of this document does not grant you any license to these
patents. You can send license inquiries, in writing, to:
IBM Director of Licensing
North Castle Drive
Armonk, NY 10504-1785
The following paragraph does not apply to the United Kingdom or any other country where
such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES
CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER
EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-
INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do
not allow disclaimer of express or implied warranties in certain transactions, therefore, this
statement may not apply to you.
Without limiting the above disclaimers, IBM provides no representations or warranties
regarding the accuracy, reliability or serviceability of any information or recommendations
provided in this publication, or with respect to any results that may be obtained by the use of
the information or observance of any recommendations provided herein. The information
contained in this document has not been submitted to any formal IBM test and is distributed
AS IS. The use of this information or the implementation of any recommendations or
techniques herein is a customer responsibility and depends on the customer’s ability to
evaluate and integrate them into the customer’s operational environment. While each item
may have been reviewed by IBM for accuracy in a specific situation, there is no guarantee
that the same or similar results will be obtained elsewhere. Anyone attempting to adapt
these techniques to their own environment do so at their own risk.
This document and the information contained herein may be used solely in connection with
the IBM products discussed in this document.
This information could include technical inaccuracies or typographical errors. Changes are
periodically made to the information herein; these changes will be incorporated in new
editions of the publication. IBM may make improvements and/or changes in the product(s)
and/or the program(s) described in this publication at any time without notice.
Any references in this information to non-IBM Web sites are provided for convenience only
and do not in any manner serve as an endorsement of those Web sites. The materials at
those Web sites are not part of the materials for this IBM product and use of those Web sites is
at your own risk.
IBM may use or distribute any of the information you supply in any way it believes
appropriate without incurring any obligation to you.
Any performance data contained herein was determined in a controlled environment.
Therefore, the results obtained in other operating environments may vary significantly. Some
measurements may have been made on development-level systems and there is no
guarantee that these measurements will be the same on generally available systems.
Furthermore, some measurements may have been estimated through extrapolation. Actual
results may vary. Users of this document should verify the applicable data for their specific
environment.
Information concerning non-IBM products was obtained from the suppliers of those products,
their published announcements or other publicly available sources. IBM has not tested those
products and cannot confirm the accuracy of performance, compatibility or any other
claims related to non-IBM products. Questions on the capabilities of non-IBM products should
be addressed to the suppliers of those products.
All statements regarding IBM's future direction or intent are subject to change or withdrawal
without notice, and represent goals and objectives only.
This information contains examples of data and reports used in daily business operations. To
illustrate them as completely as possible, the examples include the names of individuals,
companies, brands, and products. All of these names are fictitious and any similarity to the
names and addresses used by an actual business enterprise is entirely coincidental.
This information contains sample application programs in source language, which illustrate
programming techniques on various operating platforms. You may copy, modify, and
distribute these sample programs in any form without payment to IBM, for the purposes of
developing, using, marketing or distributing application programs conforming to the
application programming interface for the operating platform for which the sample
programs are written. These examples have not been thoroughly tested under all conditions.
IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these
programs. The sample programs are provided "AS IS", without warranty of any kind. IBM shall
not be liable for any damages arising out of your use of the sample programs.
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International
Business Machines Corporation in the United States, other countries, or both. If these and
other IBM trademarked terms are marked on their first occurrence in this information with a
trademark symbol (® or ™), these symbols indicate U.S. registered or common law
trademarks owned by IBM at the time this information was published. Such trademarks may
also be registered or common law trademarks in other countries. A current list of IBM
trademarks is available on the Web at “Copyright and trademark information” at
Windows is a trademark of Microsoft Corporation in the United States, other countries, or
both.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
Other company, product, or service names may be trademarks or service marks of others.