DB2 Performance Tuning for Dummies (non-DBAs)
A few weeks ago, our product team reached out to me with a slow-performing query. The query was taking several minutes to execute. The
underlying table had approximately 700 million records. There were no joins involved, but the query was still very slow. In fact, all queries running
on this database were extremely slow.
After analyzing the database and making a few simple changes, things started improving. The query execution time came down to < 1 second.
I'm blogging these changes so other members of the team can also benefit from my experience! By the way, I'm not a DBA by any means.
However, I've spent a good 6-7 years of my career developing the DB2 run-time. Please feel free to challenge or correct the information if you
don't agree with it.
My special thanks go to the following members of the DB2 community: Shang-Min Wei, Ioana M. Delaney, David (D.)
Kalmuk and Guy M. Lohman.
Tuning the database and instance for better performance:
The following tips will help you tune your database for better performance. Database and instance tuning is typically needed only
once. You don't need to tweak these settings to improve individual query performance. However, you should have a rough idea of the
categories of queries that will be run on your database.
Tip #1: Check your system hardware, especially the kind of storage it's using.
You can use the DB2_PARALLEL_IO registry variable to force DB2 to use parallel I/O for table spaces that only have one container, or
for table spaces whose containers reside on more than one physical disk (which is the case if the container resides on a RAID 5 or a
RAID 6 device). If this registry variable isn't set, the level of I/O parallelism used is equal to the number of containers used by the table
space. Therefore, if a table space spans three containers and the DB2_PARALLEL_IO registry variable hasn't been set, the level of I/O
parallelism used is 3.
If you have multiple disks, you can increase performance by doing parallel I/O. Here's how you enable your DB2 instance for parallel
I/O:
db2set DB2_PARALLEL_IO=*:some#
where some# is the number of disks in the LUN.
For example, if you have SAN storage, each LUN can have up to 16 disks. In that case you'd run:
db2set DB2_PARALLEL_IO=*:16
Check with your system admin. If you have only one disk, parallel I/O is not possible, and there's no need to set this parameter.
Tip #2: Memory allocation for shared sort, sort heap and buffer pool (CRITICAL)
Find out how much system memory you have and how much of it can be dedicated to the DB2 instance. If you have the luxury of increasing
memory, here's the math to decide how much you need.
2.1) Compute (or project) the size of your database. You can use this command to find the current size:
db2 "call get_dbsize_info(?,?,?,-1)"
E.g., in our case the size of the database was 510 GB. If you're expecting to add more data, don't forget to add that!
2.2) Size of your shared sort (SHEAPTHRES_SHR) should be 15% of the size of the database.
For example: size of db = 510 GB
Size of shared sort = 510 * 15/100 = 76 GB approx.
2.3) Size of sort heap (SORTHEAP) = 20% of the size of shared sort.
Size of shared sort = 76 GB approx.
Size of sort heap = 76 * 20/100 = 15 GB approx.
2.4) Size of your buffer pool should be the same as the size of your shared sort, so in this case you need approx.
76 GB for the buffer pool.
Total memory you need = 2.2 + 2.3 + 2.4 + a 20% buffer
= 76 + 15 + 76 + 20% of (76 + 15 + 76) = 167 + (20 * 167/100)
= 167 + 33 = 200 GB approx.
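The sizing rules above can be sketched as a small helper. This is a rough sketch of the rules of thumb from this post (15% shared sort, 20% sort heap, buffer pool sized like shared sort, plus 20% headroom), not official DB2 guidance:

```python
def size_memory(db_gb):
    """Rough memory sizing (in GB) from the rules of thumb above.

    shared sort = 15% of database size
    sort heap   = 20% of shared sort
    buffer pool = same as shared sort
    total       = sum of the three, plus a 20% buffer
    """
    shr_sort = db_gb * 0.15        # SHEAPTHRES_SHR
    sort_heap = shr_sort * 0.20    # SORTHEAP
    buffer_pool = shr_sort         # buffer pool sized like shared sort
    subtotal = shr_sort + sort_heap + buffer_pool
    total = subtotal * 1.20        # add 20% headroom
    return shr_sort, sort_heap, buffer_pool, total

# For a 510 GB database this lands near the numbers above:
shr, heap, bp, total = size_memory(510)
print(round(shr), round(heap), round(bp), round(total))  # roughly 76, 15, 76, 202
```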
What if you don't have that much memory? Well, we had the same situation on some of our development instances. We only had 100 GB
of memory in total.
So here's the math I used. Note: our queries do a lot of sorting, so I couldn't compromise on the sort heap size.
If you don't do much sorting, you should allocate more to the buffer pool and less to shared sort.
So this is what I did:
I kept 40 GB for the buffer pool,
gave 40 GB to shared sort,
and needed 8 GB for my SORTHEAP (0.20 * 40).
So 40 + 40 + 8 = 88,
which left me with 12 GB as a buffer.
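When memory is capped, the same ratios can be rearranged. Here's a sketch that mirrors the 100 GB split above (it assumes the buffer pool equals shared sort because our workload was sort-heavy; adjust the split if yours isn't):

```python
def split_capped_memory(total_gb, shr_sort_gb):
    """Back out the rest of the split once shared sort is chosen.

    Assumes buffer pool == shared sort (sort-heavy workload) and
    sort heap == 20% of shared sort; whatever is left is headroom.
    """
    buffer_pool = shr_sort_gb
    sort_heap = shr_sort_gb * 0.20
    used = shr_sort_gb + buffer_pool + sort_heap
    headroom = total_gb - used
    return buffer_pool, sort_heap, used, headroom

# The 100 GB example: 40 GB shared sort -> 40 + 40 + 8 used, 12 left over
bp, heap, used, left = split_capped_memory(100, 40)
print(bp, heap, used, left)  # 40 8.0 88.0 12.0
```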
Now, how do you configure DB2 with these changes?
How to set SHEAPTHRES_SHR
To set SHEAPTHRES_SHR to a total of 76 GB, you'll have to compute the number of pages. Note: the page size is 4 KB. Therefore you need
to compute how many 4 KB pages you can have.
Convert 76 GB to KB first, then divide by 4: (76 * 1000 * 1000)/4
= 19,000,000
UPDATE DATABASE CFG USING SHEAPTHRES_SHR 19000000;
Remember to do a db2stop/db2start to make the changes effective.
How to set SORTHEAP
To set SORTHEAP to a total of 15 GB, you'll have to compute the number of pages again. Same math:
Convert 15 GB to KB first, then divide by 4: (15 * 1000 * 1000)/4
= 3,750,000
UPDATE DATABASE CFG USING SORTHEAP 3750000;
Remember to do a db2stop/db2start to make the changes effective.
How to alter your buffer pool
Read the following article to understand your table spaces and buffer pools:
http://www.ibm.com/developerworks/data/library/techarticle/0212wieser/
Once you know how your table spaces are laid out, find the relevant buffer pool where all the key tables reside.
Here's the command to find key info about your buffer pools:
SELECT SUBSTR(BPNAME,1,10) AS BPNAME,SUBSTR(NPAGES,1,5) AS NPAGES,PAGESIZE,ESTORE,
NUMBLOCKPAGES,BLOCKSIZE,SUBSTR(NGNAME,1,5) AS NGNAME FROM SYSCAT.BUFFERPOOLS;
If you want to increase the size of your buffer pool, here's how you can alter it.
For example, if you want to increase the 32K buffer pool and set it to 75 GB, use the same math as for the sort heap.
If you're altering the 32K buffer pool, each page is 32 KB. Therefore the total number of pages would be 75 GB / 32 KB = 2343750.
alter bufferpool "BP32K" immediate size 2343750 ;
I would recommend doing an activate on your database after changing the buffer pool.
db2 activate db <db_name>
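All three conversions above (SHEAPTHRES_SHR, SORTHEAP, and the 32K buffer pool) follow the same GB-to-pages math, which is easy to wrap in a helper. A sketch, using the same decimal convention as this post (1 GB = 1,000,000 KB):

```python
def gb_to_pages(gb, page_kb=4):
    """Convert a size in GB to a DB2 page count.

    Uses 1 GB = 1,000,000 KB, matching the arithmetic in this post.
    page_kb is the page size in KB (4 for the default page size,
    32 for a 32K buffer pool).
    """
    return gb * 1_000_000 // page_kb

print(gb_to_pages(76))      # 19000000 -> SHEAPTHRES_SHR
print(gb_to_pages(15))      # 3750000  -> SORTHEAP
print(gb_to_pages(75, 32))  # 2343750  -> 32K buffer pool
```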
Tip #3: Leveraging multiple CPUs (CRITICAL)
If you have multiple CPUs on your system (you can use nproc on UNIX to check), you can tune DB2 to perform better using these additional
configuration parameters.
I'm including links to help you understand how/why these parameters are important.
3.1) Enable intra-partition parallelism (INTRA_PARALLEL):
http://db2commerce.com/2012/03/21/parameter-wednesday-dbm-cfg-intra_parallel/
UPDATE DBM CFG USING INTRA_PARALLEL YES;
3.2) Specify the degree of intra-partition parallelism (DFT_DEGREE):
The default value is 1. A value of 1 means no intra-partition parallelism. A value of -1 (or ANY) means the optimizer determines the
degree of intra-partition parallelism based on the number of processors and the type of query. We're using ANY, but that may not be the
best option for every system. This article might help in deciding the correct value: http://db2commerce.com/2012/03/21/parameter-wednesday-dbm-cfg-intra_parallel/
UPDATE DATABASE CFG USING DFT_DEGREE ANY;
Tuning the tables and queries for better performance:
Database tuning just gets you a kick start. For optimal performance, you have to pay attention to the schema as well as the queries. Here are
a few tips for designing your tables and queries for good performance.
Tip #4: Choose your data type and column length carefully
If your data values are only 8 characters long, you shouldn't be creating columns with VARCHAR(64). It's true that DB2 doesn't pad the
columns with extra characters when storing the data. However, when it comes to index generation (for row-organized tables) or
compression/encoding (for column-organized tables), padding is done to accommodate values of the maximum length. In the case of BLU, the
compression/encoding only works well if either your data is numeric or your data length is < 64 characters. If your table contains large
columns, you might be better off going with a row-organized table.
Disclaimer: we evaluated BLU with DB2 v10.5.0.1, build "s130816". The DB2 group is aware of the BLU performance issue with large columns,
so if you're using a later release, do give BLU a shot.
Tip #5: Design your indexes smartly (applies only to row-organized tables)
Creating an index on every column in the table is a very bad idea. For good performance, your table should have a maximum of 4
indices. Consider creating composite indices or clustered indices when possible. Here are some ideas with examples.
1) Use a composite index to match the maximum number of columns. The leading columns of the index should match the columns
used in the WHERE clause; otherwise the access plan won't pick the index.
For example, the query is "select A, B from tab_name where A='Test' AND B='Blue'".
In this case, creating a composite index on A, B (in that order) gives you the best performance.
CREATE INDEX I1 ON tab_name(A,B) ALLOW REVERSE SCANS PAGE SPLIT SYMMETRIC COLLECT SAMPLED
DETAILED STATISTICS COMPRESS NO INCLUDE NULL KEYS
2) Include additional columns to avoid I/O to the data pages.
If only one or two additional columns are required beyond the columns in a composite index, those columns should be
considered for inclusion in the composite index. This removes the I/O requirement, hence improving the performance of the query.
For example, the query is "select A, B, C, D from tab_name where A='Test' AND B='Blue'".
In this case, creating a composite index on A, B (in that order) and including C, D will give you the best performance.
CREATE INDEX I1 ON tab_name(A,B,C,D) ALLOW REVERSE SCANS PAGE SPLIT SYMMETRIC COLLECT SAMPLED
DETAILED STATISTICS COMPRESS NO INCLUDE NULL KEYS
I would recommend reading this article for other techniques:
http://www.toadworld.com/platforms/ibmdb2/w/wiki/6624.designing-composite-indexes.aspx
NOTE: After creating/altering indices, you must run RUNSTATS on the tables with all indexes.
Tip #6: Gather distribution statistics on columns commonly included in predicates
(the WHERE clause)
If your columns are not evenly distributed (for example, a column of a 1-million-row table that has 1,000 distinct values, one of which
appears in 900,000 of the 1 million rows), it's a wise decision to collect distribution statistics. The following article will help you
understand what kinds of statistics are available and why they're so important:
http://www.ibm.com/developerworks/data/library/techarticle/dm-0606fechner/
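To see why this matters, consider the skewed column above. Without distribution statistics, the optimizer assumes values are spread evenly across the distinct values, so its row estimate can be off by orders of magnitude. A toy illustration of the arithmetic (not actual optimizer code):

```python
# Skewed column from the example above: 1,000 distinct values in a
# 1-million-row table, with one value covering 900,000 rows.
total_rows = 1_000_000
distinct_values = 1_000
rows_for_hot_value = 900_000

# Without distribution stats, the optimizer assumes uniformity:
uniform_estimate = total_rows // distinct_values
print(uniform_estimate)       # 1000 rows estimated

# With frequency stats (NUM_FREQVALUES) it knows the real count:
print(rows_for_hot_value)     # 900000 rows actual

# The uniform estimate is off by a factor of 900 for the hot value,
# which can easily push the optimizer toward the wrong access plan.
print(rows_for_hot_value // uniform_estimate)  # 900
```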
In our case, most queries made use of 4 columns and used equality operations, e.g. instanceName_up='BARACK OBAMA'.
Therefore I chose to collect frequency statistics (NUM_FREQVALUES). We also had a few range-bound queries, so I decided to also
collect quantile statistics for the same columns.
Here's the sample SQL I used:
RUNSTATS ON TABLE "ADS"."SIRE1.1"
ON COLUMNS (
("object/instanceName_up","object/typeType","subject/instanceName_up","subject/typeType") ) WITH
DISTRIBUTION ON COLUMNS ( "object/instanceName_up" NUM_FREQVALUES 15 NUM_QUANTILES 25 ,
"object/typeType" NUM_FREQVALUES 15 NUM_QUANTILES 25 , "subject/instanceName_up" NUM_FREQVALUES 15
NUM_QUANTILES 25 , "subject/typeType" NUM_FREQVALUES 15 NUM_QUANTILES 25 ) AND SAMPLED DETAILED
INDEXES ALL ALLOW WRITE ACCESS
TABLESAMPLE BERNOULLI ( 40.0 ) REPEATABLE ( 50 ) UTIL_IMPACT_PRIORITY 50 ;
Once you've tuned the database and the schema, you should experience a drastic improvement in the performance of all your queries.
The final task left for you is to write efficient queries.
I'll cover that in the next blog if folks are interested.
Hope you find this article useful!