DB2 on z/OS – A World Class Act!
The Fastest, Most Dependable, Most Secure, Highest Performance, Most Scalable Database on the Planet!
We're Bullish on DB2!
DB2 for z/OS
Database Foundation Stones
Data Integrity
Data Security
Data Recoverability
Data Concurrency
Performance & Scalability
New Features & Combinations Can Undermine These Foundations!
Understanding the Risks Is Critical!
Where Shall We Begin?
Fear of Commitment
Extents Wide Open
DB2 War Stories And Scary Tales By DATABASE BOB Think Tank Publications
Fear of Commitment Chapter 1
Cycling DB2
[Diagram: DB2 stop (or DB2 crash) followed by DB2 start. Start: IRLM, DM, DDF. Recover: unresolved URs. Then apps up. When DB2 does lots of recovery, "apps up" comes later.]
Doesn't deferred restart fix all issues?
Deferred Restart - Gives Consistent Startup Times
Once Upon A Time
In the kingdom of DB2, the time came for a systems downtime. Many tasks had been busy during the week. It came time to cycle DB2. It was cancelled, but was naughty and wouldn’t come down. After two hours, the operator forcefully killed DB2. Maintenance was applied and the system was brought up. DB2 lingered and wouldn’t wake up. After three hours, the operator killed DB2 for a second time. We tried once again to wake up DB2. This time we asked the wizards at IBM how to slay the problem. They declared that we should just let it run. Twelve hours later DB2 came up.
Why did it take so long? And can this happen even today?
What Happened and Why?
[Diagram: a sparsely updating, long-running UR spans DAY1, DAY2 and DAY3 before DB2 is forced. Labels: active logs Log1, Log2, Log3; archive logs A1, A2.]
DB2 Log Records: 1) Do Records 2) Undo Records
DB2 Restart: Recover Incomplete URs, then DB2 Up
A Long Running UR Had to Roll Back Using Archive Tapes & Active Logs
DB2 Restart
[Diagram: two restart timelines. Without deferred restart: start DB2, forward recovery, backward recovery, restart complete, applications access the DB. With deferred restart: restart completes sooner, applications access the DB quicker, and one object is flagged RECP.]
Without Deferred Restart - All Recoveries Must Complete
Deferred Restart Gets the Subsystem Up Quicker
Turning On Deferred Restart
LIMIT BACKOUT Parameter
AUTO – Automatically recovers once DB2 is up
YES – RECOVER POSTPONED command resumes recovery
NO – Process all inflight and inabort URs
BACKOUT DURATION Parameter
log records during backward recovery before deferring
Are These Parameters Set Correctly?
DB2 Restart Questions
Does deferred restart always work?
In rare cases it fails
Won’t deferred restart fix all my problems?
Deferred pagesets still need recovery
What is status of pagesets after restart?
Most pagesets are available
Deferred pagesets are unavailable
What is the exposure?
Applications which use deferred pagesets will fail
How can we detect long running URs?
DB2 log message - DSNR035I UNCOMMITTED UR AFTER ### CHECKPOINTS
Can I automatically cancel long running URs?
NetView can be used to do this
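A minimal Python sketch of how a log scan for long-running URs might look. It relies only on the message ID and the "UNCOMMITTED UR AFTER ### CHECKPOINTS" wording quoted above. The sample log lines, the -DB2A prefix and the threshold are made up for illustration, and a real DSNR035I message carries more fields than this sketch reads.

```python
import re

# Matches the message ID and the checkpoint count quoted on the slide.
# Anything else in the message text is ignored by this sketch.
PATTERN = re.compile(r"DSNR035I.*?UNCOMMITTED UR AFTER\s+(\d+)\s+CHECKPOINTS")


def long_running_urs(log_lines, min_checkpoints=1):
    """Return (checkpoint_count, line) pairs for warnings at or above the threshold."""
    hits = []
    for line in log_lines:
        match = PATTERN.search(line)
        if match and int(match.group(1)) >= min_checkpoints:
            hits.append((int(match.group(1)), line))
    return hits


# Hypothetical log excerpt, not copied from a real system.
sample_log = [
    "DSNR035I -DB2A WARNING - UNCOMMITTED UR AFTER 5 CHECKPOINTS",
    "DSNJ001I -DB2A SOME UNRELATED LOG MESSAGE",
    "DSNR035I -DB2A WARNING - UNCOMMITTED UR AFTER 12 CHECKPOINTS",
]

flagged = long_running_urs(sample_log, min_checkpoints=10)
for count, line in flagged:
    print(count, line)
```

A scan like this only reports. Whether to cancel the offending thread is a separate, site-specific decision.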
Trigger Happy Chapter 2
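The Trigger Concept
[Diagram: without a trigger, Program1 issues SQL1 against the database. With an after trigger defined (Create Trigger), the database invokes the trigger and runs the trigger's SQL as part of the same request.]
Firing A Trigger - About the Cost of A FETCH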
Programs are unaware
Part of UR
Firing cost of a FETCH
Add in trigger SQL
Plus cost of trigger work
Adds to SQL elapsed
Create Trigger SQL1
Once Upon A Time
The king declared that dashboards would help rule the kingdom. This was a daunting task, for many programs had to be changed. Triggers came to the rescue; they were quick and easy. Since one trigger was good, many were even better. They multiplied like rabbits and soon the whole kingdom was full of triggers. The word came down from on high that things were slow. Many programs were dragging but no changes had been made. They noticed that when triggers were added, darkness descended upon the kingdom.
What had gone wrong? And how could it be fixed?
Multiple Triggers
[Diagram: one SQL statement from a program causes the database to invoke a before trigger and two after triggers, all inside that statement's SQL elapsed time.]
All are synchronous
All in UR
Multiple triggers' SQL
Fire one after another
Fired in timestamp order
Serially add to elapsed time
Multiple Triggers Run in Sequence
Triggers & Stored Procedure
[Diagram: program SQL causes the database to invoke an after trigger, which invokes a stored procedure. Labels: DB2MSTR, DDF, SQL Elapsed Time.]
Triggers & Stored Procedures
SP program load time
SP execution time
Can make calls outside DB2
Greatly extends total times
Stored Procedures Add Significant Overhead to Triggers!
Stored Procedure w/Transition Tables
[Diagram: program SQL causes the database to invoke an after trigger, which invokes a stored procedure. Labels: MSTR, DDF, SQL Elapsed Time.]
Using Transition Tables
Create table time
Use table time
SP program load time
SP execution time
Calls outside DB2 time
Adds significantly to times
Transition Tables Add Significant Overhead
How Expensive Are Triggers?
Fire Trigger – Cost of a FETCH
+ Trigger SQL – Cost of SQL
+ WHEN – Invoked every time trigger event happens
+ Transition Variables – Cost of transition table
+ Invoke Stored Procedure – DDF, start thread
+ Resident Stored Procedure – work in SP
+ Non-Resident Stored Procedure – start SP + work in SP
The Costs Add Up!
Statement Triggers – Cheapest
Row Triggers – Cheap
SP Triggers – Expensive
SP Triggers w/Trans Vars – Priceless
We Used All of the Expensive Options
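The additive cost list can be sketched as a toy Python model. The slide gives only the ordering (cheapest to "priceless"), so every number below is a made-up relative unit and `trigger_cost` is a hypothetical helper, not a DB2 measurement.

```python
# Hypothetical relative cost units. Only the ordering comes from the slide.
FETCH = 1.0             # firing a trigger costs about one FETCH
TRIGGER_SQL = 2.0       # assumed cost of the SQL inside the trigger
TRANSITION_TABLE = 3.0  # assumed cost of creating and using a transition table
SP_INVOKE = 5.0         # assumed cost of invoking a stored procedure
SP_WORK = 4.0           # assumed cost of the work done inside the procedure
SP_START = 8.0          # assumed extra cost of starting a non-resident procedure


def trigger_cost(fires=1, transition_table=False, stored_procedure=False, resident=True):
    """Total added cost for one triggering statement that fires the trigger `fires` times."""
    per_fire = FETCH + TRIGGER_SQL
    if transition_table:
        per_fire += TRANSITION_TABLE
    if stored_procedure:
        per_fire += SP_INVOKE + SP_WORK
        if not resident:
            per_fire += SP_START
    return fires * per_fire


rows_updated = 100  # a row trigger fires once per affected row
statement_trigger = trigger_cost(fires=1)
row_trigger = trigger_cost(fires=rows_updated)
sp_trigger = trigger_cost(fires=rows_updated, stored_procedure=True)
sp_trans_trigger = trigger_cost(
    fires=rows_updated, transition_table=True, stored_procedure=True, resident=False
)

print(statement_trigger, row_trigger, sp_trigger, sp_trans_trigger)
```

Because every component is added to the elapsed time of the triggering SQL, the options compound rather than substitute for one another.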
Generally Poor Reasons to Use Triggers
Just because they’re quick
Lazy man’s solution
Easier than changing programs
Temp fixes that become permanent
For data replication
To populate summary tables
To enforce simple value constraints
To enforce RI constraints
To maintain dashboards (oops!)
Misusing Triggers Can Impact Performance!
Modifying Triggers
How Triggers Are Maintained
One Would Think: DROP TRIGGER, CREATE TRIGGER, Update TRIGGER, Refresh TRIGGER
The Reality: DROP & CREATE TRIGGER
Triggers - Easy to Create, Can Be Tough to Drop!
Trigger – Firing Sequence
Order of Creation is Firing Order
Original firing order:
1  Trigger 1  2007-01-01
2  Trigger 2  2007-01-15
3  Trigger 3  2007-01-30
Firing order after DROP & CREATE of Trigger 2 on 2007-02-15:
1  Trigger 1  2007-01-01
2  Trigger 3  2007-01-30
3  Trigger 2  2007-02-15
Is Sequence Important?
CREATEDTS in SYSIBM.SYSTRIGGERS: “It is also used to order the execution of multiple triggers.”
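The firing-sequence example can be replayed with a short Python sketch. It models only the rule stated on the slide (creation timestamp decides firing order). The dictionary and the `firing_order` helper are illustrative stand-ins, not a catalog query.

```python
from datetime import date


def firing_order(triggers):
    """Return trigger names ordered by creation timestamp, oldest first."""
    return [name for name, created in sorted(triggers.items(), key=lambda item: item[1])]


# Creation dates from the slide.
triggers = {
    "Trigger 1": date(2007, 1, 1),
    "Trigger 2": date(2007, 1, 15),
    "Trigger 3": date(2007, 1, 30),
}
before = firing_order(triggers)

# Changing Trigger 2 means DROP & CREATE, which gives it a new creation timestamp.
del triggers["Trigger 2"]
triggers["Trigger 2"] = date(2007, 2, 15)
after = firing_order(triggers)

print(before)
print(after)
```

If the sequence matters, every later trigger may have to be dropped and recreated as well to restore the intended order.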
Who Is Aware of Triggers?
Yes - DB2
Should Be - DBAs
Maybe – Programmers
NO – DB2 Utilities
NO – Applications
NO – SQL
NO – DB2 Optimizer
NO – Explain
NO – Resource Limit Facility
NO – Constraints
Triggers Work Outside Programs & Triggering SQL
Triggers - A Run-Time Event
Invisible Program Dependencies
[Diagram: programs A, B, C, D ... issue SQL. The SQL fires a trigger (boxes labeled TT and AT), which invokes a stored procedure that calls outside DB2. Address spaces shown: MSTR, DBM1, DDF.]
Dependencies Make Administration Challenging!
Invisible Causes of Breakage
Calls outside DB2
RI & check constraints
Deadlocks & timeouts
Any Break Causes All Triggering SQL to Fail!
Invisible to Programs
Trigger Traps
The Scenario
1) Update trigger on TableA
Starts resident stored procedure (SP-X)
Inserts before image into log – TableB
2) DBA adds column to TableA
3) Days Later - SQL updating TableA starts failing
Why?
When DDF reloaded the resident SP-X, the transition variables no longer matched the trigger and SP-X. The trigger had to be dropped & recreated to correct this. SP-X had to be changed to include the new column in TableA.
Corrective Actions
1) Drop the trigger (may require down-time)
2) Drop the stored procedure
3) Add column to parameter lists & SQL
4) Recreate the stored procedure
5) Recreate the trigger
DBAs Must Know Interdependencies to Avoid Trigger Traps & Outages!
Rotate Roulette Chapter 3
The ROTATE Concept
Before ROTATE:
Part 1  Oldest  Limitkey A
Part 2  Old     Limitkey B
Part 3  New     Limitkey C
Part 4  Newer   Limitkey D
ROTATE DDL Command
1) Delete Oldest Partition Rows
2) Reuse Oldest Partition with a New Last Part Limitkey
After ROTATE:
Part 1  Newest  Limitkey E
Does This Match Your Needs?
Rotating Partitions
ALTER TABLE table ROTATE PARTITION FIRST TO LAST ENDING AT ( limitkeys ) RESET;
Simple, Beautiful & Non-Disruptive(?)
ROTATE In Action
[Diagram: four physical partition datasets, catg.DSNDB.db.ts.I0001.A001 through A004. Before the ROTATE DDL, logical partitions LP 1 to LP 4 line up with physical partitions 1 to 4. After the ROTATE, physical partition 1 is the last logical partition (LP 4) and physical partitions 2, 3 and 4 are LP 1, LP 2 and LP 3.]
SYSTABLEPART (V8) records the logical partition of each physical partition.
After Rotate - Logical & Physical Parts Don't Match!
A Series of ROTATEs
SYSTABLEPART Maps Logical Partitions
Physical partitions P1 to P4 (datasets catg.DSNDB.db.ts.I0001.A001 to A004) start with limitkeys 'A', 'B', 'C', 'D'.
1st Rotate: ALTER TABLE ROTATE ... ENDING AT ('E') RESET;
2nd Rotate: ALTER TABLE ROTATE ... ENDING AT ('F') RESET;
3rd Rotate: ALTER TABLE ROTATE ... ENDING AT ('G') RESET;
4th Rotate: ALTER TABLE ROTATE ... ENDING AT ('H') RESET;
Physical partition holding each logical partition (L1 = first, L4 = last):
              L1  L2  L3  L4
Start         P1  P2  P3  P4
1st Rotate    P2  P3  P4  P1
2nd Rotate    P3  P4  P1  P2
3rd Rotate    P4  P1  P2  P3
4th Rotate    P1  P2  P3  P4
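The series of rotates can be traced with a small Python simulation. It is a toy model of the logical-to-physical mapping only. The deque, the `limitkeys` dictionary and `rotate_first_to_last` are illustrative names, and nothing here touches a real catalog.

```python
from collections import deque


def rotate_first_to_last(logical_to_physical, limitkeys, new_limitkey):
    """Model ROTATE PARTITION FIRST TO LAST: the physical partition behind the
    first logical partition is reused as the last one with a new limitkey."""
    oldest = logical_to_physical.popleft()
    logical_to_physical.append(oldest)
    limitkeys[oldest] = new_limitkey
    return oldest


# Position i holds the physical partition number behind logical partition i + 1.
logical_to_physical = deque([1, 2, 3, 4])
limitkeys = {1: "A", 2: "B", 3: "C", 4: "D"}

history = []
for key in ["E", "F", "G", "H"]:
    reused = rotate_first_to_last(logical_to_physical, limitkeys, key)
    history.append(list(logical_to_physical))
    print(f"ENDING AT ('{key}') reused physical part {reused}; "
          f"first logical part is now physical part {logical_to_physical[0]}")
```

After four rotates of a four-partition table the logical and physical numbers line up again, but in between they differ.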
Once Upon A Time
Version 8 was up and running well. The call came to convert to table based partitioning and reuse the oldest partition. The ROTATE command was chosen to do this non-disruptive deed. Suddenly the phone began to ring and thick darkness covered the database cubicles. A quick check revealed that two parts were mired in REORP status. User processing ground to a halt and workloads were in peril. The database guardians countered with a concurrent REORG to fix the problem. This crashed and burned. A SHRLEVEL NONE REORG was called upon. When it finished, the sun came out and life was good again.
What happened to disrupt the peace of this database kingdom?
Why Did ROTATE Set REORP?
INSERT INTO tableX (D_YEAR) VALUES ('2008')
Index Based Partitioning (Table_IBP), Last Part Limitkey ('2007'): SQLCODE = 0
Table Based Partitioning (Table_TBP), Last Part Limitkey ('2007'): SQLCODE = -327
Maximum limitkey in the last partition is not enforced by "index based partitioning".
The 1st time only, ROTATE converts index based partitioning to table based partitioning. Because the limitkey is not enforced in index based partitioning, the 1st and last parts have to be put in REORP status to eliminate this potential issue.
Don't Convert to Table Based Partitioning Using ROTATE!
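The difference between the two partitioning styles can be sketched as a toy Python model. It assumes ascending keys and models only the single rule from the slide (the last limitkey is enforced under table based partitioning and not under index based partitioning). `insert_sqlcode` and `rows_beyond_limitkey` are hypothetical helpers.

```python
SQLCODE_OK = 0
SQLCODE_OUT_OF_RANGE = -327  # key is beyond the last partition's limitkey


def insert_sqlcode(key, last_limitkey, table_based):
    """SQLCODE for inserting `key`, assuming ascending limitkeys."""
    if table_based and key > last_limitkey:
        return SQLCODE_OUT_OF_RANGE
    return SQLCODE_OK


def rows_beyond_limitkey(keys, last_limitkey):
    """Keys an index based table may already hold past its last limitkey."""
    return [key for key in keys if key > last_limitkey]


ibp_result = insert_sqlcode("2008", "2007", table_based=False)
tbp_result = insert_sqlcode("2008", "2007", table_based=True)
stragglers = rows_beyond_limitkey(["2005", "2007", "2008", "2009"], "2007")

print(ibp_result, tbp_result, stragglers)
```

Rows like `stragglers` are the potential issue: once the limitkey becomes enforced, they no longer belong in the last partition, which is why the affected parts go into REORP.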
Converting to Table-Controlled Partitioning
ALTER INDEX clustering_index NOT CLUSTER; (conversion to table-controlled partitioning)
COMMIT WORK;
ALTER INDEX clustering_index CLUSTER; (clustering index reestablished)
It’s Simple to Do This Before Rotate
Limitkey of Last Partition Will Be Converted to Extreme Value!
Lowering Limitkeys in Last Part
Conversion to Table Based Partitioning Causes the last part limitkey to be MAXVALUE
ALTER TABLE table ALTER PARTITION # ENDING AT ( limitkeys ); (This & next partition put in REORP, data outage!)
REORG TABLESPACE tablespace SCOPE PENDING SHRLEVEL NONE STATISTICS COPYDDN (Data keys beyond limitkey, discarded during REORG!)
You Will Have An Outage to Alter Limitkey
Which Partition Number is Used?
-DIS DB(db) SPACENAM(ts)
NAME     TYPE PART  STATUS
SRG9700  TS   0002  RW
 -THRU        0004
SRG9700  TS   0001  RW
[Table: Cmd/DDL/Utility against Logical Part or Physical Part, one X per row. Rows: Order of Display, -DISPLAY DB, DB2 Datasets, Recover/Rebuild, Rebalance, Reorg, Load, Unload, Image Copy, Other ALTERs, ALTER ROTATE. The column of each X is not recoverable from the extracted text.]
You Must Know Which Part To Use!
Rotating Logical Partitions
[Diagram: six numbered steps, some marked Advisable and one marked Data Outage. ?PP is the physical part number of the 1st logical part. The statements appear below in extraction order, which may not be step order.]
UNLOAD TABLESPACE tablespace PART ?PP
LOAD TABLESPACE tablespace PART ?PP REPLACE using dummy SYSREC
-START DB(db) SPACENAM(ts) PART(?PP) ACCESS(FORCE)
COPY TABLESPACE tablespace DSNUM ?PP SHRLEVEL CHANGE (rotated partition now recoverable)
ALTER TABLE table ROTATE PARTITION FIRST TO LAST ENDING AT ( limitkeys ) RESET; (1st LP data deleted, becomes last LP)
SELECT PARTITION, LOGICAL_PART FROM SYSIBM.SYSTABLEPART WHERE DBNAME = 'db' AND TSNAME = 'ts' AND LOGICAL_PART = 1 (finds the physical part ?PP of the 1st LP)
Avoiding Data Outages
New Tables – Use “table based partitioning”
Last Partition – Don’t set max/min limitkey (may cause -327 SQLCODEs)
Converting from “Indexed Based Partitioning”
Don’t convert to table based partitioning with ROTATE
Use …ALTER INDEX index NOT CLUSTER
Then …ALTER INDEX index CLUSTER
Plan for Outage on 1 st ROTATE
Query for values beyond limitkey before reorg
ALTER ASC/DESC limitkey from max/min value
Downtime REORG to remove REORP status
Know Logical Partitions Prior to Rotate
Mitigating Rotate Issues Outage
Knowing which physical part is 1st logical part
Long running DELETEs to empty 1 st logical part
(42 secs to ROTATE / delete 1,000,000 row partition)
ROTATE can cause an outage (REORP status)
(convert to table based partitioning or ALTER limitkeys)
Which part # to use for Commands / DDL / Utilities
Mistakenly rotating the wrong table
(DDL reuse or finger fault)
Adding partitions to ROTATEd tables
(Confusion factor on first & last parts)
With ascending keys, trying to insert null keys
Recoverability after a ROTATE
Attempting to REBALANCE ROTATEd tables
The Rotate Questions
Which partition will rotate next?
1 st Logical partition - query SYSIBM.SYSTABLEPART
Does rotate interrupt availability?
Yes – if indexed based partitioning (convert to table based, last 2 partitions in REORP)
Yes – If last part limitkey is altered
No – table based partitioning & limitkey doesn’t need to be altered
Can rotate be blocked?
Set MAXVALUE in limitkey of last partition
Should we use rotate to convert to Table Based Partitioning?
Not advisable (last 2 partitions in REORP)
Does rotate behave differently with Index Based Partitioning?
Yes - the first time
Does rotate rename datasets?
No – does SQL deletes instead
Does rotate delete SYSCOPY entries?
No – puts a rotate row in SYSCOPY
If I accidentally rotate, can I recover?
No – rotate row in SYSCOPY prevents recovery prior to ROTATE
Extents Wide Open Chapter 4
The Extent Concept
[Diagram: dataset catg.DSNDB.db.ts.I0001.A001 spread across volumes VOL001, VOL002 and VOL003 as 1st, 2nd and 3rd extents, each obtained by a 1-track request. Extent consolidation is also shown.]
DB2 requests space from z/OS, which finds blocks of space and updates the VTOC and Catalog.
Getting Allocations Is Relatively Slow
Once Upon A Time
Life in the database kingdom was good. Autonomic features had eliminated many servant duties. One day a troubled user called. A table was broken and needed to be recovered. Luckily it only had five million rows. We proclaimed that it would be back in mere moments. Tapes were mounted, disks were spinning and the clock was ticking. Five minutes turned to ten and then to fifteen. After many discussions with management, the recovery finally finished.
Why did a small table take so long to recover?
Extent Evolution
[Diagram: 20th century DASD (cylinder, track, block) compared with a 21st century DASD subsystem; z/OS rules compared with DASD reality.]
Constrained By z/OS Rules, Not DASD Reality!
Logical Extent Limits
Max Extents / Dataset
255 extents z/OS 1.6
7,257 extents z/OS 1.7
Max Extents / Volume
123 extents / volume
Large DASD increase this issue (Mod 27s & 54s)
5 pieces primary extent
Whatever can get on secondary
We Must Follow z/OS Rules!
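The two limits on the slide are related by simple arithmetic, sketched here in Python. The 59-volume ceiling per data set is not on the slide; it is brought in as an assumption to show where 7,257 comes from. `volumes_needed` is a hypothetical helper giving a lower bound only.

```python
EXTENTS_PER_VOLUME = 123  # from the slide
MAX_VOLUMES = 59          # assumption: volume limit for a single data set

max_extents = EXTENTS_PER_VOLUME * MAX_VOLUMES


def volumes_needed(extents):
    """Fewest volumes that can hold this many extents at 123 extents per volume."""
    return -(-extents // EXTENTS_PER_VOLUME)  # ceiling division


print(max_extents)
print(volumes_needed(255))
print(volumes_needed(max_extents))
```

The per-volume limit is the one that larger volumes make easier to hit, since more of a data set's growth stays on one volume.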
Solving Systemic Extent Issues
Tolerate More Extents
z/OS 1.7 – 7,257 extents
Make It Harder to Hit Limits
SMS Extent Consolidation
Automate Extent Management
V8 Sliding Secondaries
Some Problems Still Exist!
How Extents Affect Utilities (V8)
[Chart: elapsed seconds for utilities against a table with 1.2 M rows and 1 index, by number of extents. Data labels as extracted: 16 25 63 256 118 276 305 115 60 45.]
Extents Affect Writing Utilities!
Avoiding Extent Issues
STOGROUPs by size
z/OS Slow Allocations
Use sliding secondaries
Can cause fragmentation
STOGROUPs by size
From z/OS We Need
Faster Allocation Search
Faster Cataloging z/OS
Do Excellent Work
Want to Hear More?
Part 2 Is Up To You!
Please Fill Out Your Evaluations
Session: A11 DB2 War Stories and Scary Tales (Part 1) Thanks for Coming...