Breaking the DB2 Platform Barrier
Breaking the DB2 Platform Barrier Document Transcript

  • 1. Breaking the DB2 Platform Barrier An Examination of the Architectural Differences of DB2 UDB for z/OS vs. DB2 UDB for Linux/Unix/Windows written by Jim Wankowski Quest Software, Inc. White Paper
  • 2. © Copyright Quest® Software, Inc. 2006. All rights reserved. This guide contains proprietary information, which is protected by copyright. The software described in this guide is furnished under a software license or nondisclosure agreement. This software may be used or copied only in accordance with the terms of the applicable agreement. No part of this guide may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying and recording for any purpose other than the purchaser's personal use without the written permission of Quest Software, Inc. WARRANTY The information contained in this document is subject to change without notice. Quest Software makes no warranty of any kind with respect to this information. QUEST SOFTWARE SPECIFICALLY DISCLAIMS THE IMPLIED WARRANTY OF THE MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. Quest Software shall not be liable for any direct, indirect, incidental, consequential, or other damage alleged in connection with the furnishing or use of this information. TRADEMARKS All trademarks and registered trademarks used in this guide are property of their respective owners. World Headquarters 5 Polaris Way Aliso Viejo, CA 92656 www.quest.com e-mail: info@quest.com U.S. and Canada: 949.754.8000 Please refer to our Web site for regional and international office information. Updated—October 10, 2006 WPD_BreakingDB2PlatfBarrier_101606_AG
  • 3. CONTENTS
    INTRODUCTION
    BASIC COMPONENTS
        Installation
        System Catalog
        Accessing DB2
    TERMINOLOGY DIFFERENCES
        Defining the System Catalog
    STORAGE MANAGEMENT
        z/OS
            Volume
        LUW
            Container
    OBJECT COMPARISONS
        Bufferpools
            z/OS
            LUW
            Self-Tuning Memory
        Databases
        Tablespaces
            z/OS
            Types of z/OS Tablespaces– GUIDELINES
        Explain Processing
            z/OS
            LUW
        Parallelism
        Types of Parallelism
            Performance Monitoring
        Utilities
  • 4. Backup and Recovery
            Backups
            Recovery Info
  • 5. INTRODUCTION
    The popularity of DB2 Universal Database (UDB) running on distributed platforms continues to grow, and that growth has resulted in a shortage of experienced non-mainframe DB2 database administrators (DBAs). IT departments today have to deal with tightening budgets and shrinking staffs. The luxury of being a single-platform DBA is becoming a thing of the past. Many DB2 mainframe DBAs find themselves supporting DB2 on these distributed platforms, which means a steep learning curve. It is essential for the DB2 DBA of the new millennium to be well versed in running DB2 on multiple platforms.
    This paper is geared toward any DB2 DBA responsible for supporting DB2 on multiple platforms, whether you're a Z Series Operating System (z/OS) DBA with little or no knowledge of distributed platforms or a distributed DBA with little or no knowledge of z/OS. It covers the basic terminology for the different platforms and how it differs, as well as the key architectural differences and administrative issues. These topics are based on DB2 UDB for z/OS V8 and DB2 UDB for Linux/Unix/Windows (LUW) V9.1.
  • 6. BASIC COMPONENTS
    Most of the object types and functionalities are very similar across the platforms. The object types are essentially identical from the database (DB) down, but differ significantly with regard to storage management. This is covered in more detail in the storage management section.

    Z/OS            LUW
    Subsystem       Instance
    VCAT/Volume     Container
    Stogroup        N/A
    Database        Database
    Tablespace      Tablespace
    Creator         Schema
    Table           Table
    Index           Index
    View            View
    Packages        Packages
    Plans           N/A
    Figure 1: Basic components of databases

    Installation
    When selecting the actual software to install for DB2, there is basically one choice for z/OS. For the LUW environments, there are several editions to choose from. The Enterprise Server Edition (ESE) is the most common. When installing DB2 in either environment you can choose between a standard environment and a partitioned environment. The concept of partitioning is essentially the same on both platforms in that you are harnessing the processing power of multiple processors; however, the actual architecture is quite different. This is discussed in more detail later in this paper.

    Z/OS: DB2 UDB for z/OS V8
    UNIX, WINDOWS: DB2 UDB for LUW V9.1
    Editions:
      • Data Warehouse Edition
      • Enterprise Server Edition
      • Workgroup Edition
      • Express Edition
      • Personal Edition
      • Universal Developers Edition
      • Personal Developers Edition
    Figure 2: Editions of DB2
  • 7. System Catalog
    The system catalogs are very similar between the platforms; however, where they reside and how they are accessed are quite different. To extract information from the z/OS catalog, you typically run queries directly against the appropriate tables. In distributed environments, there is a series of views defined with the schema SYSCAT for retrieving catalog information and a series of updateable views with the schema SYSSTAT for updating optimizer-related statistics in the catalog. How the catalogs are defined and where they reside is discussed in the upcoming architectural overview section.

    Z/OS:
      • SYSIBM.xxxx
      • Most optimizer-related fields are updateable
    LUW:
      • SYSIBM.xxxx
      • SYSCAT
        o Read-only views defined for catalog base tables
      • SYSSTAT
        o Updateable set of views
        o Primarily used for access path manipulation
    Figure 3: System Catalogs

    Accessing DB2
    IBM provides a core set of tools with the database. DB2 for z/OS is accessed through a supplied application called DB2 Interactive (DB2I). This facility provides basic functionality for running queries, issuing commands, generating utilities and preparing programs. DB2 on distributed platforms comes with a graphical user interface (GUI) toolset called Control Center and Health Center. Control Center provides basic functionality for rudimentary tasks within DB2. Health Center allows you to set up monitoring parameters for autonomic computing. Both Control Center and Health Center can connect to z/OS as a separate add-on.

    Z/OS:
      • DB2I
        o DB2 tool set (3270 based)
        o SPUFI
        o DCLGEN
        o Bind/Rebind
        o Command Processor
        o Utilities
        o Defaults
      • Control Center
      • Health Monitor
    LUW:
      • Control Center
        o GUI tool set for administration
      • Command center
      • Command line processor
      • Command window
      • Script center
      • Visual Explain
      • Health Center
        o Facilitates autonomic computing
    Figure 4: Accessing DB2
  • 8. TERMINOLOGY DIFFERENCES
    When dealing with a multi-platform DB2 environment, you will encounter common terminology with completely different meanings. It is important to note the platform when referring to these terms.

    SMS
      Z/OS: System Managed Storage (SMS). Software for managing all the Direct Access Storage Devices (DASDs) in an S/390 environment. A storage group in an SMS environment is typically defined with a volume list containing an asterisk (*). This in turn allows SMS to decide which volume to put the data set on.
      LUW: System Managed Space (SMS). This is a tablespace (TS) allocation parameter. See the tablespace section for more details.
    Extent
      Z/OS: Physical extension of a data set based on a secondary allocation. This is not limited to DB2 data sets. When defining any type of data set in z/OS, a secondary value is specified. When the data set's primary allocation becomes full, an extension is taken with the amount of space specified on the secondary allocation.
      LUW: A block of data pages that gets allocated based on the EXTENTSIZE parameter of the tablespace definition. See the tablespace section for more details.
    Figure 5: Common terms, different meanings

    The installation of DB2 in the z/OS environment is known as a subsystem. When a subsystem is created, four system databases are created, bufferpools (BPs) are defined, and there is one configuration file called DSNZPARM. There are typically many databases defined within a subsystem. All applications running in this subsystem share system resources such as the catalog, databases, BPs, etc.
    The installation of DB2 on the distributed platforms is called an instance. Notice that there aren't any system-type objects defined at this time. This is where the architecture really becomes different.

    Z/OS:
      • Subsystem – Logical database environment
      • Four databases created
        o DSNDB06
        o DSNDB01
        o DSNDB04
        o DSNDB07
      • Database Configuration
        o DSNZPARM
      • Many databases
    LUW:
      • Instance – Logical database server environment
      • Also referred to as a NODE
      • One to many databases
      • Database Manager Configuration File
      • Memory Structures
    Figure 6: Different Terms, Similar Meaning
  • 9. Defining the System Catalog
    z/OS
    As mentioned above, the z/OS system catalog for a subsystem is defined within the system database DSNDB06 when DB2 is installed. This catalog contains the metadata for all objects defined within the subsystem.
    LUW
    For distributed environments, a system catalog is defined for every database created.

    Z/OS: One common system catalog for all databases defined within a subsystem
      • DSNDB06 – Catalog database
    LUW: Three system tablespaces are created by default for every database
      1. SYSCATSPACE – Contains system catalog tables
      2. TEMPSPACE – Holds temp tables used by UDB
      3. USERSPACE1 – Contains user tables unless a tablespace is specified (like DSNDB04)
    Figure 7: Overview of System Catalogs

    The pictorial view in Figure 8 depicts the significant differences between the platforms. DB2 for z/OS is a "share everything" type of architecture vs. a "share nothing" architecture on the distributed platforms. Each database defined within an instance has its own system catalog, bufferpools, log files and configuration file. An instance is very similar to a subsystem on z/OS.

    [Diagram. z/OS: subsystems DB2PROD (databases PRODDB1, PRODDB2) and DB2TEST (databases TESTDB1, TESTDB2), each subsystem with one DSNZPARM, one Catalog, one Log and one set of BPs shared by its databases. LUW: Instance_1 (PRODDB1, PRODDB2) and Instance_2 (TESTDB1, TESTDB2), each instance with one DBM CONFIG, and each database with its own Catalog, DBCONFIG, Log and BPs.]
    Figure 8: Pictorial overview of architectures
  • 10. STORAGE MANAGEMENT
    One of the biggest differences between the platforms is storage management. The physical devices for z/OS are known as volumes. The corresponding storage unit on distributed platforms is the container. A container in itself is not a physical device but a representation of how the space is defined. DB2 on distributed platforms allows for three different ways of defining containers, depending on the type of tablespace defined.

    z/OS
    Volume
    Physical storage device for DB2 z/OS. A volume can contain one or many tablespaces or indexspaces (ISs) as well as non-DB2 files.
      • Terminology
        o DASD – Direct Access Storage Device
          - Logical disk drives
        o VolSer – Volume serial
          - This is a name identifying the disk pack, e.g. DB2001
        o Storage Group
          - DB2 object
          - A logical grouping of volumes
          - Can be used by more than one TS or IS
          - N/A on LUW

    LUW
    Container
    A container applies to a single tablespace only. A tablespace can have multiple containers if it is defined as Database Managed Space (DMS). The way a container can be defined depends on how the tablespace is defined:
      • SMS managed
        o Directory name
          - D:MYTS
      • DMS managed
        o Raw device
          1. E:
        o File name
          1. D:SODADBSODA.DATA.DMS
    Determining when to choose SMS or DMS is discussed in further detail in the tablespace section.
  • 11. OBJECT COMPARISONS
    This section discusses the differences in object architecture across the platforms.

    Bufferpools
    Bufferpools are areas of virtual storage where DB2 keeps data pages so that queries can be satisfied without physical I/O to the DB2 tables. The goal is to keep as many frequently accessed data pages in memory as possible. Bufferpools are by far the most critical memory area for performance on all platforms of DB2. A tremendous boost in performance can be obtained with properly tuned bufferpools.

    z/OS
    In z/OS V8, you have the option of allocating up to 50 4K bufferpools and 10 each of 8K, 16K, and 32K bufferpools. At a minimum, you have to allocate one 4K bufferpool (BP0), one 8K (BP8K0), one 16K (BP16K0), and one 32K (BP32K). There are quite a few customizable thresholds within z/OS bufferpools that allow you to tailor a bufferpool to a specific application profile, such as random vs. sequential access, frequent updates, etc. Some companies actually change their bufferpool settings throughout the day to adjust for process changes. Refer to the DB2 Administration Guide for details on how to tune these parameters.

    LUW
    An LUW database must have at least one bufferpool. A default bufferpool (IBMDEFAULTBP) is created automatically when a new database is created. A series of hidden bufferpools is also created in 4K, 8K, 16K, and 32K page sizes. These exist in case a normal bufferpool of the required page size is unavailable due to insufficient memory, or the normal bufferpool is not active for some reason. These hidden bufferpools do not appear in the system catalog or bufferpool system files. A new auto-tune feature in V9.1 allows DB2 to size your bufferpools automatically based on transaction load. This takes a lot of the guesswork out of tuning your bufferpools.

    Self-Tuning Memory
    DB2 UDB for LUW V9.1 introduced the concept of self-tuning memory. The feature simplifies the task of managing package cache, sort heap, bufferpools, and locklist. When the database_memory parameter is specified as AUTOMATIC, DB2 dynamically adjusts these memory areas based on database workload. The parameter can be left on or, once a typical workload has been established, the values can be frozen by turning off the automatic setting.
  • 12. Z/OS:
      • 80 bufferpools available
        o 50 4K
        o 10 each 8K, 16K, 32K
        o Become active when space is assigned
      • Defined at install time in DSNZPARMS
      • Highly configurable
      • Shared by all objects in subsystem
    LUW:
      • Defined within a database
      • Can only be used by objects within database
      • Defined via Data Definition Language (DDL)
      • Memory size is only configuration
      • Number limited by amount of memory available
    Figure 9: BP Overview

    Databases
    The database is the highest level in the relational hierarchy and can be described as the "wrapper" which identifies a grouping of tablespaces, tables, indexes, views, etc. One of the biggest architectural differences here is that z/OS has a single catalog containing all information regarding all databases in the subsystem, whereas in LUW a new catalog is created for each database and contains only information about that specific database.

    Z/OS:
      • Logical grouping of DB2 objects
        o Does not consume resources
      • Many DBs in subsystem
      • Metadata for all databases stored in one system catalog
    LUW:
      • Logical grouping of DB2 objects
      • Typically one database/instance
      • More like a z/OS subsystem
      • Catalog for each database defined within database
        o SYSCATSPACE
        o TEMPSPACE
        o USERSPACE
      • Bufferpools defined in database
      • Database configuration file
    Figure 10: Database Overview
  • 13. Tablespaces
    z/OS
    Tablespaces in z/OS are generally segmented or partitioned. Four types of tablespaces can be defined:
      • Simple
      • Segmented
      • Partitioned
      • Large (DSSIZE)
    Two types of allocation methods exist for z/OS:
      • Virtual Catalog (VCAT)
      • Stogroup
    The VCAT method is very seldom used other than for defining system catalogs. It requires that the Virtual Storage Access Method (VSAM) data set be pre-defined via an IDCAMS job before the tablespace is created. Stogroup-defined tablespaces let DB2 do all the VSAM allocation work for you: it runs the IDCAMS job under the covers when the create statement is executed. When a tablespace is created, a VSAM file is defined with the following format:
      • VCAT.DSNDBC.DBNAME.TSNAME.I0001.A001
      • VCAT.DSNDBD.DBNAME.TSNAME.I0001.A001
    where:
      • VCAT – Typically the subsystem name
      • DBNAME – Database name
      • TSNAME – Tablespace name
      • A001 – Partition or data set number (A001, A002, etc.)

    Types of z/OS Tablespaces
    Simple Tablespaces
    This was the only type of tablespace available in the early releases of DB2. Simple tablespaces are of limited use: if you have multiple tables in a tablespace, rows from different tables share pages, so you have the potential for lock contention. In the event of a lock on TableA, you will also be locking pages that hold rows from TableB and TableC. And in the event of a TS scan you must scan the data pages for all three tables. Support for simple tablespaces will be dropped in V9, as they really aren't used much anymore. Segmented tablespaces offer superior performance.
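The naming format above can be sketched as a small helper. This is a minimal illustration, not DB2 code: the function name `vsam_dataset_names` and the sample values are hypothetical, the 8-character qualifier check reflects the general z/OS data set naming rule, and the sketch only handles data set numbers that fit the A001 style shown in the text.

```python
def vsam_dataset_names(vcat, dbname, tsname, dataset_number=1):
    """Return the two VSAM names (DSNDBC and DSNDBD) for one data set.

    Hypothetical helper: it only assembles the name format quoted in the
    text, VCAT.DSNDBx.DBNAME.TSNAME.I0001.Annn.
    """
    if not 1 <= dataset_number <= 999:
        raise ValueError("this sketch only covers A001 through A999")
    qualifiers = [vcat.upper(), dbname.upper(), tsname.upper()]
    for qualifier in qualifiers:
        # Each z/OS data set qualifier is at most 8 characters long.
        if not 1 <= len(qualifier) <= 8:
            raise ValueError(f"qualifier {qualifier!r} must be 1-8 characters")
    suffix = f"I0001.A{dataset_number:03d}"
    vcat_q, db_q, ts_q = qualifiers
    return (
        f"{vcat_q}.DSNDBC.{db_q}.{ts_q}.{suffix}",
        f"{vcat_q}.DSNDBD.{db_q}.{ts_q}.{suffix}",
    )


# Example with made-up names: second partition of tablespace SALESTS.
for name in vsam_dataset_names("DB2P", "SALESDB", "SALESTS", 2):
    print(name)
```

Being able to derive the name from the database and tablespace name is what makes it practical to locate the underlying data sets for a given object.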
  • 14. Characteristics of a simple tablespace
      • One to many tables
      • Smallest unit of recovery is the tablespace
    [Diagram: tablespace SIMPTS1, in which each page holds intermixed rows from TableA, TableB and TableC.]
    Figure 11: Simple tablespace

    Segmented Tablespace in z/OS
    Segmented tablespaces allow for efficient access of data, particularly when multiple tables are defined in a tablespace. The data for each table is physically separated into segments. A segment is a block of 4 to 64 pages, similar to an extent in UDB. I/O performance in general is much better with a segmented tablespace than with a simple one.
    Characteristics of a segmented tablespace:
      • Can contain multiple tables, but rows are not commingled.
      • Space is divided into groups of pages called segments.
      • Segsize = 4 to 64 pages each.
      • Each segment contains rows for only one table, and each table can have a different locking strategy.
      • Segmented tablespaces read only the relevant pages during a TS scan.
      • Automatically reclaim space after drop table.
      • Much more efficient for mass deletes.
    [Diagram: a grid of segments, each holding data for exactly one of TableA, TableB or TableC.]
    Figure 12: Segmented tablespace
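To make the scan difference concrete, the following sketch counts the pages a tablespace scan would touch under each layout. It is an illustration under stated assumptions: each box in Figure 12 is treated as one full segment, the SEGSIZE of 4 is an arbitrary value within the 4 to 64 range, and the function names are hypothetical.

```python
# Segment owners read row by row from Figure 12 (assumed: one box = one segment).
SEGMENT_OWNERS = list("AAABBB" "AACCAA" "CABACB")


def pages_read_simple(total_pages):
    """Simple tablespace: rows are intermixed, so a scan reads every page."""
    return total_pages


def pages_read_segmented(segment_owners, segsize, table):
    """Segmented tablespace: a scan reads only segments owned by the table."""
    if not 4 <= segsize <= 64:
        raise ValueError("SEGSIZE must be between 4 and 64 pages")
    owned_segments = sum(1 for owner in segment_owners if owner == table)
    return segsize * owned_segments


SEGSIZE = 4  # hypothetical choice
total_pages = SEGSIZE * len(SEGMENT_OWNERS)
for table in "ABC":
    segmented = pages_read_segmented(SEGMENT_OWNERS, SEGSIZE, table)
    simple = pages_read_simple(total_pages)
    print(f"Table{table}: {segmented} pages segmented vs {simple} pages simple")
```

With these assumptions a scan of TableC touches 16 of the 72 pages, whereas the same data in a simple tablespace would require reading all 72.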
  • 15. Partitioned Tablespace
    Partitioned tablespaces are typically used for very large tables. Only one table can be defined in a partitioned tablespace. A partitioned tablespace can have up to 254 partitions. Each partition is a separate physical data set that can be placed on a different volume for optimal performance. Partitioning allows for easier maintenance such as loads, copies, or reorgs because each partition can be acted on independently.
    Characteristics of a partitioned tablespace:
      • One table per tablespace.
      • Each partition is a separate data set.
      • One to 254 partitions.
      • Each partition can be on a separate volume.
      • Data placement is controlled by the partitioning index.
      • Partition independence allows utilities to be run on individual partitions.
      • Query parallelism.

    Tablespace Partts1:
    Partition   Stogroup   Contents                  Volume
    Part1       SG1        Sales Data for Jan-Apr    DB2VOL1
    Part2       SG2        Sales Data for May-Aug    DB2VOL2
    Part3       SG3        Sales Data for Sep-Dec    DB2VOL3
    Figure 13: Partitioned tablespace
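The key-range placement in Figure 13 can be sketched as a lookup against the upper limit of each partition. This is only an illustration of the idea, not how DB2 implements it: the month-number key, the names `LIMIT_KEYS` and `partition_for_month`, and the use of a binary search are all assumptions made for the example.

```python
from bisect import bisect_left

# Highest month number stored in each partition, following Figure 13:
# Part1 = Jan-Apr, Part2 = May-Aug, Part3 = Sep-Dec.
LIMIT_KEYS = [4, 8, 12]
PARTITIONS = [
    ("Part1", "SG1", "DB2VOL1"),
    ("Part2", "SG2", "DB2VOL2"),
    ("Part3", "SG3", "DB2VOL3"),
]


def partition_for_month(month):
    """Return (partition, stogroup, volume) for a sales row's month number."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    # The first partition whose limit key is >= the month holds the row.
    return PARTITIONS[bisect_left(LIMIT_KEYS, month)]


for month in (1, 4, 5, 12):
    print(month, partition_for_month(month))
```

Because every row maps to exactly one partition, a utility such as a reorg or copy can be run against Part2 alone without touching the January to April or September to December data.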
  • 16. LUW
    One type of tablespace
    Three categories:
      • Regular
      • Temporary
      • Long
    Two allocation methods:
      • SMS
        o Directory only
      • DMS
        o File
        o Device
    SMS – System Managed Space lets the operating system allocate space for the table as needed. No space parameters are specified. This is the easiest method as far as storage management is concerned. It is good for small tables or tables which grow for short periods and shrink back down.
    DMS – Database Managed Space requires the specification of space when the tablespace is created. This space is immediately taken and reserved for use by the tablespace.
    The type of tablespace chosen depends on the characteristics of the data stored within the tablespace. While DMS tablespaces clearly provide more flexibility for storage capacity, SMS tablespaces are generally recommended for temporary tablespaces and catalog tablespaces.
    In addition to understanding the types of tablespaces, it is important to understand how data is managed within the tablespace. Tablespaces can be allocated in 4K, 8K, 16K, and 32K pages. Row size, random vs. sequential access, and several other factors must be evaluated to determine the optimal page size for the tablespace. Pages are grouped into allocation units called extents. Each time the tablespace needs additional storage, the extent size determines how much is allocated. During insert activity, DB2 writes to a container until the current extent is full; at that point, another extent is allocated and the write activity continues.
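The relationship between page size, extent size and allocated space can be worked through with a little arithmetic. The sketch below is a rough sizing aid under simplifying assumptions: it ignores per-page and per-row overhead, the default extent size of 32 pages is just an example value, and `extents_needed` is a hypothetical name.

```python
VALID_PAGE_SIZES_KB = (4, 8, 16, 32)


def extents_needed(data_bytes, page_size_kb=4, extent_size_pages=32):
    """Return (pages, extents) required to hold data_bytes of row data.

    Simplified: real pages carry header and slot overhead, so actual
    requirements are somewhat higher than this estimate.
    """
    if page_size_kb not in VALID_PAGE_SIZES_KB:
        raise ValueError("page size must be 4, 8, 16 or 32 KB")
    if extent_size_pages < 1:
        raise ValueError("extent size must be at least one page")
    page_bytes = page_size_kb * 1024
    pages = -(-data_bytes // page_bytes)      # ceiling division
    extents = -(-pages // extent_size_pages)  # space is taken a whole extent at a time
    return pages, extents


pages, extents = extents_needed(1_000_000, page_size_kb=4, extent_size_pages=32)
print(pages, "pages in", extents, "extents")
```

Because space is taken one whole extent at a time, a large extent size on a small table can leave most of the last extent unused, which is one of the factors to weigh when choosing the value.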
  • 17. Tables
    The biggest difference in the table definitions between z/OS and LUW is the way index definitions are handled. In z/OS, there are no predefined indexspaces as there are in LUW.

    Z/OS:
      • One to many tables defined in simple or segmented tablespaces
      • Tables and indexes are independent of each other
    LUW:
      • One to many tables can be defined within a tablespace
      • Indexspace directly tied to table definition and can exist in same tablespace
    Figure 14: Table comparison

    Indexes
    In z/OS, when the CREATE INDEX statement is executed, the same rules apply as when creating a tablespace. You can specify either VCAT or storage-group definition. The underlying VSAM file is then created.
    When creating a table in LUW, you must have a tablespace predefined for both the table and any indexes you might add to the table. The indexspace specification is part of the table definition. Therefore all indexes for the table use the same indexspace. When using SMS-managed tablespaces, the indexspace has to be the same as the tablespace.

    Z/OS:
      • Unique
      • Non-unique
      • Clustering*
      • Partitioning
      * Non-partitioned TS only
    LUW:
      • Unique
      • Non-unique
      • Clustering
      • Multi-Dimensional (V8)
    Figure 15: Index comparison
  • 18. Index Structures
    z/OS
      • Each index has its own VSAM data set
      • Indexspace created when CREATE INDEX is executed
        o No CREATE INDEXSPACE DDL like tablespaces
        o Only one index per indexspace
        o VSAM data set name can be a little cryptic for indexes
            VCAT.DSNDBC.DBNAME.IXNAME.I0001.A001
            VCAT.DSNDBD.DBNAME.IXNAME.I0001.A001
          where:
            o VCAT = Typically the subsystem name
            o DBNAME = Database name
            o IXNAME = 8-character representation of IX name
            o A001 = Data set number (A001, A002, etc.)
      • Two types of allocation methods
        o VCAT
        o STOGROUP
    LUW
      • Indexes are dependent on tables. Indexspace must be specified when the table is created.
        o All indexes for a table use one tablespace
        o Indexspace is predefined before indexes are created
        o Indexes can be defined in the same tablespace as the table (required for SMS)
  • 19. PARTITIONING
    Partitioning is the process of breaking large volumes of table data into multiple parts based on a key range. This provides multiple benefits. First, it provides manageability, in that the partitions are independent of one another when it comes to maintenance such as reorgs, runstats, copy, etc. Second is the performance benefit of being able to place the partitions on different I/O devices as well as access data from multiple partitions concurrently via I/O parallelism.
    Prior to DB2 LUW 9.1, the concept of partitioning was completely different between the platforms. z/OS partitioning is based on a partitioning key with a data range. LUW partitioning is at the database level: a partitioning key is specified, but not a range. The partitioning is done via a hashing algorithm which automatically distributes the data across the different nodes. This is done with the Database Partitioning Feature (DPF) available with DB2 ESE.
    In 9.1, you are now able to partition tables much like z/OS, with a partitioning key range, and get the same administrative and performance benefits of partition independence. Partitions are created with individual tablespaces which can then be independently reorged, backed up, etc.

    Z/OS:
      • Single table
      • One to 254 partitions
      • Partitioning key range controls what partition data resides in
      • Each partition can be on a separate device
    DPF:
      • Database partitioned
        o Multiple tables in database
        o Partitions usually on separate machines
        o Hash or range partitioning
      • Controlled via Database Partition Groups
      • DB Part = Node
        o Data
        o Indexes
        o Config files
        o Logs
      • Specify key but not data values
    Figure 16: Partitioning overview

    Database Partition Group
      • Formerly nodegroup
      • A set of one or more database partitions
      • A tablespace exists within a nodegroup
      • More than one table can be in a nodegroup
      • Rows are distributed across partitions of the nodegroup
      • Partitioning map controls data placement
      • Hash function places rows on a given partition
      • Data will be evenly distributed across nodes in the nodegroup
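The interaction of hash function and partitioning map can be sketched as follows. This is a conceptual model only: the CRC32 hash is a stand-in and not DB2's hashing algorithm, the map size of 4096 entries and the round-robin fill are assumptions made for the illustration, and both function names are hypothetical.

```python
import zlib
from collections import Counter

MAP_SIZE = 4096  # assumed number of entries in the partitioning map


def build_partition_map(partitions, map_size=MAP_SIZE):
    """Fill the map round-robin so every partition owns an equal share."""
    if not partitions:
        raise ValueError("at least one database partition is required")
    return [partitions[i % len(partitions)] for i in range(map_size)]


def partition_for_key(key, partition_map):
    """Hash the partitioning key to a map entry, then read the partition."""
    # zlib.crc32 stands in for the real hash function.
    bucket = zlib.crc32(str(key).encode("utf-8")) % len(partition_map)
    return partition_map[bucket]


# Four database partitions (nodes 0-3) in one partition group.
pmap = build_partition_map([0, 1, 2, 3])
placement = Counter(partition_for_key(order_id, pmap) for order_id in range(10_000))
print(sorted(placement.items()))
```

Only the key is specified, never a data range: the same key value always lands on the same partition, and changing the map (for example after adding a node) changes placement without changing the hash.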
  • 20. ADMINISTRATION
    The basic concepts of database administration are the same for any relational database management system (RDBMS). This section discusses the similarities and differences in maintaining DB2 across multiple platforms.

    Optimizer
    The Optimizer is what DB2 uses to determine the "roadmap" for retrieving the result set of a particular SQL statement.

    z/OS (OS/390):
      • Fixed optimization
      • HINTS allow for some flexibility
        o Mainly used to maintain old access path
        o Must be turned on at install time
        o Need to modify PLAN_TABLE
        o Manual process
    LUW (UNIX/NT):
      • Much more flexible than z/OS
        o Seven levels of optimization
        o Adjusted based on query complexity
    Figure 17: Optimizer overview

    DB2 Hints
    Hints were introduced in z/OS with DB2 V6. Hints allow you to manually update the access plan information in the plan table to force DB2 to use a specific access path. This process is mainly used to maintain a specific access path when upgrading to a newer version of DB2.
  • 21. Optimization Class – Guidelines
    The LUW Optimizer gives the DBA much more flexibility in deciding how much resource should be spent on optimizing a query. The more complex the query, the higher the optimization level that should be used. Remember that the higher the optimization level, the more resources are consumed for optimization. This can be a significant factor when dealing with dynamic SQL.

    LEVEL  RECOMMENDATION
    0      Minimal amount of optimization. Only recommended for very simple SQL accessing well-indexed tables. Only nested loop joins and IX scans enabled.
    1      Similar to 0 except Merge Scan and TS scan enabled.
    2      Recommended for very complex queries which are infrequently executed in a decision support or OLAP environment.
    3      Closest to z/OS optimizer. Recommended for queries with four or more joins.
    5      DEFAULT – Most cost-effective method for a mix of simple and complex queries. Optimization is automatically reduced for complex dynamic SQL if the optimizer determines that the resources are not necessary.
    7      Same as 5 except optimization is not reduced for complex dynamic SQL.
    9      Used to determine whether more comprehensive optimization can generate a better access plan for very complex, long-running queries using large tables.
    Figure 18: Optimizer "Rules of Thumb"
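For dynamic SQL on LUW, the optimization class is typically set per session with the `SET CURRENT QUERY OPTIMIZATION` statement. The sketch below only validates a level against the seven classes in Figure 18 and builds that statement as text; the helper name is hypothetical and nothing here connects to a database.

```python
# The seven optimization classes listed in Figure 18.
VALID_CLASSES = (0, 1, 2, 3, 5, 7, 9)


def set_optimization_statement(level=5):
    """Build the SQL text that selects an optimization class for dynamic SQL."""
    if level not in VALID_CLASSES:
        raise ValueError(f"optimization class must be one of {VALID_CLASSES}")
    return f"SET CURRENT QUERY OPTIMIZATION = {level}"


# A reporting session might raise the class before a complex query and
# drop back to the default afterwards.
print(set_optimization_statement(7))
print(set_optimization_statement())
```

Rejecting values such as 4 or 6 up front avoids sending a statement that the server would refuse anyway.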
  • 22. Explain Processing
    z/OS
      • PLAN_TABLE
      • DSN_STATEMENT
      • DSN_FUNCTION
    Figure 19: z/OS Explain tables
      • The EXPLAIN statement was extended to insert information into two new tables for V6
      • DSN_FUNCTION table is useful for finding out information about function resolution
      • DSN_STATEMENT table is useful for finding out the estimated cost of SQL statements
    Unlike the plan table, neither the function table nor the statement table has to exist to use EXPLAIN.
    LUW
      • EXPLAIN_INSTANCE
      • EXPLAIN_STATEMENT
      • EXPLAIN_OPERATOR
      • EXPLAIN_PREDICATE
      • EXPLAIN_STREAM
      • EXPLAIN_ARGUMENT
      • EXPLAIN_OBJECT
    Figure 20: Distributed Explain tables
  • 23. • EXPLAIN_ARGUMENT: Represents the unique characteristics of each individual operator.
      • EXPLAIN_INSTANCE: Main control table for all Explain information. Each row of data in the Explain tables is explicitly linked to one row in this table. Basic information about the source of the SQL statements being explained, along with environment information, is kept in this table.
      • EXPLAIN_OBJECT: Contains the data objects required by the access plan to satisfy the SQL statement.
      • EXPLAIN_OPERATOR: Contains all the operators needed to satisfy the SQL statement.
      • EXPLAIN_PREDICATE: Identifies which predicates are applied by a specific operator.
      • EXPLAIN_STATEMENT: Contains the text of the SQL statement in two forms: the original version entered by the user and a rewritten version generated by the compilation process.
      • EXPLAIN_STREAM: Represents the input and output data streams between individual operators and data objects.

    Parallelism
    Parallelism is another concept that is radically different across platforms. Parallelism is restricted to EEE environments for distributed DB2.

    [Diagram, z/OS Sysplex: CPU1 and CPU2 connected through a Coupling Facility; members DB2A and DB2B each have their own Workfile DB, Log and BSDS, and share one DB2 Catalog on DASD.]
  • 24. ESE Data Partitioning Feature
    [Diagram: MPP configuration. The Fast Communication Manager connects CPU1 through CPU4, which own DB Part 0 through DB Part 3; each partition has its own log and data.]
    Figure 21: Parallelism examples
    There are two types of configurations when using DPF:
    • Massively parallel processor (MPP) (pictured in Figure 21)
      o Multiple machines with single processors grouped together in a cluster
      o “Shared nothing” configuration
    • Symmetric multi-processor (SMP)
      o Multiple processors on a single machine
    Types of Parallelism
    z/OS
    • I/O: DB2 concurrently pre-fetches data from multiple partitions.
    • CPU: DB2 starts multiple tasks in parallel to process a query.
    • SYSPLEX: Same as CPU, except that tasks are spread across machines in the sysplex. (figure 24)
    LUW
    • I/O
      o Multi-container TS
    • Query
      o Intra-partition (SMP): parallelism within a single partition
      o Inter-partition (EEE/MPP): parallelism across multiple partitions
  • 25. Performance Monitoring
    z/OS traces offer much greater detail but can also cause much more overhead. Trace records can be written to either SMF or GTF.
    OS/390
    Instrumentation Facility Component (IFC)
    • Statistics
      o Global statistical data
    • Accounting
      o Detail info for a specific application
    • Audit
      o Table access audits
      o Requires AUDIT keyword on table definition
    • Performance
      o Most detailed and most expensive
      o Only use for short periods
    • Monitor
      o Makes trace data available for monitoring applications
    UNIX/NT
    The amount of memory used for database monitoring is configurable in the DBM configuration file using the monheapsz parameter. The Control Center, the CLP, or a third-party monitor is used to view trace output.
    • Snapshot Monitor
      o Shows the status of the database at an instant in time
    • Event Monitor
      o Historical status over time
      o Databases, tablespaces, connections, tables, statements, transactions, deadlocks
    Figure 22: Monitoring facilities
    Utilities
    As you can see, the utilities available are very comparable across the platforms; however, the utilities for z/OS are still considerably more robust.
    Z/OS
    • COPY
    • DSNTIAUL/Fast Unload
    • LOAD
    • RECOVER
    • REORG (TS, IX)
    • RUNSTATS
      o “Real-Time Statistics” (V7)
    • QUIESCE
    • MERGECOPY
    • CHECK DATA
    LUW
    • BACKUP
    • EXPORT
    • LOAD/IMPORT
    • RESTORE
    • REORG (Table)
      o REORGCHK
    • RUNSTATS
    • QUIESCE
    • Set Integrity
    Figure 23: Utility comparison
  • 26. Backup and Recovery
    Backups
    These are the components that comprise a complete backup for the different environments.
    Z/OS
    • Tablespace
    • Index
    • Components
      o Full Copy
      o Incremental Copy
      o Copy to Copy (V7)
      o Active/Archive Logs
      o BSDS
      o SYSLGRNX
    LUW
    • Database
    • Tablespace
    • Components
      o Backup Image
      o Incremental Copy (7.2)
      o Backup History File
      o Active Logs*
      o Archive Logs*
    *Depends on how logging is defined for the DB
    Figure 24: Components of backups
    Recovery Info
    Z/OS: Bootstrap Dataset (BSDS)
    • Inventory of all active and archive log data sets
    • Range of log records in each log file
    • Restart information
      o Size/Thresholds of Bufferpools and Hiperpools
    LUW: Recovery History File
    • Updated:
      o Backup of full DB or TS
      o Restore of full DB or TS
      o Load of a table
      o Quiesce TS
    • Contains:
      o Part of DB which was copied
      o When DB was copied
      o Location of the copy
      o Time of last restore
    Figure 25: BSDS vs. Recovery History File
  • 27. Logging
    z/OS
    Logs are defined at the subsystem level and are global for all objects. All update activity is logged in the current active log. When the active log is full, it is automatically archived to tape and the next active log in sequence is used. Dual logging permits two of these processes to occur in parallel, which provides a redundant copy of the log files in case of media failure.
    LUW
    There is no concept of auto-archiving. When the primary log files are full, a secondary log file is allocated. This continues until no more secondary logs are available.
    Z/OS
    • Logs apply to entire subsystem
      o Active
      o Archive
    • Active logs are automatically archived when full
    • Dual-Logging
    • On Demand Archiving
      o Close and archive an active log at any time
    LUW
    • Logs are defined at the database level
    • Circular
      o No roll-forward recovery
    • Archival
      o Fully recoverable
      o Similar to OS/390
      o Three kinds of log files: Active, Online Archived, Offline Archived
    • Dual-Logging
    Figure 26: Logging overview
  • 28. Circular Logging in LUW
    Supports both crash- and version-type recoveries. Primary log files are allocated when the database is created.
    Secondary log files are allocated as needed:
    • Automatically de-allocated when no longer needed
    • Good for periodic large units of work
    Circular logging:
    • Non-recoverable databases
    • Log files are reused
    • Uses active logs only
    • Secondary used for overflow
    • Roll-forward recovery not possible
    • Default method for new DBs
    [Diagram: primary log files 1, 2, 3 ... "n" arranged in a ring, with secondary log files 1 ... "n"]
    Figure 27: Circular logging process
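The circular logging behavior described above can be sketched as a small simulation. This is a simplified model, not DB2's implementation: it only counts log files, and the class name `CircularLog`, its methods, and the `LogFullError` exception are hypothetical names chosen for illustration.

```python
class LogFullError(Exception):
    """Raised when neither a primary nor a secondary log file is available."""


class CircularLog:
    """Toy model of circular logging with primary and secondary log files."""

    def __init__(self, primary: int, secondary: int) -> None:
        self.primary = primary              # files allocated up front and reused
        self.secondary = secondary          # maximum overflow files allowed
        self.active = 0                     # files still needed for crash recovery
        self.secondary_allocated = 0

    def fill_file(self) -> str:
        """Fill one more log file and report which kind of file was used."""
        if self.active < self.primary:
            self.active += 1
            return "primary"
        if self.secondary_allocated < self.secondary:
            # All primary files are still in use: allocate an overflow file.
            self.secondary_allocated += 1
            self.active += 1
            return "secondary"
        raise LogFullError("no primary or secondary log file available")

    def release(self, count: int) -> None:
        """Mark the oldest files as no longer needed, so they can be reused."""
        self.active -= min(count, self.active)
        # Secondary files are de-allocated once they are no longer needed.
        self.secondary_allocated = max(0, self.active - self.primary)


if __name__ == "__main__":
    log = CircularLog(primary=3, secondary=2)
    print([log.fill_file() for _ in range(5)])
    log.release(3)
    print(log.fill_file(), log.secondary_allocated)
```

In this model a long-running unit of work keeps files active, which is what pushes the log into secondary files and, eventually, into a log-full condition.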
  • 29. Archival Logging
    Log files are not reused, which makes roll-forward recovery possible.
    [Diagram: log files 12 through 16. Offline archival: files moved from the active log subdirectory, usually to offline media. Online archival: contains information for committed and externalized transactions, stored in the active log subdirectory. Active: contains information for non-committed or non-externalized transactions.]
    Figure 28: Archival logging detail
    • Active (15, 16): Contains information related to units of work that have not yet been committed or rolled back. These files also contain information for transactions that have committed but whose changes have not been written to disk.
    • Online archive (14): Contains information related to completed transactions that no longer require crash recovery protection. These files are called online because they reside in the same subdirectory as the active logs.
    • Offline archive (12, 13): Log files that have been removed from the active log subdirectory. The files must be moved manually. There is no auto-archiving in UDB.
    Recovery
    There are three basic recovery options available for either z/OS or LUW:
    Z/OS
    • Crash
      o DB2 restart
    • Roll-Forward
      o IC plus log apply
      o LOGONLY
    • Point in Time
      o IC only (TOCOPY)
      o To RBA
    LUW
    • Crash
      o Uses logs to recover from power interrupts or application ABENDs
    • Version
      o Image copy (TOCOPY)
    • Roll-Forward
      o Image copy plus log apply
    Figure 29: Types of recovery
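The three log file states described under Archival Logging can be expressed as a simple classification rule. The sketch below is only a model of the definitions in the text; the `LogFile` fields and the `classify` function are hypothetical names, and the sample data mirrors the file numbers used in Figure 28.

```python
from dataclasses import dataclass


@dataclass
class LogFile:
    number: int
    needed_for_crash_recovery: bool  # holds uncommitted or not-yet-externalized work
    in_active_log_dir: bool          # still resides in the active log subdirectory


def classify(log: LogFile) -> str:
    """Classify a log file using the definitions from the text."""
    if log.needed_for_crash_recovery:
        return "active"
    if log.in_active_log_dir:
        return "online archive"
    return "offline archive"


# Sample data following the numbering in Figure 28.
FIGURE_28_FILES = [
    LogFile(12, needed_for_crash_recovery=False, in_active_log_dir=False),
    LogFile(13, needed_for_crash_recovery=False, in_active_log_dir=False),
    LogFile(14, needed_for_crash_recovery=False, in_active_log_dir=True),
    LogFile(15, needed_for_crash_recovery=True, in_active_log_dir=True),
    LogFile(16, needed_for_crash_recovery=True, in_active_log_dir=True),
]

states = {log.number: classify(log) for log in FIGURE_28_FILES}

if __name__ == "__main__":
    for number, state in states.items():
        print(number, state)
```

The order of the checks matters: a file that is still needed for crash recovery is active regardless of where it resides, and only files that are no longer needed are split into online and offline archives.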
  • 30. Runstats
    The concept of statistics collection for proper optimization is essentially the same across the platforms.
    z/OS
    The Real-Time Statistics facility in z/OS is used through a stored procedure (DSNACCOR) and gives near real-time feedback on space utilization. This feature requires the real-time statistics database (DSNRTSDB) to be set up.
    LUW
    In DB2 LUW Version 9.1, automatic statistics collection has been added. The DB2 server collects statistical information about your data in a background process when required. Only optimizer-related statistics are collected in order to minimize performance overhead.
    Reorganizing Data
    The biggest difference between reorganization in z/OS and LUW is that in z/OS you reorganize either a tablespace or an index, while in LUW you reorganize the table. DB2 for LUW V9.1 added the ability to automatically reorganize tables and indexes based on predefined thresholds.
    Z/OS
    • Tablespace
      o Log Yes/No
      o Unload Pause
      o Shrlevel
    • Index
    • Online
      o SHRLEVEL CHANGE
    LUW
    • Table
    • Index (v8)
    • REORGCHK
      o Determines when reorg is required
    • Online
    • Automated reorg (V9.1)
    Figure 30: Reorg parameters
  • 31. Unloading Data
    IBM introduced a new high-performance UNLOAD utility for z/OS in Version 7. This utility allows you to unload data from either a table or an image copy.
    Z/OS
    • DSNTIAUL
      o IBM sample program
    • REORG UNLOAD PAUSE
    • UNLOAD Utility
      o Table
      o Image Copy
    LUW
    • EXPORT
      o Accessed via Control Center or CLP
      o Rename columns
      o Multiple output formats
    Figure 31: Unload options
    Loading Data
    Z/OS
    • Load Utility
      o Resume/Replace
      o Log YES/NO
      o Runstats/Copy
      o Sophisticated SQL processing
    • ONLINE
      o SHRLEVEL CHANGE
    LUW
    • Load
      o Insert/Replace
      o RUNSTATS
      o Significantly faster than import
      o Good for large amounts of data
      o Online
    • Import
      o Can dynamically create table
      o Insert process: Update, Replace
      o Good for small amounts of data
    Figure 32: Data load options
  • 32. CONCLUSION
    DBAs are increasingly being asked to manage relational databases regardless of the vendor or the operating system on which the database resides. A solid foundation in relational database principles is absolutely necessary, but it is not enough in a heterogeneous database environment. It is also necessary to be able to work with the nuances and varying processes required by each database type. There is no substitute for experience and knowledge, but a tool that standardizes and simplifies these processes will maximize the efficiency of a DBA staff and greatly help to reduce problems that can result in application downtime.
  • 33. ABOUT THE AUTHOR
    Jim Wankowski is currently the DB2 Technology Specialist at Quest Software. Jim has more than 20 years of development and DBA experience with DB2. Jim participated in the original beta program for DB2 in 1984. Jim is well known in the DB2 community. He has written articles for DB2 Magazine, z/Journal, Database Trends & Applications, and regularly presents at IDUG conferences, regional DB2 user groups, and vendor seminars worldwide.
  • 34. ABOUT QUEST SOFTWARE, INC.
    Quest Software, Inc. delivers innovative products that help organizations get more performance and productivity from their applications, databases and Windows infrastructure. Through a deep expertise in IT operations and a continued focus on what works best, Quest helps more than 18,000 customers worldwide meet higher expectations for enterprise IT. Quest Software can be found in offices around the globe and at www.quest.com.
    Contacting Quest Software
    Phone: 949.754.8000 (United States and Canada)
    Email: info@quest.com
    Mail: Quest Software, Inc., World Headquarters, 5 Polaris Way, Aliso Viejo, CA 92656 USA
    Web site: www.quest.com
    Please refer to our Web site for regional and international office information.
    Contacting Quest Support
    Quest Support is available to customers who have a trial version of a Quest product or who have purchased a commercial version and have a valid maintenance contract. Quest Support provides around-the-clock coverage with SupportLink, our web self-service. Visit SupportLink at http://support.quest.com
    From SupportLink, you can do the following:
    • Quickly find thousands of solutions (Knowledgebase articles/documents).
    • Download patches and upgrades.
    • Seek help from a Support engineer.
    • Log and update your case, and check its status.
    View the Global Support Guide for a detailed explanation of support programs, online services, contact information, and policy and procedures. The guide is available at: http://support.quest.com/pdfs/Global Support Guide.pdf