IBM® DB2® Universal Database™ Enterprise - Extended Edition
for AIX® and HACMP/ES
(TR-74.174)

June 22, 2001

Gene Thomas        DB2 UDB System Verification Test
Andy Beaton        DB2 UDB System Verification Test
Enzo Cialini       DB2 UDB System Verification Test
Darrin Woodard     DB2 UDB System Verification Test
This document contains proprietary information of IBM. It is provided under a
license agreement and is protected by copyright law. The information contained
in this publication does not include any product warranties, and any
statements provided in this document should not be interpreted as such.

© Copyright International Business Machines Corporation 2001. All rights
reserved. Note to U.S. Government Users – Documentation related to restricted
rights – Use, duplication or disclosure is subject to restrictions set forth
in GSA ADP Schedule Contract with IBM Corp.
Contents

Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
Tables  . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
Abstract  . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
  ITIRC keywords  . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
  About the authors . . . . . . . . . . . . . . . . . . . . . . . . . . . . x

Chapter 1. Target configuration . . . . . . . . . . . . . . . . . . . . . . 1

Chapter 2. Disk and logical volume manager (LVM) setup  . . . . . . . . . . 5
2.1 Setting up the disk drives  . . . . . . . . . . . . . . . . . . . . . . 5

Chapter 3. NFS configuration  . . . . . . . . . . . . . . . . . . . . . . . 13
3.1 Set up TCP/IP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.2 NFS export the /homehalocal file system  . . . . . . . . . . . . . . . 14
3.3 Mount the /homehalocal file system  . . . . . . . . . . . . . . . . . . 14

Chapter 4. User setup and DB2 installation  . . . . . . . . . . . . . . . . 17

Chapter 5. HACMP setup  . . . . . . . . . . . . . . . . . . . . . . . . . . 21
5.1 Define the cluster ID and name  . . . . . . . . . . . . . . . . . . . . 22
5.2 Define the cluster nodes  . . . . . . . . . . . . . . . . . . . . . . . 23
5.3 Add the adapters  . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
5.4 Show cluster topology . . . . . . . . . . . . . . . . . . . . . . . . . 24
5.5 Synchronize cluster topology  . . . . . . . . . . . . . . . . . . . . . 25
5.6 Add a resource group  . . . . . . . . . . . . . . . . . . . . . . . . . 26
5.7 Add an application server . . . . . . . . . . . . . . . . . . . . . . . 26
5.8 Configure resources for the resource group  . . . . . . . . . . . . . . 28
5.9 Synchronize cluster resources . . . . . . . . . . . . . . . . . . . . . 28
5.10 Show resource information by resource group . . . . . . . . . . . . . 29
5.11 Verify cluster . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

Chapter 6. Troubleshooting  . . . . . . . . . . . . . . . . . . . . . . . . 33
6.1 SQL6048 on db2start command . . . . . . . . . . . . . . . . . . . . . . 33
6.2 SQL6031 returned when issuing db2 "? SQL6031" command . . . . . . . . . 34
6.3 Ethernet IP label instead of the switch IP label in db2nodes.cfg file . 36
6.4 SQL1032 when using Autoloader after a failback  . . . . . . . . . . . . 37
6.5 SQL6072 when using the switch HACMP service IP label  . . . . . . . . . 37
6.6 SQL6031 RC=12, not enough ports in /etc/services  . . . . . . . . . . . 38
6.7 SQL6030 RC=15, no port 0 defined in db2nodes.cfg file . . . . . . . . . 38
6.8 HACMP returns "config too long", stopping the catalog node  . . . . . . 40
6.9 db2_all with the ";" option loops . . . . . . . . . . . . . . . . . . . 41

Chapter 7. Testing  . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
7.1 Test environment and tools  . . . . . . . . . . . . . . . . . . . . . . 43
7.2 Points of failure for test consideration  . . . . . . . . . . . . . . . 45

Chapter 8. Additional information . . . . . . . . . . . . . . . . . . . . . 49

Appendix A. Trademarks and service marks  . . . . . . . . . . . . . . . . . 51
Figures

1. HACMP running on the initial target configuration . . . . . . . . . . . 2
2. HACMP after failure of node13 . . . . . . . . . . . . . . . . . . . . . 3
3. rc.db2pe modification  . . . . . . . . . . . . . . . . . . . . . . . . . 37
4. Sample script for testing  . . . . . . . . . . . . . . . . . . . . . . . 44
5. Sample clstat screen . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Tables

1. Volume groups and filesystems  . . . . . . . . . . . . . . . . . . . . . 1
2. Volume group and filesystem relationship for nodes . . . . . . . . . . . 9
3. Resource group to node relationship  . . . . . . . . . . . . . . . . . . 26
4. Application server scripts . . . . . . . . . . . . . . . . . . . . . . . 26
5. Resource group configuration information . . . . . . . . . . . . . . . . 28
Abstract

IBM® DB2® Universal Database™ (UDB) is the industry's first multimedia,
Web-ready relational database management system, powerful enough to meet the
demands of large corporations and flexible enough to serve medium-sized and
small e-businesses. DB2 Universal Database combines integrated power for
business intelligence, content management, and e-business with
industry-leading performance and reliability.

Coupling DB2 UDB with High Availability Cluster Multi-Processing (HACMP)
strengthens the solution. HACMP for AIX® provides a highly available
computing environment, which facilitates the automatic switching of users,
applications, and data from one system to another in the cluster after a
hardware or software failure.

A complete High Availability (HA) setup includes many parts, one of which is
the HACMP software. Other parts of an HA solution come from AIX and the
logical volume manager (LVM). As well as tangible items such as hardware and
software, a good HA solution includes planning, design, customizing, and
change control. An HA solution reduces the amount of time that an application
is unavailable by removing single points of failure.

This document takes you through a target configuration setup using DB2 UDB
Enterprise - Extended Edition (EEE) V7.2 and HACMP/ES 4.3.

ITIRC keywords
• HA
• DB2
• UDB
• AIX
• HACMP
• HACMP/ES
• Availability
About the authors

Andy Beaton has 14 years of database experience, and is a certified DB2 UDB
Database Administrator and Advanced Technical Expert in both DB2 for Clusters
and DB2 for DRDA. He works at the IBM SWS Toronto Laboratory in the DB2 UDB
System Verification Test department. Andy is responsible for testing DB2 UDB
in a variety of configurations, including AIX EE, AIX EEE and HACMP. Andy is
living proof that a degree in Astronomy is no impediment to having an
interesting job.

Enzo Cialini has been working with Database Technology at the IBM SWS Toronto
Laboratory for over nine years, is certified in DB2 UDB Database
Administration and DB2 UDB Database Application Development, is an Advanced
Technical Expert in DB2 for DRDA, and an Advanced Technical Expert in DB2 for
Clusters. He is currently responsible for managing the DB2 UDB System
Verification Test Department, with a focus on High Availability, and has been
involved with DB2 and HACMP for many years. His experience ranges from
implementing and supporting numerous installations to consulting.

Gene Thomas has been with IBM for over 28 years. For the last six years, he
has worked as an AIX Systems Programmer/Administrator supporting the DB2 UDB
Function in the IBM SWS Toronto Laboratory. He was instrumental in bringing
HACMP into the Laboratory. Gene has had formal training from the company that
created and supports HACMP, Clam and Associates (now known as Availant),
located in Cambridge, Massachusetts. His primary tasks in support of HACMP in
the Laboratory have been: setting up the hardware configuration for the HACMP
cluster; installation and setup of the AIX operating system; and installation
and setup of the HACMP cluster for DB2 UDB testing. He has recently joined
the UDB System Verification Test Department and is doing work with HACMP and
DB2 UDB.

Darrin Woodard has been with IBM for over 10 years, is certified in DB2 UDB
Database Administration and DB2 UDB Database Application Development, is an
Advanced Technical Expert in DB2 for Clusters, an IBM Certified Specialist in
AIX System Support, and an IBM Certified Specialist in AIX HACMP. For the
last five years, he has worked in the DB2 UDB System Verification Test
Department. Here Darrin was responsible for testing DB2 UDB EE and DB2 UDB
EEE in an HACMP environment. The tests have ranged from a 2-node cluster up
to a 340-node DB2 UDB EEE database running on 340 physical SP nodes with over
160 clusters. During the previous five years, Darrin was an AIX Systems
Administrator supporting the DB2 UDB Function in the IBM SWS Toronto
Laboratory, where he was responsible for the installation and setup of the
AIX operating system. This included the RS/6000 and the Scalable Power
parallel system (SP).
Chapter 1. Target configuration

The HACMP configuration described in this document involves a six-partition
DB2 database and two clusters, each with mutual takeover and cascading
resource groups. It uses HACMP/ES 4.3 and DB2 UDB EEE V7.2 running on
AIX 4.3.3.

The clusters being defined are named cl1314 and cl1516, with cluster IDs of
1314 and 1516 respectively. We arbitrarily selected these numbers because we
are using SP nodes 13, 14, 15, and 16.

The cl1314 cluster has two nodes (bf01n013 and bf01n014); two cluster node
names (clnode13 and clnode14); two resource groups (rg1314 and rg1413); and
two application servers (as1314 and as1413). The cl1516 cluster has two nodes
(bf01n015 and bf01n016); two cluster node names (clnode15 and clnode16); two
resource groups (rg1516 and rg1615); and two application servers (as1516 and
as1615).

Each of these nodes will have one SP switch and one ethernet adapter. The
nodes within a cluster will have a shared external disk. The cl1314 cluster
will have access to two volume groups, havg1314 and havg1413. The cl1516
cluster will have access to two volume groups, havg1516 and havg1615.

Table 1. Volume groups and filesystems

Cluster  Resource group  Volume group  File system
cl1314   rg1314          havg1314      /homehalocal
                                       /db1ha/svtha1/NODE0130
                                       /db1ha/svtha1/NODE0131
         rg1413          havg1413      /db1ha/svtha1/NODE0140
cl1516   rg1516          havg1516      /db1ha/svtha1/NODE0150
         rg1615          havg1615      /db1ha/svtha1/NODE0160
                                       /db1ha/svtha1/NODE0161

In the initial target configuration the db2nodes.cfg will have the following
entries:

130 b_sw_013 0 b_sw_013
131 b_sw_013 1 b_sw_013
140 b_sw_014 0 b_sw_014
150 b_sw_015 0 b_sw_015
160 b_sw_016 0 b_sw_016
161 b_sw_016 1 b_sw_016
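For reference, each line of db2nodes.cfg follows the standard DB2 EEE format
of partition number, hostname, logical port, and netname. A commented reading
of the first two entries (the annotations are ours for illustration; the
db2nodes.cfg file itself contains no comments):

   # <partition> <hostname> <logical port> <netname>
   130 b_sw_013 0 b_sw_013   <- partition 130 runs on host b_sw_013, port 0
   131 b_sw_013 1 b_sw_013   <- a second partition on the same host must use
                                the next logical port; the trailing netname
                                directs partition-to-partition (FCM) traffic
                                over the switch interface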
The initial target configuration is illustrated in Figure 1 on page 2.

Figure 1. HACMP running on the initial target configuration
(Clusters CL1314 and CL1516, each with two cluster nodes; volume groups
havg1314, havg1413, havg1516, and havg1615 on shared disk drives; the NFS
home server active on CLNODE13; hosts bf01n013-bf01n016 joined by the switch
and ethernet networks.)

If one of the two nodes within the cluster (for example, cl1314) has a
failure, the other node in the cluster will acquire the resources that are
defined in the resource group. The application server is then started on the
node that has taken over the resource group. In our case, the application
server that is started is DB2 UDB EEE V7.2 for the instance svtha1.

In our example of a failover, DB2 UDB EEE V7.2 is running on node clnode13;
it has an NFS-mounted home directory and a database located on the
/db1ha/svtha1/NODE0130 and /db1ha/svtha1/NODE0131 file systems. These file
systems are in a volume group called havg1314. The clnode14 node is currently
running DB2 for partition 140 and is ready to take over from the
clnode13 node, if necessary.

Suppose someone unplugs the clnode13 node. The clnode14 node detects this
event and begins taking over resources from the clnode13 node. These
resources include the havg1314 volume group, the file system, and the
hostname swserv13. Once the resources are available on the clnode14 node, the
application server start script runs. The instance ID can log on to the
clnode14 node (now with an additional hostname swserv13) and can connect to
the database. Remote clients can also connect to the database, because the
hostname swserv13 is now located on the clnode14 node. This example is
illustrated in Figure 2.

Figure 2. HACMP after failure of node13
(Same configuration as Figure 1, but with the havg1314 volume group and the
NFS home server now active on CLNODE14, alongside havg1413.)
Chapter 2. Disk and logical volume manager (LVM) setup

Because shared disks are such an integral part of HACMP setup, this chapter
details the steps needed to set up the disks, as well as the various parts of
the LVM. All commands described in this chapter must be invoked by the root
user.

VG                      Volume group
LV                      Logical volume
JFS/File system/FS      Journaled file system
hdisk#                  A name for a disk drive or a RAID
JFSLog                  A log that maintains a consistent JFS
clnode13, clnode14,
clnode15 and clnode16   Cluster node names for the four nodes

2.1 Setting up the disk drives

We are going to show how to set up the havg1516 volume group and its
components. These steps must be repeated for the other three volume groups.
Proceed through the following sections in order to set up the shared disk
drives and the logical volume manager.

2.1.1 Set up the disk drives

To achieve consistent hdisk numbering and a more readable configuration, it
is sometimes necessary to define an additional hdisk. If the number of hdisks
is not the same on both nodes, define one or more "dummy" disks on the node
that has fewer disks, until the number of hdisks is equal. Then attach and
configure the external shared disk on both nodes.

For example, on the bf01n015 node, there are only three internal disks
(hdisk0,1,2) currently defined, whereas on the bf01n016 node, there are four
(hdisk0,1,2,3) currently defined. To get the hdisk numbers to match for the
external shared disk to be attached, a "dummy" disk must be defined on the
bf01n015 node. The external shared disks will be labelled hdisk4, hdisk5, and
hdisk6.

To define a "dummy" disk on the bf01n015 node, run the following command on
the bf01n015 node:
# mkdev -c disk -t '400mb' -s 'scsi' -p 'scsi1' -w '0,6' -d

Note: To select a disk type, use the lsdev -Pc disk command to list the disk
types that are in the Predefined Devices object class. In this example, 400mb
was one of the listed types.
Once the "dummy" disk is defined and the external shared disks are
configured, you can list the disks using the lsdev command.

From the bf01n015 node:
# lsdev -Cc disk
hdisk0 Available 00-07-00-0,0 4.5 GB 16 Bit SCSI Disk Drive  | internal
hdisk1 Available 00-07-00-2,0 2.0 GB 16 Bit SCSI Disk Drive  | internal
hdisk2 Available 00-08-00-2,0 2.0 GB SCSI Disk Drive         | internal
hdisk3 Defined   00-01-00-0,6 400 MB SCSI Disk Drive         | dummy
hdisk4 Available 00-01-00-1,0 7135 Disk Array Device         | external
hdisk5 Available 00-01-00-1,1 7135 Disk Array Device         | external
hdisk6 Available 00-01-00-1,2 7135 Disk Array Device         | external

From the bf01n016 node:
# lsdev -Cc disk
hdisk0 Available 00-08-00-0,0 4.5 GB 16 Bit SCSI Disk Drive  | internal
hdisk1 Available 00-08-00-2,0 2.0 GB 16 Bit SCSI Disk Drive  | internal
hdisk2 Available 00-08-00-3,0 4.5 GB 16 Bit SCSI Disk Drive  | internal
hdisk3 Available 00-08-00-4,0 4.5 GB 16 Bit SCSI Disk Drive  | internal
hdisk4 Available 00-01-00-1,0 7135 Disk Array Device         | external
hdisk5 Available 00-01-00-1,1 7135 Disk Array Device         | external
hdisk6 Available 00-01-00-1,2 7135 Disk Array Device         | external

2.1.2 Create the volume group (VG)

Create the VG on the bf01n015 node. The VG must have a unique name and major
number for all nodes in the cluster.

Hint: Create all of the VGs and the corresponding LVs, JFSs, and JFSLogs on
one of the two nodes, and import them to the second node. This will produce
unique names for all.

Check the major numbers on the two nodes.

From the bf01n015 node:
# /usr/sbin/lvlstmajor
43...

From the bf01n016 node:
# /usr/sbin/lvlstmajor
45,...77,79...

After analyzing this output, you can safely pick a major number of 45 or
greater, but not 78, because those major numbers are free on both nodes. For
this setup, 67 will be used as the major number.
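With two nodes, comparing the lvlstmajor output by eye is easy; on larger
clusters, a small loop such as the following sketch (which assumes the root
rsh access configured later in /.rhosts) gathers every node's free major
numbers for side-by-side comparison:

   # Print the free LVM major numbers reported by each cluster node;
   # pick a number that is free everywhere before creating the shared VG.
   for node in bf01n015 bf01n016
   do
       echo "=== $node ==="
       rsh $node /usr/sbin/lvlstmajor
   done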
To add a volume group, issue the following command:
# smit vg
> Add a Volume Group

and fill in the fields. Following is an example:

                        Add a Volume Group

  VOLUME GROUP name                       [havg1516]
  Physical partition SIZE in megabytes     32
  PHYSICAL VOLUME names                   [hdisk4 hdisk5 hdisk6]
  Activate volume group AUTOMATICALLY      no
    at system restart?
  Volume group MAJOR NUMBER               [67]
  Create VG Concurrent Capable?            no
  Auto-varyon in Concurrent Mode?          no

Alternatively, issue the following command to add a volume group:
# mkvg -f -y'havg1516' -s'32' '-n' -V'67' hdisk4 hdisk5 hdisk6

Activate the newly created volume group by issuing the following command:
# varyonvg havg1516

2.1.3 Create a JFSLog

Create the JFSLog with a unique name on the new VG. When creating the first
file system on a new VG, AIX will automatically create a JFSLog, with the
name of loglv00, loglv01, and so on for each new JFSLog on the machine. By
default, AIX creates only one JFSLog per VG. Because a unique name is needed
for the JFSLog, it is best to define the JFSLog with the mklv command before
creating the first file system.

Running the following command on both nodes will list the LV names that are
already in use:
# lsvg -l $(lsvg)
rootvg:
LV NAME   TYPE    LPs  PPs  PVs  LV STATE      MOUNT POINT
hd5       boot    1    1    1    closed/syncd  N/A
hd6       paging  65   65   1    open/syncd    N/A
hd8       jfslog  1    1    1    open/syncd    N/A
hd4       jfs     2    2    1    open/syncd    /
hd2       jfs     196  196  1    open/syncd    /usr
hd9var    jfs     10   10   1    open/syncd    /var
hd3       jfs     10   10   1    open/syncd    /tmp
lv03      jfs     3    3    1    open/syncd    /ryan
homevg:
LV NAME   TYPE    LPs  PPs  PVs  LV STATE      MOUNT POINT
homelv    jfs     300  300  2    open/syncd    /home
paging00  paging  64   64   1    open/syncd    N/A
loglv00   jfslog  1    1    1    open/syncd    N/A
havg1516:
LV NAME   TYPE    LPs  PPs  PVs  LV STATE      MOUNT POINT

To create a JFSLog with the name jfslog15 in the VG havg1516, issue the
following command:
# mklv -t jfslog -y jfslog15 havg1516 1

To format the JFSLog, issue the following command, and select "y" when asked
whether to destroy the LV:
# logform /dev/jfslog15
logform: destroy /dev/jfslog15 (y)? y

To verify that the JFSLog has been created, issue the following command:
# lsvg -l havg1516
havg1516:
LV NAME   TYPE    LPs  PPs  PVs  LV STATE      MOUNT POINT
jfslog15  jfslog  1    1    1    closed/syncd  N/A

2.1.4 Create the LVs and the JFS

Create any LVs and JFSs that are needed, and ensure that they have unique
names and are not currently defined on any node. Set the file systems so that
they are not mounted on restart. To verify the current LV and JFS names, run
the following command on both nodes:

# lsvg -l $(lsvg)
rootvg:
LV NAME   TYPE    LPs  PPs  PVs  LV STATE      MOUNT POINT
hd5       boot    1    1    1    closed/syncd  N/A
hd6       paging  65   65   1    open/syncd    N/A
hd8       jfslog  1    1    1    open/syncd    N/A
hd4       jfs     2    2    1    open/syncd    /
hd2       jfs     196  196  1    open/syncd    /usr
hd9var    jfs     10   10   1    open/syncd    /var
hd3       jfs     10   10   1    open/syncd    /tmp
lv03      jfs     3    3    1    open/syncd    /ryan
homevg:
LV NAME   TYPE    LPs  PPs  PVs  LV STATE      MOUNT POINT
homelv    jfs     300  300  2    open/syncd    /home
paging00  paging  64   64   1    open/syncd    N/A
loglv00   jfslog  1    1    1    open/syncd    N/A
havg1516:
LV NAME   TYPE    LPs  PPs  PVs  LV STATE      MOUNT POINT
jfslog15  jfslog  1    1    1    closed/syncd  N/A

By analyzing the LV NAME column, we can safely select halv150 as the new LV
name, because it is not currently being used. Be sure to check the other node
in the cluster (that is, clnode16).

To create the LV for the /db1ha/svtha1/NODE0150 file system, issue the
following command:
# smit lv
> Add a Logical Volume
> then select:
  VOLUME GROUP name [havg1516]

and fill in the fields. Following is an example. Some of the default entries
from the window have been removed to show what parameters have been entered.

                        Add a Logical Volume

  Logical volume NAME                     [halv150]
  VOLUME GROUP name                        havg1516
  Number of LOGICAL PARTITIONS            [1]
  PHYSICAL VOLUME names                   [hdisk4 hdisk5 hdisk6]
  RANGE of physical volumes                maximum
Alternatively, issue the following command to create the LV for the
/db1ha/svtha1/NODE0150 file system:
# mklv -y'halv150' -e'x' havg1516 1 hdisk4 hdisk5 hdisk6

In this setup, we are not going to mirror the logical volumes, because we are
using RAID. If RAID is not being used, it is highly recommended that the LV
be mirrored, and that the mirror be on separate physical volumes on a
separate bus or path. This will remove the disk drive and the disk adapter as
single points of failure.

Once the LV has been created, we can create a file system associated with
this LV. In this example, we will create a Large File Enabled Journaled File
System. We could also create a Standard Journaled File System or a Compressed
File System. To add a journaled file system on a previously defined logical
volume, issue the following command:
# smit jfs
> Add a Journaled File System on a Previously Defined Logical Volume
> Add a Large File Enabled Journaled File System

and fill in the fields. Following is an example:

                Add a Large File Enabled Journaled File System

  LOGICAL VOLUME name                      halv150
  MOUNT POINT                             [/db1ha/svtha1/NODE0150]
  Mount AUTOMATICALLY at system restart?   no
  PERMISSIONS                              read/write
  Mount OPTIONS                           []
  Start Disk Accounting?                   no
  Number of bytes per inode                4096
  Allocation Group Size                    64

After creating any file system, be sure to increase its size to a level that
is appropriate for the application. To increase the size of a file system,
use the smit chfs command.
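The smit chfs panel drives the chfs command underneath; as an illustrative
one-line equivalent (the size delta is in 512-byte blocks, so 2097152 grows
the file system by roughly 1 GB — choose a value appropriate for your own
application):

# chfs -a size=+2097152 /db1ha/svtha1/NODE0150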
In this example, we need to repeat these steps to create the other volume
groups, JFSLogs, logical volumes, and file systems on the respective nodes,
until we have the setup shown in Table 2.

Table 2. Volume group and filesystem relationship for nodes

Volume group  Logical volume and filesystem     Cluster  Primary   Sharing VG
              (mount point)                              node      with
havg1314      hahomelv  /homehalocal            cl1314   bf01n013  bf01n014
              halv130   /db1ha/svtha1/NODE0130
              halv131   /db1ha/svtha1/NODE0131
havg1413      halv140   /db1ha/svtha1/NODE0140           bf01n014  bf01n013
havg1516      halv150   /db1ha/svtha1/NODE0150  cl1516   bf01n015  bf01n016
havg1615      halv160   /db1ha/svtha1/NODE0160           bf01n016  bf01n015
              halv161   /db1ha/svtha1/NODE0161

After a file system is created, it is not automatically mounted. To mount the
file system, enter the following mount command:
# mount /db1ha/svtha1/NODE0150

For the DB2 instance to be able to write to the file systems, we have to
change the ownership. This must be done after the user IDs and groups are
defined in Chapter 4, "User setup and DB2 installation". For example, use the
following command:
# chown svtha1.dbadmin1 /db1ha/svtha1/NODE0150

2.1.5 Unmount all of the file systems and deactivate the VG

To do this, invoke the following commands:
# unmount /db1ha/svtha1/NODE0150
# varyoffvg havg1516

The volume group is deactivated on the bf01n015 node before it is activated
on the bf01n016 node.

2.1.6 Import the VG to the secondary node

Import the VG on the bf01n016 node with the same major number, and change the
VG so that it is not activated on restart. When the VG is imported on the
bf01n016 node, the file systems and logical volumes will be defined on the
bf01n016 node. Because the major number for the VG is the same on the
bf01n016 node, if we ever need to NFS export the file system, the failover
will work. Because the VG is defined to not be activated automatically on
reboot, it can be activated when HACMP starts.

To import a volume group, issue the following command:
# smit vg
> Import a Volume Group

and fill in the fields. Following is an example:

                        Import a Volume Group

  VOLUME GROUP name                       [havg1516]
  PHYSICAL VOLUME name                    [hdisk4]
  Volume Group MAJOR NUMBER               [67]
  Make this VG Concurrent Capable?         no
  Make default varyon of VG Concurrent?    no
Note: You need only include one physical volume of a volume group.

Alternatively, issue the following command to import a volume group:
# importvg -y'havg1516' -V'67' hdisk4

then change the VG so that it is not activated on reboot:
# smit vg
> Set Characteristics of a Volume Group
> Change a Volume Group
> Then select:
  VOLUME GROUP name [havg1516]

and fill in the fields. Following is an example:

                        Change a Volume Group

  VOLUME GROUP name                         havg1516
  Activate volume group AUTOMATICALLY       no
    at system restart?
  A QUORUM of disks required to keep the    yes
    volume group on-line?
  Convert this VG to Concurrent Capable?    no
  Autovaryon VG in Concurrent Mode?         no

Alternatively, issue the following command to change a volume group:
# chvg -a'n' -Q'y' -x'n' havg1516

2.1.7 Move the active VG back to the primary node

The VG is currently active on the bf01n016 node. To move the active VG to the
bf01n015 node, run the following on the bf01n016 node:
# varyoffvg havg1516

and then run the following on the bf01n015 node:
# varyonvg havg1516
# mount /db1ha/svtha1/NODE0150

Now repeat these steps for the other three volume groups until you have the
setup listed in Table 2 on page 9.
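Before repeating the procedure for the remaining volume groups, it is worth
confirming that both nodes now know about the shared volume group. A quick
check, in one possible form (again relying on root rsh access between the
nodes):

   # Each node should report hdisk4, hdisk5 and hdisk6 as members of havg1516
   for node in bf01n015 bf01n016
   do
       echo "=== $node ==="
       rsh $node "lspv | grep -w havg1516"
   done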
Chapter 3. NFS configuration

One of the things that differentiates the EEE setup from an EE setup is that
the home directory for the instance is NFS mounted to all of the nodes in the
EEE setup, while on the EE setup it is just a local file system. We need to
configure NFS to have the home directory available to all nodes.

3.1 Set up TCP/IP

Before the HACMP cluster is defined, the network adapters must be defined and
AIX operating system files must be updated.

Update or create the /etc/netsvc.conf file to include the following:
hosts=local,bind

The local entry refers to using the local /etc/hosts, and the bind entry
refers to using the name server. This will force TCP/IP name resolution to
check the local /etc/hosts file before going to the name server.

Update /etc/hosts with the hostnames and IP addresses for all service, boot,
and standby adapters. In our example, the following entries were added to the
/etc/hosts file:

9.21.72.13   bf01n013    bf01n013.torolab.ibm.com    # Ethernet
9.21.72.14   bf01n014    bf01n014.torolab.ibm.com    # Ethernet
9.21.72.15   bf01n015    bf01n015.torolab.ibm.com    # Ethernet
9.21.72.16   bf01n016    bf01n016.torolab.ibm.com    # Ethernet
9.21.77.13   b_sw_013    b_sw_013.torolab.ibm.com    # Base switch name
9.21.77.14   b_sw_014    b_sw_014.torolab.ibm.com    # Base switch name
9.21.77.15   b_sw_015    b_sw_015.torolab.ibm.com    # Base switch name
9.21.77.16   b_sw_016    b_sw_016.torolab.ibm.com    # Base switch name
9.21.77.213  sw_boot_13  sw_boot_13.torolab.ibm.com  # switch boot
9.21.77.214  sw_boot_14  sw_boot_14.torolab.ibm.com  # switch boot
9.21.77.215  sw_boot_15  sw_boot_15.torolab.ibm.com  # switch boot
9.21.77.216  sw_boot_16  sw_boot_16.torolab.ibm.com  # switch boot
9.21.77.223  swserv13    swserv13.torolab.ibm.com    # switch service
9.21.77.224  swserv14    swserv14.torolab.ibm.com    # switch service
9.21.77.225  swserv15    swserv15.torolab.ibm.com    # switch service
9.21.77.226  swserv16    swserv16.torolab.ibm.com    # switch service

Update /.rhosts to include the root user for the hostnames in the cluster:
# cat /.rhosts
bf01n013 root
bf01n014 root
bf01n015 root
bf01n016 root
b_sw_013 root
b_sw_014 root
b_sw_015 root
b_sw_016 root
sw_boot_13 root
sw_boot_14 root
sw_boot_15 root
sw_boot_16 root
swserv13 root
swserv14 root
swserv15 root
swserv16 root

Note: Permissions on ~/.rhosts must be no more liberal than -rw-r--r--. See
"SQL6048 on db2start command" on page 33 for additional information.

3.2 NFS export the /homehalocal file system

In the previous chapter we created a /homehalocal file system; before we can
make it available to the other nodes, we need to mount it locally. To mount
it locally, use the mount command:
# mount /homehalocal

To make the file system available for the other nodes to NFS mount, we are
required to export the file system. To export the file system, use:
# smit nfs
> Network File System (NFS)
> Add a Directory to Exports List

                     Add a Directory to Exports List

  PATHNAME of Directory to Export          [/homehalocal]
  MODE to export directory                  read-write
  HOSTS & NETGROUPS allowed client access  [b_sw_013,swserv13,b_sw_014,swserv14,
                                            b_sw_015,swserv15,b_sw_016,swserv16]
  Anonymous UID                            [-2]
  HOSTS allowed root access                [b_sw_013,swserv13,b_sw_014,swserv14,
                                            b_sw_015,swserv15,b_sw_016,swserv16]
  HOSTNAME list. If exported read-mostly   []
  Use SECURE OPTION?                        no
  Public filesystem?                        no
  CHANGE export system restart or both      both
  PATHNAME of alternate Exports file       []

Alternatively, issue the exportfs command with the required parameters.
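The exact parameters are not shown above; one plausible equivalent uses the
standard AIX /etc/exports syntax. Add a stanza like the following to
/etc/exports (shortened here to two host labels — extend it with all eight
labels from the smit screen):

/homehalocal -rw,root=b_sw_013:swserv13,access=b_sw_013:swserv13

then export everything listed in /etc/exports:
# exportfs -a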
3.3 Mount the /homehalocal file system

Before we mount the /homehalocal file system as an NFS mount on the other
nodes, we need to set up aliases for the switch adapter. Since the switch
adapter is limited to one per machine, we need to use these aliases. For the
bf01n013 node, run the following, and repeat this for the other nodes:
# ifconfig css0 inet 9.21.77.213 netmask 255.255.255.0 alias up
# ifconfig css0 inet 9.21.77.223 netmask 255.255.255.0 alias up

where the IP addresses are those for the switch boot and switch service of
the respective node.

After we set up these aliases, they will be available and can be viewed using
the netstat -i command:
# netstat -i
Name  Mtu    Network  Address          Ipkts   Ierrs  Opkts   Oerrs  Coll
lo0   16896  link#1                    64126   0      64170   0      0
lo0   16896  127      loopback         64126   0      64170   0      0
lo0   16896  ::1                       64126   0      64170   0      0
en0   1500   link#2   0.4.ac.49.3a.b7  106008  0      76463   0      0
en0   1500   9.21.72  bf01n013         106008  0      76463   0      0
css0  65520  link#3                    84411   0      108278  0      0
css0  65520  9.21.77  b_sw_013         84411   0      108278  0      0
css0  65520  9.21.77  swserv13         84411   0      108278  0      0
css0  65520  9.21.77  sw_boot_13       84411   0      108278  0      0

We now need to log in to each of the nodes and NFS mount the /homehalocal
file system from the swserv13 host address to a local mount point. This mount
point must match the entry in /etc/passwd for the instance's home directory.
In our case, we mount this at /homeha/svtha1. We even do this on the bf01n013
node. To set up the mounts, use:
# smit nfs
> Network File System (NFS)
> Add a File System for Mounting

                     Add a File System for Mounting

Type or select values in entry fields.
Press Enter AFTER making all desired changes.

  PATHNAME of mount point                          [/homeha/svtha1]
  PATHNAME of Remote Directory                     [/homehalocal]
  HOST where remote directory resides              [swserv13]
  Mount type NAME                                  []
  Use SECURE mount option?                          no
  Remount file system now, update
    /etc/filesystems or both?                       both
  /etc/filesystems entry will mount the directory
    on system RESTART.                              no
  MODE for this NFS file system                     read-write
  ATTEMPT mount in background or foreground?        background
  NUMBER of times to attempt mount                 []
  Buffer SIZE for read                             []
  Buffer SIZE for writes                           []
  ....... (more info but not changed) .......
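As with the other smit panels, the same NFS mount can be made with a direct
command; an illustrative equivalent for one node (the smit route additionally
records the entry in /etc/filesystems):

# mkdir -p /homeha/svtha1
# mount swserv13:/homehalocal /homeha/svtha1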
At this point, all nodes should have their local JFS file systems mounted.
These are the /homehalocal file system on the bf01n013 node mounted as
/homeha/svtha1 on all nodes, and the /db1ha/svtha1/NODE0*** file systems on
each node. We are now ready to install DB2 UDB and create the instance.
Chapter 4. User setup and DB2 installation

Now that the components of the LVM are set up, DB2 can be installed. The
db2setup utility can be used to install and configure DB2. To better
illustrate the configuration, we will define some of the components manually,
and use the db2setup utility only to install the DB2 product and license. All
commands described in this chapter must be invoked by the root user.

Although the steps used to install DB2 are outlined below, for complete
details, refer to the IBM DB2 Universal Database Enterprise - Extended
Edition for UNIX Quick Beginnings book, and to the IBM DB2 Universal Database
and DB2 Connect Installation and Configuration Supplement book.

Before running db2icrt, make sure that the $HOME directory for the instance
is available and that the svtha1 ID can write to the directory. Also make
sure that a .profile file exists, as db2icrt will append to the file but will
not create a new one. For this example, we are using the svtha1 ID that
already exists on the SP complex.

1. Mount the CD-ROM.

   Use the crfs and mount commands to create and mount the CD-ROM file
   system:
   # crfs -v cdrfs -p ro -d'cd0' -m'/cdrom'

   An alternative is to use the smit fast path smit crfs. Then mount the
   CD-ROM using the mount command:
   # mount /cdrom

   If the SP node does not have a local CD-ROM, there are two options. The
   install image can be copied to disk for future installations, or the
   CD-ROM can be NFS exported from the control workstation and mounted on the
   nodes.

2. Install DB2 and set up the license key.

   Once the CD-ROM is mounted, change to the corresponding directory and run
   ./db2setup on all four nodes. Follow the prompts to install DB2 UDB
   Enterprise - Extended Edition in the /usr/lpp/db2_07_01 directory. Do not
   use the db2setup utility to create any user IDs or instances.
   # cd /cdrom
   # ./db2setup

   Select DB2 UDB Enterprise - Extended Edition and install.

3. Create the DB2 instance.

   Run db2icrt to create the instance. This command only needs to be run on
   one of the four nodes, because the $HOME for the instance is NFS mounted
   from one machine to the others.
   # cd /usr/lpp/db2_07_01/instance
   # ./db2icrt -u svtha1 svtha1

   Note: If the home file system, /homehalocal, is not NFS exported with root
   access, then an error will occur.
4. Test db2start and file system setup.

   Since db2icrt only adds one line to the $HOME/sqllib/db2nodes.cfg file, we
   are required to update the file and add the other nodes, such that the
   db2nodes.cfg file looks like the following:

   130 b_sw_013 0 b_sw_013
   131 b_sw_013 1 b_sw_013
   140 b_sw_014 0 b_sw_014
   150 b_sw_015 0 b_sw_015
   160 b_sw_016 0 b_sw_016
   161 b_sw_016 1 b_sw_016

   We must also create the $HOME/.rhosts file, as db2start and other DB2
   programs require it to run remote shells from one node to another. The
   .rhosts file would look like the following in our example:

   swserv13 svtha1
   swserv14 svtha1
   swserv15 svtha1
   swserv16 svtha1
   b_sw_013 svtha1
   b_sw_014 svtha1
   b_sw_015 svtha1
   b_sw_016 svtha1
   bf01n013 svtha1
   bf01n014 svtha1
   bf01n015 svtha1
   bf01n016 svtha1

   Note: Ensure the permissions on the $HOME/.rhosts file are correct. See
   "SQL6048 on db2start command" on page 33 for additional information.

   This is a good place to see if db2start will work. Log on as the svtha1
   instance owner and run the db2start command. To test the file system setup
   on each node, try creating a database. Be sure to create it on /db1ha and
   not in $HOME, which is the default. Use the following command to create
   the database:
   $ db2 create database testing on /db1ha

   Ensure all errors are corrected, and stop DB2 using the db2stop command,
   before proceeding to the next step.

   Note: If you get SQL code SQL6031, see "SQL6031 returned when issuing db2
   "? SQL6031" command" on page 34 for additional information.
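   A convenient sanity check at this point is DB2's db2_all utility, which
   runs a command on every partition listed in db2nodes.cfg and therefore
   exercises the same remote shell paths that db2start needs. Run it as the
   instance owner; every node should answer:
   $ db2_all date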
5. Install the DB2 HACMP scripts.

   Important: Review "HACMP ES Script Files" in Chapter 12 of the IBM DB2
   Universal Database Administration Guide: Planning before attempting this
   section.

   DB2 UDB EEE supplies sample scripts for failover and user-defined events.
   These files are located in the /usr/lpp/db2_07_01/samples/hacmp/es
   directory.

   In our example, we copied this directory to a special directory on the
   control workstation of the SP complex. Our example used /spdata/sys1/hacmp
   on the control workstation.

   The db2_inst_ha script is the tool used for installing scripts and events
   on multiple nodes in an HACMP EEE environment. It was used in the
   following manner for our examples:
   # cd /spdata/sys1/hacmp
   # db2_inst_ha svtha1 . 15-16 TESTDB

   This will install the scripts in the directory /usr/bin on all the nodes
   listed (in this case, nodes 15 and 16) and prepare them to work with the
   database TESTDB. Note that the database name needs to be in uppercase. The
   node selection can also be written in the form "15,16", if you want to
   copy the files to specific nodes.

   When the application server is set up and the start and stop scripts are
   defined, they will call /usr/bin/rc.db2pe with a number of parameters.

   Note: The start and stop scripts that are called from the application
   server must exist on both nodes and have the same name. They do not need
   to have the same content if, for example, some customizing is needed.

   The db2_inst_ha script also copies over the HACMP/ES event stanzas. These
   events are defined in the db2_event_stanzas file. One example is the
   DB2_PROC_DOWN event, which will restart DB2 if it terminates for some
   reason.

   Note: DB2 will also restart if terminated by the db2stop or db2stop force
   commands. To stop DB2 without triggering a failure event, use the
   ha_db2stop command.

   For more information about HACMP/ES events, refer to "HACMP ES Event
   Monitoring and User-defined Events" in Chapter 33, High Availability
   Cluster Multi-Processing, Enhanced Scalability (HACMP ES) for AIX, of the
   DB2 UDB Administration Guide.

6. Test a failover of the resources on bf01n015 to bf01n016.

   On the bf01n015 node:
   # unmount /db1ha/svtha1/NODE0150
   # varyoffvg havg1516

   On the bf01n016 node:
   # varyonvg havg1516
   # mount /db1ha/svtha1/NODE0150

   Note: These are the actual steps that the HACMP software takes during
   failover of the necessary file systems.
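   To confirm that the manual failover took effect without logging in to the
   second node, one quick check from the bf01n015 node (or the control
   workstation) is:

   # After the varyon on bf01n016, the file system should be mounted there
   rsh bf01n016 "df -k /db1ha/svtha1/NODE0150"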
Once DB2 HACMP is configured and set up, any changes made (for example, to
the ID, groups, AIX system parameters, or the level of DB2 code) must be done
on all nodes. Following are some examples:

- The HACMP cluster is active on the bf01n015 node, and the password is
  changed on that node. When failover happens to the bf01n016 node, and the
  user tries to log on, the new password will not work. Therefore, the
  administrator must ensure that passwords are kept synchronized.

- If the ulimit parameter on the bf01n015 node is changed, it must also be
  changed on the bf01n016 node. For example, suppose the file size is set to
  unlimited on the bf01n015 node. When a failover happens to the bf01n016
  node, and the user tries to access a file that is greater than the default
  size of 1 GB, an error is returned.

- If the AIX parameter maxuproc is changed on the bf01n015 node, it also
  must be changed on the bf01n016 node (a sketch for checking this follows
  the list). When a failover occurs, and DB2 begins running on the bf01n016
  node, it may reach the maxuproc value and return errors.

- If non-DB2 software is installed on the bf01n015 node but not on the
  bf01n016 node, the software will not be available when a failover takes
  place.

- Suppose that the database manager configuration parameter svcename is
  used, and that /etc/services is updated on the bf01n015 node. If the
  bf01n016 node does not receive the same update and a failover occurs, the
  DB2 server will report warnings during db2start, the TCP/IP communications
  listeners will not be started, and DB2 clients will report errors.
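As a sketch of the kind of cross-node audit these examples imply, the
maxuproc setting can be compared with lsattr (root rsh access as configured
earlier is assumed):

   # Compare maxuproc on both cluster nodes; a mismatch may only surface
   # after a failover, so check it before bringing HACMP up.
   for node in bf01n015 bf01n016
   do
       echo "=== $node ==="
       rsh $node lsattr -El sys0 -a maxuproc
   done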
Chapter 5. HACMP setup

This chapter assumes that HACMP/ES 4.3 has been installed on the two cluster
nodes but has not yet been configured. You should be familiar with the
following terms:

HACMP cluster
    A group of 2 to 32 IBM RS/6000 servers, configured to provide highly
    available services. If any resource fails, its function is taken over by
    another part of the cluster.

Client
    Any system that utilizes the services provided by the cluster. Clients
    can be connected to the HACMP cluster by TCP/IP networks. The only
    requirement is that clients be able to access all of the nodes, so that
    in the event of a failure, the clients can access the nodes that have
    been taken over.

Cluster node
    Any IBM RS/6000 system that has been configured to function as a highly
    available server. In the event that a cluster node fails, its resources
    will be taken over by another cluster node. The nodes that participate
    in the takeover may mount the failed system's file systems, start up its
    applications, and even provide its IP and MAC address so that clients
    can reconnect to applications without reconfiguration.

Resource
    An object that is protected by HACMP, and may include IP addresses, file
    systems, raw devices, or volume groups. A resource group is a set of
    resources that are grouped together to support a particular application.

Application server
    A name given to the stop and start scripts for the application. In this
    paper, the application server is the start and stop scripts for DB2.

Note: If HACMP is running and a user telnets to a node in the cluster, the
connection may be to either one of the two machines that have been set up.
There are two ways in which a cluster administrator can tell which machine
is actually being used:

• Use uname -a and record the unique serial number for each physical machine.
• Use netstat -i to see which hostnames are defined on the cluster node.

When using the netstat -i command to check the addresses in use with the
system running in the default configuration, results similar to the
following are returned:

bf01n015:/homeha/svtha1 > netstat -i
Name  Mtu    Network  Address          Ipkts   Ierrs  Opkts   Oerrs  Coll
lo0   16896  link#1                    8673    0      8932    0      0
lo0   16896  127      loopback         8673    0      8932    0      0
lo0   16896  ::1                       8673    0      8932    0      0
en0   1500   link#2   0.4.ac.49.35.b   613395  0      550200  0      0
en0   1500   9.21.72  bf01n015         613395  0      550200  0      0
css0  65520  link#3                    546472  0      545291  0      0
css0  65520  9.21.77  b_sw_015         546472  0      545291  0      0
css0  65520  9.21.77  swserv15         546472  0      545291  0      0

After failover, telnet to the service address, swserv15, and run netstat -i.
Your results will be similar to the following:

bf01n016:/homeha/svtha1 > netstat -i
Name  Mtu    Network  Address          Ipkts   Ierrs  Opkts   Oerrs  Coll
lo0   16896  link#1                    13625   0      13861   0      0
lo0   16896  127      loopback         13625   0      13861   0      0
lo0   16896  ::1                       13625   0      13861   0      0
en0   1500   link#2   0.4.ac.49.38.c3  607900  0      546129  0      0
en0   1500   9.21.72  bf01n016         607900  0      546129  0      0
css0  65520  link#3                    533248  0      706144  0      0
css0  65520  9.21.77  b_sw_016         533248  0      706144  0      0
css0  65520  9.21.77  swserv16         533248  0      706144  0      0
css0  65520  9.21.77  swserv15         533248  0      706144  0      0

Note that the ethernet hostname and boot address have changed, because we are
actually on a different host, bf01n016. The service address remains the same,
but has been taken over by bf01n016. Using the netstat -i command is a good
way to check which machine the service address is currently assigned to.

Two types of resource groups are used with HACMP: cascading and rotating
resource groups. For more information on these resource groups, refer to the
HACMP Concepts and Facilities guide. A cascading resource group is being used
in this setup.

Note: It is recommended to install the AIX fileset bos.compat.links before
running HACMP/ES with DB2 UDB, because the product uses symbolic links
defined when this fileset is installed.

To set up HACMP, proceed through the following sections; we only need to
define the HACMP cluster on one node in a cluster and then synchronize it to
the other node.

Important: The following sections 5.1 to 5.11 are to be executed on nodes
bf01n013 and bf01n015 only. Our examples only show information for bf01n015.

5.1 Define the cluster ID and name

Enter the cluster ID and cluster name to define a cluster:
# smit hacmp
> Cluster Configuration
> Cluster Topology
> Configure Cluster
> Add a Cluster Definition

                        Add a Cluster Definition

* Cluster ID     [1516]
* Cluster Name   [cl1516]
Alternatively, enter the following command:
/usr/sbin/cluster/utilities/claddclstr -i 1516 -n cl1516

5.2 Define the cluster nodes

Enter the node names of the nodes forming the cluster:
# smit hacmp
> Cluster Configuration
> Cluster Topology
> Configure Nodes
> Add Cluster Nodes

                        Add Cluster Nodes

* Node Names   [bf01n015 bf01n016]

Alternatively, enter the following command:
/usr/sbin/cluster/utilities/clnodename -a bf01n015 bf01n016

5.3 Add the adapters

Enter the adapter attributes:
# smit hacmp
> Cluster Configuration
> Cluster Topology
> Configure Adapters
> Add an Adapter

* Adapter IP Label            bf01n015
  New Adapter IP Label        []
  Network Type                [ether]
  Network Name                [e1]
  Network Attribute           public
  Adapter Function            service
  Adapter Identifier          [9.21.72.15]
  Adapter Hardware Address    []
  Node Name                   [bf01n015]

* Adapter IP Label            sw_boot_15
  New Adapter IP Label        []
  Network Type                [hps]
  Network Name                [H1]
  Network Attribute           private
  Adapter Function            boot
  Adapter Identifier          [9.21.77.215]
  Adapter Hardware Address    []
  Node Name                   [bf01n015]

* Adapter IP Label            swserv15
  New Adapter IP Label        []
  Network Type                [hps]
  Network Name                [H1]
  Network Attribute           private
  Adapter Function            service
  Adapter Identifier          [9.21.77.225]
  Adapter Hardware Address    []
  Node Name                   [bf01n015]
Repeat this for the three adapters for the bf01n016 node. This must be done
from the bf01n015 node and will later be synchronized to the bf01n016 node.

When cataloging this DB2 node on a remote client, use the service address
hostname, that is, swserv15. This will be the address that moves to the node
that has DB2 running; an illustrative catalog sequence follows.
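An example of that cataloging step from a remote DB2 client (the node alias
svt15 and the port 50000 are placeholders; use the svcename or port actually
configured for the instance):

db2 catalog tcpip node svt15 remote swserv15 server 50000
db2 catalog database testing at node svt15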
5.4 Show cluster topology

Show cluster, node, network, and adapter topology:
# smit hacmp
> Cluster Configuration
> Cluster Topology
> Show Cluster Topology
> Show Cluster Topology

Command: OK    stdout: yes    stderr: no

Before command completion, additional instructions may appear below.

Cluster Description of Cluster cl1516
Cluster ID: 1516
Cluster Security Level Standard
There were 2 networks defined: H1, e1
There are 2 nodes in this cluster

NODE bf01n015:
    This node has 2 service interface(s):

    Service Interface swserv15:
        IP address:        9.21.77.225
        Hardware Address:
        Network:           H1
        Attribute:         private

    Service Interface swserv15 has a possible boot configuration:
        Boot (Alternate Service) Interface: sw_boot_15
        IP Address:        9.21.77.215
        Network:           H1
        Attribute:         private

    Service Interface swserv15 has no standby interfaces

    Service Interface bf01n015:
        IP address:        9.21.72.15
        Hardware Address:
        Network:           e1
        Attribute:         public

    Service Interface bf01n015 has no standby interfaces

NODE bf01n016:
    This node has 2 service interface(s):

    Service Interface swserv16:
        IP address:        9.21.77.226
        Hardware Address:
        Network:           H1
        Attribute:         private

    Service Interface swserv16 has a possible boot configuration:
        Boot (Alternate Service) Interface: sw_boot_16
        IP Address:        9.21.77.216
        Network:           H1
        Attribute:         private

    Service Interface swserv16 has no standby interfaces

    Service Interface bf01n016:
        IP address:        9.21.72.16
        Hardware Address:
        Network:           e1
        Attribute:         public

    Service Interface bf01n016 has no standby interfaces

Breakdown of network connections:

Connections to network H1
    Node bf01n015 is connected to network H1 by these interfaces:
        sw_boot_15
        swserv15
    Node bf01n016 is connected to network H1 by these interfaces:
        sw_boot_16
        swserv16

Connections to network e1
    Node bf01n015 is connected to network e1 by these interfaces:
        bf01n015
    Node bf01n016 is connected to network e1 by these interfaces:
        bf01n016

Alternatively, enter the following command:
# /usr/sbin/cluster/utilities/cllscf

5.5 Synchronize cluster topology

Synchronize cluster topology information on all cluster nodes defined in the
local topology database:
# smit hacmp
> Cluster Configuration
> Cluster Topology
> Synchronize Cluster Topology

                        Synchronize Cluster Topology

  Ignore Cluster Verification Errors?   [No]
* Emulate or Actual?                    [Actual]

Alternatively, enter the following command:
# /usr/sbin/cluster/utilities/cldare -t
Emulate will check your defined parameters and give a result based on the
correctness of those parameters. It will not physically test your setup, and
is not a substitute for actually testing the system.

5.6 Add a resource group

The order of the nodes is important, because a cascading resource group will
only be activated on the first node listed when HACMP is started.

Note: Listing the bf01n015 node first gives it a higher priority, and it will
acquire resources when HACMP starts. The bf01n016 node will acquire resources
only after the bf01n015 node fails.

Use the entries in Table 3 as a reference for adding the resource groups.

Table 3. Resource group to node relationship

Cluster  RG name  Relationship  Nodes
cl1314   rg1314   cascading     bf01n013, bf01n014
         rg1413   cascading     bf01n014, bf01n013
cl1516   rg1516   cascading     bf01n015, bf01n016
         rg1615   cascading     bf01n016, bf01n015

Add resource groups rg1314 and rg1413 to bf01n013, and resource groups rg1516
and rg1615 to bf01n015:
# smit hacmp
> Cluster Configuration
> Cluster Resources
> Define Resource Groups
> Add a Resource Group

                        Add a Resource Group

* Resource Group Name          [rg1516]
* Node Relationship             cascading
* Participating Node Names     [bf01n015 bf01n016]

Alternatively, enter the following command:
/usr/sbin/cluster/utilities/claddgrp -g rg1516 -r 'cascading' -n bf01n015 bf01n016

5.7 Add an application server

With our configuration, we will be adding four application servers, two per
cluster, as described in Table 4.

Table 4. Application server scripts

Cluster  Server name  Start script                Stop script
cl1314   as1314       /usr/bin/rc.db2pe.13.start  /usr/bin/rc.db2pe.13.stop
         as1413       /usr/bin/rc.db2pe.14.start  /usr/bin/rc.db2pe.14.stop
cl1516   as1516       /usr/bin/rc.db2pe.15.start  /usr/bin/rc.db2pe.15.stop
         as1615       /usr/bin/rc.db2pe.16.start  /usr/bin/rc.db2pe.16.stop
5.7 Add an application server

With our configuration, we will add four application servers, two per cluster, as described in Table 4.

Table 4. Application server scripts

    Cluster    Server name    Start script                    Stop script
    cl1314     as1314         /usr/bin/rc.db2pe.13.start      /usr/bin/rc.db2pe.13.stop
               as1413         /usr/bin/rc.db2pe.14.start      /usr/bin/rc.db2pe.14.stop
    cl1516     as1516         /usr/bin/rc.db2pe.15.start      /usr/bin/rc.db2pe.15.stop
               as1615         /usr/bin/rc.db2pe.16.start      /usr/bin/rc.db2pe.16.stop

The application server scripts must be accessible and executable from both nodes in the cluster, but they are not required to have the same content. The scripts must be created by the user, because HACMP does not set them up.

# smit hacmp > Cluster Configuration > Cluster Resources > Define Application Servers > Add an Application Server

    Add an Application Server

    Server Name              [as1516]
    Start Script             [/usr/bin/rc.db2pe.15.start]
    Stop Script              [/usr/bin/rc.db2pe.15.stop]

When setting up the cl1314 cluster, we need to take into account that the $HOME directory for the instance is itself a resource in the cluster. With this in mind, we set up the application servers as follows.

The contents of /usr/bin/rc.db2pe.13.start are:

    /usr/bin/rc.db2pe svtha1 NFS SERVER start
    /usr/bin/rc.db2pe svtha1 130,131 140 start

and the contents of /usr/bin/rc.db2pe.13.stop are:

    /usr/bin/rc.db2pe svtha1 130,131 140 stop
    /usr/bin/rc.db2pe svtha1 NFS SERVER stop

The contents of /usr/bin/rc.db2pe.14.start are:

    /usr/bin/rc.db2pe svtha1 140 130,131 start

and the contents of /usr/bin/rc.db2pe.14.stop are:

    /usr/bin/rc.db2pe svtha1 140 130,131 stop

The contents of /usr/bin/rc.db2pe.15.start are:

    /usr/bin/rc.db2pe svtha1 150 160,161 start

and the contents of /usr/bin/rc.db2pe.15.stop are:

    /usr/bin/rc.db2pe svtha1 150 160,161 stop

The contents of /usr/bin/rc.db2pe.16.start are:

    /usr/bin/rc.db2pe svtha1 160,161 150 start

and the contents of /usr/bin/rc.db2pe.16.stop are:

    /usr/bin/rc.db2pe svtha1 160,161 150 stop
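HACMP only records the paths of these scripts, so they must exist on both nodes before synchronization. A minimal sketch that creates the pair for partition 150 on the bf01n015 node and copies it to the partner (the use of rcp assumes the same rhosts-style trust already required by DB2 EEE):

    # Create the start and stop scripts with the contents listed above
    cat > /usr/bin/rc.db2pe.15.start <<EOF
    /usr/bin/rc.db2pe svtha1 150 160,161 start
    EOF
    cat > /usr/bin/rc.db2pe.15.stop <<EOF
    /usr/bin/rc.db2pe svtha1 150 160,161 stop
    EOF
    # Make them executable and copy them to the takeover node
    chmod +x /usr/bin/rc.db2pe.15.start /usr/bin/rc.db2pe.15.stop
    rcp -p /usr/bin/rc.db2pe.15.* bf01n016:/usr/bin/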
The syntax of rc.db2pe for DB2 database partitions is:

    rc.db2pe <instance> <partitions on the primary node> <partitions on the secondary node> <start | stop>

or, for the NFS server nodes:

    rc.db2pe <instance> NFS SERVER <start | stop>

The primary purpose of the rc.db2pe script is to construct and execute the db2start command with the correct restart parameters, so that the DB2 database partitions can recover during a failover and failback. Refer to section 6.7, “SQL6030 RC=15, no port 0 defined in db2nodes.cfg file” on page 38 for additional information on proper use of the db2start command with the restart option.
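As a rough illustration of what rc.db2pe constructs (the values below are placeholders, not the script’s actual logic): if partition 150 fails over to the bf01n016 node, which already runs logical ports 0 and 1 for partitions 160 and 161, the restart command would take a form along the lines of:

    su - svtha1 -c 'db2start nodenum 150 restart hostname b_sw_016 port 2 netname b_sw_016'

The restart clause updates the matching line of db2nodes.cfg so that partition 150 is brought up on the surviving node under a logical port that does not collide with the partitions already there.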
5.8 Configure resources for the resource group

Use Table 5 as a reference when configuring the resources.

Table 5. Resource group configuration information

    Resource group    Service IP    Filesystems                  Volume group    Application servers
    rg1314            swserv13      /db1ha/svtha1/NODE0130       havg1314        as1314
                                    /db1ha/svtha1/NODE0131
                                    /homehalocal
    rg1413            swserv14      /db1ha/svtha1/NODE0140       havg1413        as1413
    rg1516            swserv15      /db1ha/svtha1/NODE0150       havg1516        as1516
    rg1615            swserv16      /db1ha/svtha1/NODE0160       havg1615        as1615
                                    /db1ha/svtha1/NODE0161

# smit hacmp > Cluster Configuration > Cluster Resources > Change/Show Resources for a Resource Group
< select the Resource Group >

    Configure a Resource Group

    Resource Group Name                           rg1516
    Node Relationship                             cascading
    Participating Node Names                      bf01n015 bf01n016
    Service IP Label                              [swserv15]                   +
    HTY Service IP Label                          []
    Filesystems                                   [/db1ha/svtha1/NODE0150]     +
    Filesystems Consistency Check                 fsck                         +
    Filesystems Recovery Method                   sequential                   +
    Filesystems to Export                         []                           +
    Filesystems to NFS Mount                      []                           +
    Volume Groups                                 [havg1516]                   +
    Concurrent Volume Groups                      []                           +
    Raw Disk PVIDs                                []                           +
    AIX Connections Services                      []                           +
    Application Servers                           [as1516]                     +
    Miscellaneous Data                            []
    Inactive Takeover Activated                   false                        +
    9333 Disk Fencing Activated                   false                        +
    SSA Disk Fencing Activated                    false                        +
    Filesystems mounted before IP configured      false                        +

Note: You only need to identify the file systems to be taken over; logical volumes (raw devices) are taken over implicitly as part of the volume group.

5.9 Synchronize cluster resources

Synchronize the cluster resource information on all cluster nodes defined in the local topology database:

> Cluster Configuration > Cluster Resources > Synchronize Cluster Resources

    Synchronize Cluster Resources

    Ignore Cluster Verification Errors?           [No]
    Un/Configure Cluster Resources?               [Yes]
    Emulate or Actual?                            [Actual]

Alternatively, enter the following command:

# /usr/sbin/cluster/utilities/cldare -r

5.10 Show resource information by resource group

Show the resource configuration associated with the group name:

> Cluster Configuration > Cluster Resources > Show Cluster Resources > Show Resource Information by Resource Group
< select the Resource Group >

                              COMMAND STATUS
    Command: OK            stdout: yes           stderr: no
    Before command completion, additional instructions may appear below.

    Resource Group Name                           rg1516
    Node Relationship                             cascading
    Participating Node Name(s)                    bf01n015 bf01n016
    Service IP Label                              swserv15
    HTY Service IP Label
    Filesystems                                   /db1ha/svtha1/NODE0150
    Filesystems Consistency Check                 fsck
    Filesystems Recovery Method                   sequential
    Filesystems to be exported
    Filesystems to be NFS mounted
    Volume Groups                                 havg1516
    Concurrent Volume Groups
    Disks
    AIX Connections Services
    Application Servers                           AS1516
    Miscellaneous Data
    Inactive Takeover                             false
    9333 Disk Fencing                             false
    SSA Disk Fencing                              false
    Filesystems mounted before IP configured      false

    Run Time Parameters:

    Node Name                                     bf01n015
    Debug Level                                   high
    Host uses NIS or Name Server                  false

    Node Name                                     bf01n016
    Debug Level                                   high
    Host uses NIS or Name Server                  false

Alternatively, enter the following command:

# /usr/sbin/cluster/utilities/clshowres -g rg1516

Note: This is done as a cross-reference to ensure that the prior steps were completed correctly.

5.11 Verify cluster

Verify the cluster topology, resources, and custom-defined verification methods:

# smit hacmp > Cluster Configuration > Cluster Verification > Verify Cluster

    Verify Cluster

    Base HACMP Verification Methods               both
        (Cluster topology, resources, both, none)
    Custom Defined Verification Methods           [All]
    Error Count                                   []
    Log File to store output                      []

Alternatively, enter the following command:
# /usr/sbin/cluster/diag/clconfig -v '-tr' -m 'All'

Your DB2 UDB and HACMP/ES setup is now complete.

Note: Be sure to start and stop the cluster with the HACMP commands smit clstart and smit clstop, respectively.
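To confirm that the cluster services actually came up after smit clstart, you can query the AIX system resource controller and watch the HACMP event log (the subsystem group name cluster and the log location /tmp/hacmp.out are the usual defaults):

    # lssrc -g cluster
    # tail -f /tmp/hacmp.out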
Chapter 6. Troubleshooting

This chapter documents hints and tips that address some situations that may occur during an HACMP and DB2 UDB EEE V7.2 installation and configuration.

6.1 SQL6048 on db2start command

If you receive SQL6048 from db2start, you must not only make sure that the $HOME/.rhosts file is created with the correct entries, but also ensure that it has the correct permissions. The following is taken from the rsh entry in the AIX system man pages:

    If rsh is to consult an .rhosts file on the remote machine, the file must have UNIX protections no more liberal than -rw-r--r--. If .rhosts resides in a user home directory in AFS, the home directory must also grant the LOOKUP and READ rights to system:anyuser.

To help narrow down the area that needs correcting, use the db2_all date command. If you get Permission denied messages for each of the partitions, try the rsh date command. If you get a Permission denied message from AIX, then the .rhosts file is most likely the problem.

For example, if the .rhosts file has the following permissions set:

    -rwxrwxrwx 1 svtha1 build 192 Feb 26 10:38 .rhosts

the db2start command produces the following output:

    05-23-2001 09:12:04   130   0   SQL6048N A communication error occurred during START or STOP DATABASE MANAGER processing.
    05-23-2001 09:12:05   131   0   SQL6048N A communication error occurred during START or STOP DATABASE MANAGER processing.
    05-23-2001 09:12:06   140   0   SQL6048N A communication error occurred during START or STOP DATABASE MANAGER processing.
    05-23-2001 09:12:07   150   0   SQL6048N A communication error occurred during START or STOP DATABASE MANAGER processing.
    05-23-2001 09:12:08   160   0   SQL6048N A communication error occurred during START or STOP DATABASE MANAGER processing.
    05-23-2001 09:12:10   161   0   SQL6048N A communication error occurred during START or STOP DATABASE MANAGER processing.
    SQL1032N No start database manager command was issued. SQLSTATE=57019

If you issue the db2_all date command, you will get a Permission denied message for each partition. Now try the rsh command:

    rsh bf01n015 date

You will get the same Permission denied message from rsh. Change the permissions on the .rhosts file with the chmod command:

    chmod 600 .rhosts

The rsh command should now return a response, and the db2_all date command will return a response for each of the active partitions. The db2start command should now work, assuming everything else in the cluster is working correctly.
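Because the instance home directory in this configuration is NFS-mounted on every node, a single chmod corrects the file for the whole cluster. A quick sketch to fix the permissions and then re-test each host (host names as used in this document):

    $ chmod 600 $HOME/.rhosts
    $ for h in bf01n013 bf01n014 bf01n015 bf01n016
    > do
    >     rsh $h date
    > done
    $ db2_all date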
6.2 SQL6031 returned when issuing the db2 “? SQL6031” command

This error can indicate a problem with the entries in the db2nodes.cfg file. Following is an example:

    $ db2start
    SQL6031N Error in the db2nodes.cfg file at line number “2”. Reason code “9”.
    $ db2 “? sql6031”
    SQL6031N Error in the db2nodes.cfg file at line number “2”. Reason code “9”.
    $ cat db2nodes.cfg
    130 b_sw_013 1 b_sw_013
    131 b_sw_013 1 b_sw_013
    140 b_sw_014 0 b_sw_014
    150 b_sw_015 0 b_sw_015
    160 b_sw_016 0 b_sw_016
    161 b_sw_016 1 b_sw_016

Lines 1 and 2 both use the hostname/port couple b_sw_013:1, so the fix for db2nodes.cfg is to give partition 130 its own logical port:

    130 b_sw_013 0 b_sw_013
    131 b_sw_013 1 b_sw_013
    140 b_sw_014 0 b_sw_014
    150 b_sw_015 0 b_sw_015
    160 b_sw_016 0 b_sw_016
    161 b_sw_016 1 b_sw_016
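For reason code 9 in particular, a small awk check can flag duplicate hostname/port couples before you retry db2start. A minimal sketch, assuming the standard file location (lines that omit the port field default to port 0 and are not normalized here):

    $ awk '{ key = $2 ":" $3 }
           seen[key]++ { print "duplicate hostname/port couple at line " NR ": " $0 }' \
          $HOME/sqllib/db2nodes.cfg

Run against the faulty file above, this prints line 2 (131 b_sw_013 1 b_sw_013), the second use of the b_sw_013:1 couple.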
Following is the entire output of the db2 “? SQL6031” command:

    SQL6031N Error in the db2nodes.cfg file at line number “<line>”. Reason code “<reason-code>”.

    Explanation: The statement cannot be processed because of a problem with the db2nodes.cfg file, as indicated by the following reason codes:

    (1)  Cannot access the sqllib directory of the instance.
    (2)  The full path name added to the db2nodes.cfg filename is too long.
    (3)  Cannot open the db2nodes.cfg file in the sqllib directory.
    (4)  A syntax error exists at line “<line>” of the db2nodes.cfg file in the sqllib directory.
    (5)  The nodenum value at line “<line>” of the db2nodes.cfg file in the sqllib directory is not valid.
    (6)  The nodenum value at line “<line>” of the db2nodes.cfg file in the sqllib directory is out of sequence.
    (7)  The nodenum value at line “<line>” of the db2nodes.cfg file in the sqllib directory is not unique.
    (8)  The port value at line “<line>” of the db2nodes.cfg file in the sqllib directory is not valid.
    (9)  The hostname/port couple at line “<line>” of the db2nodes.cfg file in the sqllib directory is not unique.
    (10) The hostname at line “<line>” of the db2nodes.cfg file in the sqllib directory is not valid.
    (11) The port value at line “<line>” of the db2nodes.cfg file in the sqllib directory is not defined for your DB2 instance id in the services file (/etc/services on UNIX-based systems).
    (12) The port value at line “<line>” of the db2nodes.cfg file in the sqllib directory is not in the valid port range defined for your DB2 instance id in the services file (/etc/services on UNIX-based systems).
    (13) The hostname value at line “<line>” of the db2nodes.cfg file in the sqllib directory has no corresponding port 0.
    (14) A db2nodes.cfg file with more than one entry exists, but the database manager configuration is not MPP.
    (15) The netname at line “<line>” of the db2nodes.cfg file in the sqllib directory is not valid.

    User Response: The action corresponding to the reason code is:

    (1) Ensure that the $DB2INSTANCE userid has the required permissions to access the sqllib directory of the instance.
    (2) Make the instance home directory path name shorter.
    (3) Ensure that the db2nodes.cfg file exists in the sqllib directory and is not empty.
    (4) Ensure that at least 2 values are defined per line in the db2nodes.cfg file and that the file does not contain blank lines.
    (5) Ensure that the nodenum value defined in the db2nodes.cfg file is between 0 and 999.
    (6) Ensure that all the nodenum values defined in the db2nodes.cfg file are in ascending order.
    (7) Ensure that each nodenum value defined in the db2nodes.cfg file is unique.
    (8) Ensure that the port value is between 0 and 999.
