Chapter5_PlatformUsersGuide
- 1. Copyright © 1997-2008 Platform Manager 5.7.1 User’s Guide 1
Chapter 5 - High Availability
The high availability feature allows you to configure failover protection for a Platform Manager Server and a
gateway node, where additional services can be protected.
5.1 Introduction to high availability
By nature, HA cannot be an “out-of-the-box” feature. You and your vendors will need to design the topology of
backup servers that you want to use and then configure the high availability feature accordingly. Expect this work
to take two to three days per installation on average.
The high availability (HA) option provides fault tolerance for:
The Platform Manager Server
The Cluster Gateways set up by Platform Manager
Other services that comply with standard specifications, such as HTTP, NFS, and Samba
Note: Workload Management systems (such as PBS Pro) often have their own
HA options. You need to purchase the vendor's own HA option for each such product. Please
contact the vendor for its solution.
Platform Manager employs the Active-Passive model of HA. This means that there is a secondary node with a fully
redundant instance of the Platform Manager Server and/or one secondary node for each protected gateway. These
secondary nodes remain passive until their associated primary nodes fail. This configuration requires more additional
hardware than any other topology, but it also provides the greatest protection against system failure.
To access services on the cluster while the cluster host is down, there must be a “cluster logical host.” This is a
network address or a host name that is not tied solely to any given node, but rather linked to the services provided by the
cluster. This allows you to restart a service, for example a database, on a redundant node during a failure: you
temporarily assign that network address/host name to the redundant node so users can continue to interact with the database.
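As a rough illustration of what "assigning the logical address to a node" amounts to on Linux, the sketch below only prints the commands involved; it does not run them. It assumes the iproute2 "ip" tool, a /24 prefix length, and interface eth0, none of which are specified in this guide. The function names are hypothetical. In a configured HA group the HA feature moves the floating address itself, so this is for understanding only.

```shell
#!/bin/sh
# Sketch only: print (do not execute) the commands that bind or release
# a logical address on an interface.
# Assumptions: Linux iproute2 "ip", prefix length 24, interface eth0.

# usage: logical_host_takeover_cmd ADDRESS PREFIXLEN INTERFACE
logical_host_takeover_cmd() {
    echo "ip addr add ${1}/${2} dev ${3}"
}

# usage: logical_host_release_cmd ADDRESS PREFIXLEN INTERFACE
logical_host_release_cmd() {
    echo "ip addr del ${1}/${2} dev ${3}"
}

# Example: the redundant node takes over the logical address.
logical_host_takeover_cmd 172.19.5.200 24 eth0
```

Printing the commands instead of executing them keeps the sketch safe to run on any machine.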
WARNING — Use non-volatile shared storage (NAS/SAN) to store as much of the application
state as possible so that the application can restart in its last state before the failure, on another
node. The Platform Manager Server itself requires a hardware-based shared storage solution
(such as SAN) to be present before you can set up the HA option. You and your hardware vendor
must establish a shared storage solution before attempting to deploy the HA feature. It will not
be possible to implement the HA option without meeting this requirement.
Furthermore, applications in a high availability cluster environment must satisfy these technical requirements:
The application must be easy to start, stop, and force-stop.
The status of the application must be possible to monitor.
Multiple instances of the application must be supported.
The application must be controllable through scripts or a CLI.
Data must not be corrupted if a node crashes or if the application is restarted from a saved state.
Licensing terms must be observed.
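The start, stop, force-stop, and status requirements above are commonly met with a small wrapper script. The following is a minimal sketch for a hypothetical application, not a Platform Manager script: APP_CMD, PIDFILE, and the function names are placeholders, and "sleep 300" merely stands in for a real service. The status result follows the usual init-script convention (0 means running, 3 means not running).

```shell
#!/bin/sh
# Sketch of a start/stop/force-stop/status wrapper for a hypothetical
# application. APP_CMD and PIDFILE are placeholders chosen for illustration.

APP_CMD="${APP_CMD:-sleep 300}"
PIDFILE="${PIDFILE:-/tmp/example-app.pid}"

app_status() {
    # 0 = running, 3 = not running
    [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null && return 0
    return 3
}

app_start() {
    app_status && return 0            # already running: nothing to do
    $APP_CMD >/dev/null 2>&1 &        # launch in the background
    echo $! > "$PIDFILE"              # remember its process ID
}

app_stop() {
    app_status || { rm -f "$PIDFILE"; return 0; }
    kill "$(cat "$PIDFILE")" 2>/dev/null   # polite termination (SIGTERM)
    rm -f "$PIDFILE"
}

app_force_stop() {
    [ -f "$PIDFILE" ] && kill -9 "$(cat "$PIDFILE")" 2>/dev/null
    rm -f "$PIDFILE"
    return 0
}

case "${1:-}" in
    start)      app_start ;;
    stop)       app_stop ;;
    force-stop) app_force_stop ;;
    status)     app_status ;;
esac
```

Saved as a script, it would be called with one of start, stop, force-stop, or status as its first argument. Starting an already running application and stopping an already stopped one both succeed, which makes repeated calls during failover harmless.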
Note: The Platform Manager/HA option requires a product key for the HA feature on each
cluster. The key must permit activation on every node instance, and it must actually be
activated on each one. Please obtain licenses from your Platform sales representative or visit
http://www.platform.com.
Normally, you would install the high availability feature on the Platform Manager Server and/or on the gateways in
your clusters. Compute nodes do not generally need this functionality.
5.2 Installing high availability on a gateway
We will assume the following values for installation of the high availability feature on your gateway.
PM_VERSION="5.7.1"
PRIMARY="dl360g3-2"
PRIMARY_EXT_IP="172.19.5.21"
PRIMARY_INT_IP="11.0.0.21"
SECONDARY="dl360g3-3"
SECONDARY_EXT_IP="172.19.5.22"
SECONDARY_INT_IP="11.0.0.22"
HAGROUPNAME="HAgateway"
HAGROUP_EXT_IP="172.19.5.200"
HAGROUP_INT_IP="11.0.0.200"
UDPPORT="10694"
PRIVATENODES="dl360g3-6"
WARNING — Do not use the floating IPs for heartbeat channels.
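The warning above can be checked mechanically before any heartbeat channel is defined. The sketch below is only an illustration: it reuses the floating addresses from the example values, the function name heartbeat_addr_ok is hypothetical, and the candidate addresses are the secondary node's fixed addresses from the same list.

```shell
#!/bin/sh
# Sketch: reject heartbeat addresses that coincide with a floating IP.
# The function name and the way candidates are listed are illustrative only.

HAGROUP_EXT_IP="172.19.5.200"
HAGROUP_INT_IP="11.0.0.200"

# usage: heartbeat_addr_ok ADDRESS
# returns 0 if ADDRESS is not one of the floating IPs, 1 otherwise
heartbeat_addr_ok() {
    for floating in "$HAGROUP_EXT_IP" "$HAGROUP_INT_IP"; do
        if [ "$1" = "$floating" ]; then
            echo "heartbeat address $1 is a floating IP" >&2
            return 1
        fi
    done
    return 0
}

# Candidate heartbeat peers: the secondary node's fixed addresses.
for addr in 172.19.5.22 11.0.0.22; do
    if heartbeat_addr_ok "$addr"; then
        echo "ok: $addr"
    fi
done
```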
Note: Make sure to follow the additional steps if you already have systems installed.
Figure 123 — high availability Topology on a gateway
To install high availability functionality on a cluster gateway:
1. Do a normal bootstrap of the Platform Manager Server.
2. Define primary Platform Manager Server and private nodes in the Server Creation Wizard (private cluster)
or pmcli. See “Creating a node with pmcli” on page 344.
3. Define the secondary node using the Server Creation Wizard (extend cluster /independent server) or the
pmcli.
4. OPTIONAL: If some or all of the defined systems are already installed, make sure they are
working properly. You can also install all the systems defined above at this point, before continuing.
If systems are already installed, be sure to follow the optional steps as well.
5. Create the HA group and add the floating Ethernet interfaces. The IP addresses of these interfaces will start
on the active gateway and move in case of a failover.
Note: The ethX value used in the addhaethernetinterface command determines which of the physical
interfaces on the active gateway will start the floating IPs.
6. Add the primary HA group member (primary HA gateway).
7. Add heartbeat channels to the primary HA gateway. We recommend using the “unicast” and
“serial” methods and using as many different channels as possible for redundancy.
Note: Do not use floating IPs for heartbeat channels.
8. Set the udp-port if necessary. The default value is 694. There are two common reasons for
overriding this value:
There are multiple HA groups using the “broadcast” channel method on the same subnet, or
This port is already in use in accordance with some locally established policy.
9. Move the services to the HA group. This controls which services are run on the gateway as
HA services.
10. Start HA on the primary gateway. This will enable an HA group running Heartbeat with one
member system.
11. Set the gateway of the private nodes to be the internal floating HA group IP address.
12. Configure the nodes on the private net to use the new gateway IP.
Note: Everything should now work normally as if HA was never there.
13. Add the secondary HA group member (secondary HA gateway).
14. Add heartbeat channels to the secondary HA gateway. See the note on udp-port above.
15. Install the software for the selected HA services on the secondary node. This is a crucial step to
make HA work. If a failover occurs and the software fails to install on the secondary gateway,
the failover procedure will fail and both gateways will go down. Which software to
install depends on the services moved with the “moveservices” command(s) above.
16. Reconfigure the primary gateway and install the secondary gateway if not installed earlier.
17. If the secondary gateway was already installed, reconfigure the gateway systems.
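Step 8 gives two reasons for overriding the default udp-port. A candidate port can be sanity-checked before it is set, as in the sketch below. This is an illustration under assumptions: the helper names valid_udp_port and port_in_list are hypothetical, and the list of ports used by other HA groups is an example that you would replace with your own site's values.

```shell
#!/bin/sh
# Sketch: validate a candidate heartbeat UDP port before using it.
# valid_udp_port and port_in_list are hypothetical helper names.

DEFAULT_UDPPORT=694

# usage: valid_udp_port PORT ; returns 0 if PORT is a number in 1..65535
valid_udp_port() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;      # empty or contains a non-digit
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

# usage: port_in_list PORT USED_PORT... ; returns 0 if PORT is already taken
port_in_list() {
    wanted="$1"; shift
    for used in "$@"; do
        [ "$wanted" = "$used" ] && return 0
    done
    return 1
}

UDPPORT="10694"
# Ports used by other HA groups on the same subnet (example values, assumed).
OTHER_HA_GROUP_PORTS="$DEFAULT_UDPPORT 11694"

if valid_udp_port "$UDPPORT" && ! port_in_list "$UDPPORT" $OTHER_HA_GROUP_PORTS; then
    echo "udp-port $UDPPORT is usable"
else
    echo "choose a different udp-port" >&2
fi
```

The check covers only the first reason (another HA group on the same subnet). Whether a port conflicts with local policy still has to be confirmed with whoever maintains that policy.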
5.3 Installing and Configuring High Availability on the Platform Manager Server
In this chapter’s examples, we will assume the following values for installation of the high availability feature on
your Platform Manager Server:
PM_VERSION="5.7.1"
PRIMARY="dl360g3-1"
PRIMARY_EXT_IP="172.19.5.20"
PRIMARY_INT_IP="11.0.0.20"
SECONDARY="dl360g3-2"
SECONDARY_EXT_IP="172.19.5.21"
SECONDARY_INT_IP="11.0.0.21"
HAGROUPNAME="HAPMServer"
HAGROUP_EXT_IP="172.19.5.200"
HAGROUP_INT_IP="11.0.0.200"
UDPPORT="11694"
PRIVATENODES="dl360g3-6"
JDK_RPM="jdk-1_5_0_12-linux-i586.rpm"
CAUTION — Read Installing Platform Manager high availability on a Gateway and understand the
procedure before reading this!
Figure 124 — High Availability Topology on a Server
Note: Make sure to follow the additional steps if you have already installed systems.
1. Do a normal bootstrap of the primary Platform Manager Server.
2. OPTIONAL: Set up NAT on the primary Platform Manager Server. See “Server Creation Wizard:
Configuring a Network Gateway” on page 37, “About the NAT Settings Tab” on page 106, or
“addnatservice” on page 279.
3. Add the HA product key if you have not already done this.
4. OPTIONAL: put the following file systems on external storage to migrate to HAPMServer
5. Enable power and console control on the primary Platform Manager Server.
6. Define the SECONDARY node with the Server Creation Wizard (independent servers) or pmcli.
WARNING — Do not run pmgui during the HA Platform Manager Server setup!
OPTIONAL: If some or all defined systems (in addition to primary Platform Manager Server) are already
installed, make sure all systems are installed and working properly.
Note: It is now possible to install all systems defined above, before continuing.
7. Create the HA group.
8. Add the floating Ethernet interfaces.
9. Add the primary Platform Manager Server with heartbeat channels.
10. Install JDK on PRIMARY and reconfigure the primary to enable the new floating IP(s).
11. Move the services to the Platform Manager Server HA group. See the partial script below for how to move
the services individually.
12. Move the Platform Manager Configuration Database to shared storage. Mount the shared storage manually
and start the DB again.
13. Add the shared disk mount for the Platform Manager Configuration Database to the HA group.
14. Move the repository to shared storage.
15. Move tftpboot
16. Move images
17. OPTIONAL: Move console logs
18. Add the shared disk mount to the HA group:
19. Start HA on the primary Platform Manager Server
21. Add the secondary Platform Manager Server, with heartbeat channels, to the HA group.
22. Install the Platform Manager software
23. Add vLAN interfaces for connecting to your local shared storage and add the scance iSCSI plugin
24. Reconfigure the primary Platform Manager Server and install the secondary Platform Manager Server if
not previously installed.
25. If the secondary Platform Manager Server was previously installed:
pmcli reconfigure ${PRIMARY}
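Several of the steps above (12 to 18) depend on the shared storage being mounted before the database or other services are started on it. A simple guard can confirm that, as sketched below. This is not part of Platform Manager: the function name is_mounted and the mount point /shared/pmdb are made up for illustration, and the check assumes a Linux system that exposes /proc/mounts.

```shell
#!/bin/sh
# Sketch: confirm that a shared-storage mount point is really mounted
# before starting anything that keeps its data there.
# The mount point /shared/pmdb is a made-up example path.

# usage: is_mounted MOUNTPOINT [MOUNTS_FILE]
# returns 0 if MOUNTPOINT appears as a mount point in MOUNTS_FILE
is_mounted() {
    mounts_file="${2:-/proc/mounts}"
    # Field 2 of each line in /proc/mounts is the mount point.
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit found ? 0 : 1 }' "$mounts_file"
}

if is_mounted /shared/pmdb; then
    echo "shared storage is mounted"
else
    echo "shared storage is NOT mounted; do not start the database" >&2
fi
```

The optional second argument lets the check be exercised against a saved copy of the mount table instead of the live system.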