Cluster Control Protocol Reference
NG FP3
For additional technical information about Check Point products, consult Check Point’s
SecureKnowledge database at
http://support.checkpoint.com/kb/
Preface
Introduction
This document explores various technical aspects of the Cluster Control Protocol as utilized by ClusterXL. Although
parts of the Cluster Control Protocol are also used by OPSEC High Availability products, this aspect will not be covered.
This document is not meant as an installation guide, and assumes the reader has a working knowledge of the ClusterXL
product.
Overview
Introducing ClusterXL into an existing network often requires that certain changes be made to that network. Such
alterations are needed to accommodate both the clustered topology and the Cluster Control Protocol itself.
Understanding these changes is important for the purposes of planning, implementation, monitoring, and troubleshooting.
Moreover, adequate comprehension of the ClusterXL decision making process is needed. Otherwise, there will be no
context in which to place observed cluster behavior.
Therefore, the enclosed sections will explore the following topics:
- Implementation Planning
- Cluster Control Protocol Overview
- Cluster Control Protocol Logic
Implementation Planning
NOTE: Due to the enhanced Hot/Standby configuration available in FP3, Legacy HA will not be covered in the
following sections.
High Availability (New Mode)
FP3 introduces a new form of operation for High Availability. Simply referred to as "New Mode", it offers all the
topology advantages of Load Sharing, while maintaining a Hot/Standby orientation.
Important factors to consider while planning for a New Mode implementation are:
- Switch support/configuration for layer two multicast forwarding
- VLAN configuration
- IP address migration
- SmartCenter/CMA location
Switch Support
The Cluster Control Protocol used by both New Mode and Load Sharing configurations makes use of layer two
multicast. In keeping with multicast standards, this multicast address is used only as the destination, and is used in all
CCP packets sent on "non-secured" interfaces.
A layer two switch connected to non-secured interfaces must be capable of forwarding multicast packets to ports within
that VLAN. It is acceptable for the switch to forward such traffic to all ports within the given VLAN. However, it is
considered more efficient to forward only to those ports connecting cluster members.
The steps needed to enable multicast support will vary according to the switch vendor and model. Please check your
switch documentation for details.
If the connecting switch is incapable of forwarding multicast, CCP can be changed to use broadcast instead. To switch
between these two modes, use one of the following commands (the mode survives a reboot):
'cphaconf set_ccp broadcast'
'cphaconf set_ccp multicast'
VLAN Configuration
It is not recommended to connect the non-secured interfaces of multiple clusters to the same VLAN. Doing so will cause
the connecting switch ports to flap. If such a need exists, a separate VLAN and/or switch will be needed for each cluster.
A HotFix is also available from Check Point support which allows this configuration to be supported in FP3.
Connecting together the secured interfaces of multiple clusters is also not recommended, for the same reason. While the
above-mentioned HotFix may be used, there are additional concerns with this configuration which make it currently
unsupported. Therefore, it is best to connect the secured interfaces of a given cluster via a crossover link when possible,
or to an isolated VLAN.
IP Address Migration
It is reasonable to assume that many ClusterXL installs will be either for a new VPN-1/FireWall-1 cluster, or to replace a
different clustering solution. However, it is also reasonable to assume that many will be to provide high availability to an
existing single gateway configuration.
In the latter case, existing NAT and IPSec connections will need to be altered to accommodate the new clustered
landscape. Therefore, it is recommended, when feasible, to take the existing IP addresses from the current gateway and
make these the cluster-owned VIPs (cluster addresses). Doing so will avoid altering current IPSec endpoint identities, as
well as keep Hide NAT configurations the same in many cases.
SmartCenter/CMA Location
A SmartCenter/CMA Server can install a Security Policy to one or more clusters with only a single install action. It does
so by installing the Security Policy to each cluster member, using the general tab IP of each cluster member object as the
recipient. This is true regardless of the IP address(es) of the cluster object itself.
This design affords a great level of flexibility when using either New Mode or Load Sharing configurations, as the
SmartCenter/CMA Server can now reside on any given IP segment. One only needs to ensure that the general tab IP
address of the cluster member object is reachable. If it is not, simply choose an address which will be accessible to the
SmartCenter/CMA Server.
Load Sharing
Important factors to consider while planning for a Load Sharing implementation are:
- Switch support/configuration for layer two multicast forwarding
- Router support for multicast
- VLAN configuration
- IP address migration
- SmartCenter/CMA location
Switch Support
The Cluster Control Protocol used by both New Mode and Load Sharing configurations makes use of layer two
multicast. In keeping with multicast standards, this multicast address is used only as the destination, and is used in all
CCP packets sent on "non-secured" interfaces.
A layer two switch connected to non-secured interfaces must be capable of forwarding multicast packets to ports within
that VLAN. It is acceptable for the switch to forward such traffic to all ports within the given VLAN. However, it is
considered more efficient to forward only to those ports connecting cluster members.
The steps needed to enable multicast support will vary according to the switch vendor and model. Please check your
switch documentation for details.
If the connecting switch is incapable of forwarding multicast, CCP can be changed to use broadcast instead. To switch
between these two modes, use one of the following commands (the mode survives a reboot):
'cphaconf set_ccp broadcast'
'cphaconf set_ccp multicast'
Router Support
In addition to the use of multicast by CCP, Load Sharing associates a multicast MAC with each configured cluster IP. This
design ensures that traffic destined to the cluster is received by all members.
Therefore, ARP replies sent by a cluster member will indicate that the unicast cluster IP is reachable via a multicast
MAC. Some routing devices are incapable of accepting such ARP replies. For instance, no version of Cisco IOS
includes such support. In such cases, adding a static ARP entry for the cluster IP on the routing device will solve the issue.
Even so, some routers, such as Extreme routers, Avia routers, and some Nortel models (Passport 1200 or XLR),
will not accept this type of static ARP entry. For those cases, ClusterXL FP4 will introduce a new mode of
operation referred to as Pivot mode. Pivot mode operates as a Load Sharing cluster, but without the need for multicast for
the cluster addresses.
VLAN Configuration
It is not recommended to connect the non-secured interfaces of multiple clusters to the same VLAN. Doing so will cause
the connecting switch ports to flap. If such a need exists, a separate VLAN and/or switch will be needed for each cluster.
A HotFix is also available from Check Point support which allows this configuration to be supported.
Connecting together the secured interfaces of multiple clusters is also not recommended, for the same reason. While the
above-mentioned HotFix may be used, there are additional concerns with this configuration which make it currently
unsupported. Therefore, it is best to connect the secured interfaces of a given cluster via a crossover link when possible,
or to an isolated VLAN.
IP Address Migration
It is reasonable to assume that many ClusterXL installs will be either for a new VPN-1/FireWall-1 cluster, or to replace a
different clustering solution. However, it is also reasonable to assume that many will be to provide high availability to an
existing single gateway configuration.
In the latter case, existing NAT and IPSec connections will need to be altered to accommodate the new clustered
landscape. Therefore, it is recommended, when feasible, to take the existing IP addresses from the current gateway and
make these the cluster-owned VIPs (cluster addresses). Doing so will avoid altering current IPSec endpoint identities, as
well as keep Hide NAT configurations the same in many cases.
SmartCenter/CMA Location
A SmartCenter/CMA Server can install a Security Policy to one or more clusters with only a single install action. It does
so by installing the Security Policy to each cluster member, using the general tab IP of each cluster member object as the
recipient. This is true regardless of the IP address(es) of the cluster object itself.
This design affords a great level of flexibility when using either New Mode or Load Sharing configurations, as the
SmartCenter/CMA Server can reside on any given IP segment. One only needs to ensure that the general tab IP address
of the cluster member object is reachable. If it is not, simply choose an address which will be accessible to the
SmartCenter/CMA Server.
Cluster Control Protocol
CCP Overview
The Cluster Control Protocol plays an integral role in the operation of ClusterXL. Specifically, CCP is responsible for
the following:
- Health status reports
- Cluster member probing
- State change commands
- Querying for cluster membership
- State table synchronization
Health Status Reports
CCP will report the status of a cluster member roughly three times a second, per interface. These reports contain the
state of the transmitting cluster member, as well as the presumed state of other cluster members.
Cluster Member Probing
If a cluster member fails to receive status for another member on a given segment, CCP will probe that segment in an
attempt to elicit a response. The purpose of such probes is to detect the nature of possible interface failures, and to
determine which module has the problem. The outcome of this probe will determine what action is taken next.
State Change Commands
If a cluster member wishes to change state, the command to do so takes place on the defined secured interface.
Querying Cluster Membership
When a cluster member comes online, such as after a reboot, it will send a series of CCP query/response messages to
gain knowledge of its cluster membership.
State Table Synchronization
When state synchronization is enabled, connection information is updated between cluster members on the defined
secured interface.
CCP Message Format
The Cluster Control Protocol payload is made up of a general heading and one of a series of message types, each
having its own unique purpose, format, and content.
General Heading
This portion contains information necessary for the processing of the encapsulated message type, the most important of
which is:
- Cluster ID - unique identifier shared amongst all members of a given cluster
- Protocol Version - version and Feature Pack revision
- Source Interface - transmitting interface number as recognized by the OS kernel
- Source Machine ID - member identification according to configured priority, calculated as ID = priority - 1
- Policy ID - last two bytes of the MD4 Policy ID
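As a rough illustration of the heading fields listed above, the following Python sketch groups them in one record and applies the two rules the text states (Machine ID = priority - 1, Policy ID = last two bytes of the MD4 Policy ID). The class name, field types, and helper names are assumptions made for this example; the text does not give field widths or the on-wire layout, so none is implied here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CcpGeneralHeading:
    """Fields named in the text; types are illustrative, not the wire format."""
    cluster_id: int         # shared by all members of one cluster
    protocol_version: int   # version and Feature Pack revision
    source_interface: int   # interface number as recognized by the OS kernel
    source_machine_id: int  # configured priority minus one
    policy_id: bytes        # last two bytes of the MD4 Policy ID


def machine_id_from_priority(priority: int) -> int:
    """Apply the rule from the text: ID = priority - 1."""
    if priority < 1:
        raise ValueError("priority is assumed to start at 1")
    return priority - 1


def policy_id_from_digest(md4_digest: bytes) -> bytes:
    """Keep the last two bytes of an already computed 16-byte MD4 digest."""
    if len(md4_digest) != 16:
        raise ValueError("an MD4 digest is 16 bytes long")
    return md4_digest[-2:]
```

Under these assumptions, the highest priority member (priority 1) gets Machine ID 0, which matches the source MAC 0:0:0:0:fe:0 shown later in this document.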
Message Types
Below is a complete listing of the possible CCP message types with description:
- FWHA_MY_STATE - Report source machine's state
- FWHA_QUERY_STATE - Query other machine's state
- FWHA_IF_PROBE_REQ - Interface active check request
- FWHA_IF_PROBE_RPLY - Interface active check reply
- FWHA_IFCONF_REQ - Interface configuration request
- FWHA_IFCONF_REPLY - Interface configuration reply
- FWHA_POLICY_CHANGE - Policy ID change request/notification
- FWHAP_SYNC - New Sync packet
CCP Transmission
Non-Secured Interfaces
For interfaces not defined as secured (non-synchronization interfaces), CCP transmits its packets by default with layer two
multicast. The addressable fields are as follows:
- Source MAC - 00:00:00:00:fe:<Source Machine ID>
- Source IP - 0.0.0.0
- Destination MAC - 01:00:5e:<last three octets of the cluster IP, i.e. bits 9-32>
- Destination IP - network address of the segment
As an example, let's assume a scenario in which New Mode is being used. On a given segment, the cluster IP is
10.3.220.103/27. CCP packets sent by the highest priority machine will look like this:
0:0:0:0:fe:0 1:0:5e:3:dc:67 ip 78: 0.0.0.0 > 10.3.220.96
Here, 1:0:5e:3:dc:67 corresponds to the destination MAC, and indicates that the OUI is multicast (1:0:5e:), with the rest
corresponding to the last three octets of the cluster address (3.220.103 is 03:dc:67 in hexadecimal).
The last octet of the source address 0:0:0:0:fe:0 indicates the Machine ID of the transmitting member, in this case the
primary. For a second or third member, this source address will reflect the member's priority, such as:
0:0:0:0:fe:1
0:0:0:0:fe:2
Using this design, both the destination MAC and the destination IP address will change per IP segment, according to
the cluster address and the network address of that segment.
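The addressing rules above can be reproduced with a few lines of Python. This is a minimal sketch that follows the document's own example (cluster IP 10.3.220.103/27, Machine ID 0); the function name is hypothetical, and the sketch simply copies the last three octets of the cluster IP into the MAC, as the example does, without claiming anything about cases the text does not show.

```python
import ipaddress


def ccp_multicast_addressing(cluster_ip_with_prefix: str, machine_id: int):
    """Return (source MAC, destination MAC, destination IP) for one segment."""
    if not 0 <= machine_id <= 0xFF:
        raise ValueError("machine ID must fit in one octet")
    iface = ipaddress.IPv4Interface(cluster_ip_with_prefix)
    octets = iface.ip.packed  # four raw bytes of the cluster IP

    # Source MAC: fixed 00:00:00:00:fe prefix plus the Machine ID.
    src_mac = "00:00:00:00:fe:%02x" % machine_id
    # Destination MAC: multicast OUI 01:00:5e plus the last three IP octets.
    dst_mac = "01:00:5e:" + ":".join("%02x" % b for b in octets[1:])
    # Destination IP: the network address of the segment, as in the example.
    dst_ip = str(iface.network.network_address)
    return src_mac, dst_mac, dst_ip


print(ccp_multicast_addressing("10.3.220.103/27", 0))
```

For the example segment this yields 00:00:00:00:fe:00, 01:00:5e:03:dc:67, and 10.3.220.96, the same values as the captured packet shown above (which prints them without leading zeros).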
Secured Interfaces
For interfaces defined as secured (synchronization interfaces), CCP transmits by default as follows:
- Source MAC - 00:00:00:00:fe:<Source Machine ID>
- Source IP - 0.0.0.0
- Destination MAC - ff:ff:ff:ff:ff:ff (all hosts broadcast)
- Destination IP - network broadcast address
Port usage
CCP uses UDP as the transmission protocol, with both the source and destination port set to 8116. This is true
irrespective of the interface type.
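When reading a packet capture, the properties described so far (UDP port 8116 in both directions, source IP 0.0.0.0, source MAC beginning 00:00:00:00:fe) are enough to pick out CCP traffic. The Python sketch below shows one way to express that check; the function names and the idea of passing already decoded fields are assumptions for illustration, not part of any Check Point tool.

```python
CCP_UDP_PORT = 8116
CCP_SOURCE_MAC_PREFIX = "00:00:00:00:fe:"


def normalize_mac(mac: str) -> str:
    """Zero-pad each octet so that 0:0:0:0:fe:1 becomes 00:00:00:00:fe:01."""
    return ":".join("%02x" % int(part, 16) for part in mac.split(":"))


def looks_like_ccp(src_mac: str, src_ip: str, src_port: int, dst_port: int) -> bool:
    """True when the decoded fields match the CCP pattern described in the text."""
    return (
        src_port == CCP_UDP_PORT
        and dst_port == CCP_UDP_PORT
        and src_ip == "0.0.0.0"
        and normalize_mac(src_mac).startswith(CCP_SOURCE_MAC_PREFIX)
    )


def sender_machine_id(src_mac: str) -> int:
    """The last octet of the source MAC carries the sender's Machine ID."""
    return int(normalize_mac(src_mac).split(":")[-1], 16)
```

For instance, `looks_like_ccp("0:0:0:0:fe:1", "0.0.0.0", 8116, 8116)` holds, and `sender_machine_id("0:0:0:0:fe:1")` identifies the sender as Machine ID 1.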
ClusterXL Decision Logic
Topology
The following section explores the logic utilized by ClusterXL. It does so by looking at several common
failures, and at how ClusterXL responds to such scenarios. Separate sections are dedicated to New Mode High
Availability and to Load Sharing configurations.
Note: The following examples are given in general terms, and do not represent a per packet analysis.
The topology represented by Figure 1.1 will be assumed.
Figure 1.1
Topology Legend
- Dallab_Cluster - ClusterXL cluster
- P1_Primary - Managing CMA
- Net_10.3.220.96 - External segment
- Net_10.2.220.96 - Admin network
- Net_10.1.220.96 - Corporate network
- Net_192.168.250.8 - Sync (Secured) network
High Availability (New Mode)
Interface Failure
This scenario assumes two cluster members, in which the external interface of the primary has failed.
1. At the point of failure, the primary will recognize that no CCP messages have been heard on the failed interface. As
such, it will announce via FWHA_MY_STATE on all other segments that there may be an issue in the inbound direction
with one of its interfaces.
2. The primary will also note that no CCP responses have been received on the failed interface. This causes the primary
to then announce on all other segments via FWHA_MY_STATE that the outbound direction for one of its interfaces
is in question as well.
3. At the same time as the above events, the secondary will recognize that no CCP packets have been received, and
will begin sending FWHA_IF_PROBE_REQ messages on the affected segment. In addition, the secondary will attempt ARP
requests to hosts belonging to the affected segment, and will begin pinging those hosts which respond. This is done in an
attempt to diagnose which member has the problem.
The pings will continue as long as the member cannot identify by other means (i.e. CCP packets) that the interface is
alive. This will happen when there are N cluster members, and N-1 of them are down. When more than two members are
present, such pings will only be issued if all other cluster members do not respond to CCP probing.
4. Since no FWHA_IF_PROBE_RPLY message is received as a response, but the ping requests are being answered, the
secondary concludes that its own interfaces are up and working, and that the interface of the primary has failed.
Therefore, it announces via FWHA_MY_STATE that all of its own interfaces are operational.
5. With this report from the secondary, the primary concludes that the issue is with its own interface, and moves to the
"Down/Dead" status.
6. The secondary issues gratuitous ARPs for both the physical and cluster address per IP segment, and moves to the
"Active/Active-Attention" state.
Primary Reboot
1. As the primary goes down, it changes its state to "Down/Dead", and announces this as part of FWHA_MY_STATE.
2. This triggers the secondary to prepare itself to become the active member. It does so by sending a series of gratuitous
ARPs for both its physical IP and cluster IP for each clustered segment. This will update all necessary hosts/routers on
each segment with the relevant updated MAC address information.
3. The secondary now moves to the "Active/Active-Attention" state, and assumes responsibility for processing all
connections.
4. Though the primary is now considered "Down/Dead", it will still be able to send/receive CCP packets until its
interfaces are brought down. Once this occurs, the secondary, which is now in the "Active/Active-Attention" state, will
notice that no CCP packets are being received.
5. The secondary will do several things in an effort to ascertain why no CCP packets are being received. First, it sends
FWHA_IF_PROBE_REQ packets on all segments in which no CCP packets have been heard. This is to elicit a response
from any member capable of responding. The secondary will also ARP on each segment for IPs belonging to that segment,
and ping those hosts which respond.
The pings will continue as long as the member cannot identify by other means (i.e. CCP packets) that the interface is
alive. This will happen when there are N cluster members, and N-1 of them are down. When more than two members are
present, such pings will only be issued if all other cluster members do not respond to CCP probing.
6. Once the primary has rebooted, but before the policy is loaded, it will begin sending FWHA_IFCONF_REPLY packets
regularly. It does so without knowing what cluster it belongs to, so a random ID is used.
7. After the primary learns the cluster ID, it begins announcing FWHA_MY_STATE. The primary at this stage
announces itself as "Down/Dead".
8. The primary fetches the policy from another cluster member if possible, otherwise from the management server.
9. The primary now initiates full synchronization on the secured interface via the FW1 protocol.
10. Once synchronization is complete, the primary moves to the "Ready" state.
11. The secondary acknowledges this by moving to the "Standby" state.
12. Once this state has been acknowledged, the primary issues gratuitous ARPs for both the physical and cluster IP for
each segment, and now moves to the "Active/Active-Attention" state.
Registered Device Failure
This scenario assumes two cluster members, in which the fwd daemon has failed on the primary.
1. Once the fwd daemon has died, this is detected by ClusterXL, as the device is no longer reporting state. The
primary changes its state to "Down/Dead", and announces this as part of FWHA_MY_STATE.
2. This triggers the secondary to prepare itself to become the active member. It does so by sending a series of gratuitous
ARPs for both the physical and cluster IP for each segment. This will update all necessary hosts/routers on
each segment with the relevant updated MAC address information.
3. The secondary now moves to the "Active/Active-Attention" state, and assumes responsibility for processing all
connections.
In this case, the primary is still able to send CCP hello packets, and will continue to do so. Because of this, the secondary
will not make any attempts to diagnose interface related issues such as the pinging of hosts. This differs from an interface
failure where CCP messages would not be received, which would trigger such a diagnosis using our topology.
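The contrast between the two scenarios can be summarized as a small decision function: diagnosis is only triggered when CCP goes silent on a segment, and then the probe and ping results decide the outcome. The Python sketch below is an interpretation of the steps described in this document, with hypothetical names; the branch for "no probe reply and no ping answer" is not described in the text and is marked as undetermined rather than guessed.

```python
from enum import Enum


class Finding(Enum):
    NO_DIAGNOSIS_NEEDED = "CCP still heard from the peer"
    PEER_REACHABLE = "peer answered the CCP probe"
    PEER_INTERFACE_FAILED = "own interface works, peer interface failed"
    UNDETERMINED = "case not covered by the text"


def diagnose_segment(ccp_heard: bool, probe_reply_received: bool,
                     ping_answered: bool) -> Finding:
    """Model how a member interprets one segment, per the scenarios above."""
    if ccp_heard:
        # Registered device failure: hello packets continue, so no probing.
        return Finding.NO_DIAGNOSIS_NEEDED
    if probe_reply_received:
        # Assumption: a probe reply shows the peer is reachable on the segment.
        return Finding.PEER_REACHABLE
    if ping_answered:
        # Interface failure, step 4: pings work but the peer stays silent.
        return Finding.PEER_INTERFACE_FAILED
    return Finding.UNDETERMINED
```

In the Interface Failure scenario the secondary ends up with `diagnose_segment(False, False, True)`, while in the Registered Device Failure scenario it stays at `diagnose_segment(True, False, False)` and never starts pinging.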
Dual Failure
This scenario assumes a dual failure by both cluster members of the secured (synchronization) interface connected via a
crossover link.
1. Since it is assumed that the secured interfaces are connected via a crossover link, the failure of one interface will bring
down the line protocol of the other resulting in a dual failure. Once this occurs, both members will become aware of this
fact via CCP. Both members will announce as part of FWHA_MY_STATE that N-1 interfaces are up.
2. Since both members have suffered the loss of a single interface, a decision must be made as to what action to take
next. Bringing both members down will result in a total failure, but some level of disturbance has already occurred. The
solution is for the highest priority member to remain in the "Active/Active-Attention" state, and for the secondary to
report itself as "Down/Dead".
3. The necessary state changes are made, and announced as part of FWHA_MY_STATE.
4. Upon recovery of the secured link, the secondary will resume the "Standby" status.
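The tie-break in step 2 can be written down compactly. The Python sketch below covers only the case the text describes, where every member reports the same number of working interfaces; the function name and the input shape are assumptions, and a lower priority number is taken to mean higher priority, consistent with Machine ID = priority - 1 and the primary using Machine ID 0.

```python
def resolve_equal_failure(members):
    """members maps a member name to (priority, interfaces_up).

    Returns the state each member reports after the tie-break.
    """
    counts = {interfaces_up for _, interfaces_up in members.values()}
    if len(counts) != 1:
        raise ValueError("sketch only covers members reporting equal interface counts")

    # Lowest priority number is assumed to be the highest priority member.
    winner = min(members, key=lambda name: members[name][0])
    return {
        name: "Active/Active-Attention" if name == winner else "Down/Dead"
        for name in members
    }


states = resolve_equal_failure({"primary": (1, 3), "secondary": (2, 3)})
print(states)
```

With both members reporting three of four interfaces up, the primary stays in "Active/Active-Attention" and the secondary reports "Down/Dead", as in step 2.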
Load Sharing
The events carried out by CCP during various failures in Load Sharing mode closely resemble those covered thus far in
the previous New Mode section. For this reason, a complete analysis will not be given. However, there are some
important differences which should be noted.
1. As opposed to New Mode, all members of a Load Sharing cluster will remain in the "Active/Active-Attention" state
during normal operation.
2. Upon the failure of a member, that member's state will be changed to "Down/Dead", while all other members will
remain in "Active/Active-Attention", and continue to process connections.
3. In New Mode, for the purposes of packet forwarding, each cluster address is associated with the corresponding
physical MAC address of the active member. For this reason, it is necessary to issue gratuitous ARPs during a failure.
This is not to be confused with the multicast MAC used by CCP for message transmission.
However, in Load Sharing mode, the multicast MAC used per segment by CCP is also used as the MAC address for the
purposes of packet forwarding. This is necessary to ensure that each cluster member receives every packet. Therefore,
there will be no issuance of gratuitous ARPs for any cluster address during a failure.
4. Although CCP will advertise the configured priority of the sending cluster member, these priority labels do not dictate
a level of seniority in Load Sharing during normal operation as they do in New Mode configurations.
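Point 3 above comes down to which MAC address answers for a cluster IP in each mode. The following Python sketch restates that rule; the mode labels "new_mode" and "load_sharing" and the function names are hypothetical and chosen only for this illustration.

```python
MODES = ("new_mode", "load_sharing")


def mac_for_cluster_ip(mode: str, active_member_mac: str, segment_multicast_mac: str) -> str:
    """Return the MAC that hosts on the segment associate with the cluster IP."""
    if mode not in MODES:
        raise ValueError("unknown mode: %r" % mode)
    if mode == "new_mode":
        # New Mode: the physical MAC of whichever member is currently active.
        return active_member_mac
    # Load Sharing: the per-segment multicast MAC, so all members see every packet.
    return segment_multicast_mac


def gratuitous_arp_on_failover(mode: str) -> bool:
    """Gratuitous ARPs are needed only when the cluster IP's MAC changes."""
    if mode not in MODES:
        raise ValueError("unknown mode: %r" % mode)
    return mode == "new_mode"
```

Because the multicast MAC stays the same no matter which members are up, a Load Sharing failover leaves neighboring ARP caches valid, whereas a New Mode failover moves the cluster IP to a different physical MAC and therefore has to announce it.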

The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Principled Technologies
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 

Recently uploaded (20)

From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 

Cluster control protocol_reference

  • 1. Cluster Control Protocol Reference, NG FP3. For additional technical information about Check Point products, consult Check Point’s SecureKnowledge database at http://support.checkpoint.com/kb/
  • 2. Preface. Introduction: This document explores various technical aspects of the Cluster Control Protocol as utilized by ClusterXL. Although parts of the Cluster Control Protocol are also used by OPSEC High Availability products, this aspect will not be covered. This document is not meant as an installation guide, and assumes the reader has a working knowledge of the ClusterXL product. Overview: The introduction of ClusterXL into an existing network often implies that certain changes to that network be made. Such alterations are needed to accommodate both the clustered topology and the Cluster Control Protocol itself. Understanding these changes is important for the purposes of planning, implementation, monitoring, and troubleshooting. Moreover, adequate comprehension of the ClusterXL decision making process is needed. Otherwise, there will be no context in which to place observed cluster behavior. Therefore, the enclosed sections will explore the following topics: Implementation Planning; Cluster Control Protocol Overview; Cluster Control Protocol Logic.
  • 3. Implementation Planning. NOTE: Due to the enhanced Hot/Standby configuration available in FP3, Legacy HA will not be covered in the following sections. High Availability (New Mode): FP3 introduces a new form of operation for High Availability. Simply referred to as "New Mode", it offers all the topology advantages of Load Sharing while maintaining a Hot/Standby orientation. Important factors to consider while planning for a New Mode implementation are: switch support/configuration for layer two multicast forwarding; VLAN configuration; IP address migration; SmartCenter/CMA location. Switch Support: The Cluster Control Protocol used by both New Mode and Load Sharing configurations makes use of layer two multicast. In keeping with multicast standards, this multicast address is used only as the destination, and is used in all CCP packets sent on "non-secured" interfaces. A layer two switch connected to non-secured interfaces must be capable of forwarding multicast packets to ports within that VLAN. It is acceptable for the switch to forward such traffic to all ports within the given VLAN. However, it is considered more efficient to forward to only those ports connecting cluster members. The steps needed to enable multicast support will vary according to the switch vendor and model. Please check your switch documentation for details. If the connecting switch is incapable of forwarding multicast, CCP can be changed to use broadcast instead. To toggle between these two modes, use the command (the mode survives a reboot): 'cphaconf set_ccp broadcast/multicast' VLAN Configuration: It is not recommended to connect the non-secured interfaces of multiple clusters to the same VLAN. Doing so will cause the connecting switch ports to flap. If such a need exists, a separate VLAN and/or switch will be needed for each cluster. A HotFix is also available from Check Point support which allows this configuration to be supported in FP3. Connecting together the secured interfaces of multiple clusters is also not recommended for the same reason. While the above mentioned HotFix may be used, there are additional concerns with this configuration which make it currently unsupportable. Therefore, it is best to connect the secured interfaces of a given cluster via a crossover link when possible, or to an isolated VLAN.
  • 4. IP Address Migration: It is reasonable to assume that many ClusterXL installs will be either for a new VPN-1/FireWall-1 cluster, or to replace a different clustering solution. However, it is also reasonable to assume that many will be to provide high availability to an existing single gateway configuration. In the latter case, existing NAT and IPSec connections will need to be altered to accommodate the new clustered landscape. Therefore, it is recommended to take the existing IP addresses from the current gateway and make these the cluster owned VIPs, or cluster addresses, when feasible. Doing so will avoid altering current IPSec endpoint identities, as well as keep Hide NAT configurations the same in many cases. SmartCenter/CMA Location: A SmartCenter/CMA Server can install a Security Policy to one or more clusters with only a single install action. It does so by installing the Security Policy to each cluster member, using the general tab IP of each cluster member object as the recipient. This is true regardless of the IP address(es) of the cluster object itself. This design affords a great level of flexibility when using either New Mode or Load Sharing configurations, as the SmartCenter/CMA Server can now reside on any given IP segment. One only needs to ensure that the general tab IP address of the cluster member object is reachable. If not, simply choose one which will be accessible to the SmartCenter/CMA Server.
  • 5. Load Sharing: Important factors to consider while planning for a Load Sharing implementation are: switch support/configuration for layer two multicast forwarding; router support for multicast; VLAN configuration; IP address migration; SmartCenter/CMA location. Switch Support: The Cluster Control Protocol used by both New Mode and Load Sharing configurations makes use of layer two multicast. In keeping with multicast standards, this multicast address is used only as the destination, and is used in all CCP packets sent on "non-secured" interfaces. A layer two switch connected to non-secured interfaces must be capable of forwarding multicast packets to ports within that VLAN. It is acceptable for the switch to forward such traffic to all ports within the given VLAN. However, it is considered more efficient to forward to only those ports connecting cluster members. The steps needed to enable multicast support will vary according to the switch vendor and model. Please check your switch documentation for details. If the connecting switch is incapable of forwarding multicast, CCP can be changed to use broadcast instead. To toggle between these two modes, use the command (the mode survives a reboot): 'cphaconf set_ccp broadcast/multicast' Router Support: In addition to the use of multicast by CCP, Load Sharing associates a multicast MAC with each configured cluster IP. This design ensures that traffic destined to the cluster is received by all members. Therefore, ARP replies sent by a cluster member will indicate that the unicast cluster IP is reachable via a multicast MAC. Some routing devices are incapable of receiving such ARP replies. For instance, all versions of Cisco IOS do not include such support. In such cases, adding a static ARP entry for the cluster IP on the routing device will solve the issue. Even so, there are some routers, such as Extreme routers, Avia routers, and some Nortel models (Passport 1200 or XLR), which will not accept this type of static ARP entry. For those cases, ClusterXL FP4 will introduce a new mode of operation referred to as Pivot mode. Pivot mode operates as a Load Sharing cluster, but without the need of multicast for the cluster addresses. VLAN Configuration: It is not recommended to connect the non-secured interfaces of multiple clusters to the same VLAN. Doing so will cause the connecting switch ports to flap. If such a need exists, a separate VLAN and/or switch will be needed for each cluster. A HotFix is also available from Check Point support which allows this configuration to be supported. Connecting together the secured interfaces of multiple clusters is also not recommended for the same reason. While the above mentioned HotFix may be used, there are additional concerns with this configuration which make it currently unsupportable. Therefore, it is best to connect the secured interfaces of a given cluster via a crossover link when possible, or to an isolated VLAN.
  • 6. IP Address Migration: It is reasonable to assume that many ClusterXL installs will be either for a new VPN-1/FireWall-1 cluster, or to replace a different clustering solution. However, it is also reasonable to assume that many will be to provide high availability to an existing single gateway configuration. In the latter case, existing NAT and IPSec connections will need to be altered to accommodate the new clustered landscape. Therefore, it is recommended to take the existing IP addresses from the current gateway and make these the cluster owned VIPs, or cluster addresses, when feasible. Doing so will avoid altering current IPSec endpoint identities, as well as keep Hide NAT configurations the same in many cases. SmartCenter/CMA Location: A SmartCenter/CMA Server can install a Security Policy to one or more clusters with only a single install action. It does so by installing the Security Policy to each cluster member, using the general tab IP of each cluster member object as the recipient. This is true regardless of the IP address(es) of the cluster object itself. This design affords a great level of flexibility when using either New Mode or Load Sharing configurations, as the SmartCenter/CMA Server can reside on any given IP segment. One only needs to ensure that the general tab IP address of the cluster member object is reachable. If not, simply choose one which will be accessible to the SmartCenter/CMA Server.
  • 7. Cluster Control Protocol. CCP Overview: The Cluster Control Protocol serves an integral role in the operation of ClusterXL. Specifically, CCP is responsible for the following: health status reports; cluster member probing; state change commands; querying for cluster membership; state table synchronization. Health Status Reports: CCP will report the status of a cluster member roughly three times a second, per interface. These reports contain the state of the transmitting cluster member, as well as the presumed state of other cluster members. Cluster Member Probing: If a cluster member fails to receive status for another member on a given segment, CCP will probe that segment in an attempt to elicit a response. The purpose of such probes is to detect the nature of possible interface failures, and to determine which module has the problem. The outcome of this probe will determine what action is taken next. State Change Commands: If a cluster member wishes to change state, the command to do so takes place on the defined secured interface. Querying Cluster Membership: When a cluster member comes online, such as with a reboot, it will send a series of CCP query/response messages to gain knowledge of its cluster membership. State Table Synchronization: When state synchronization is enabled, connection information is updated between cluster members on the defined secured interface.
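The relationship between health status reports and probing can be pictured as a "last heard" table keyed by member and segment. The following is only a minimal sketch, not ClusterXL's implementation: the class, its method names, and the one-second silence threshold are assumptions made for illustration (the text only states that reports arrive roughly three times a second).

```python
class StatusTracker:
    """Remembers when each (member, segment) pair last sent a status report."""

    def __init__(self, silence_threshold=1.0):
        # Assumed threshold in seconds; the source document does not state one.
        self.silence_threshold = silence_threshold
        self.last_heard = {}

    def record_report(self, member_id, segment, now):
        """Note that a status report from member_id arrived on segment at time now."""
        self.last_heard[(member_id, segment)] = now

    def segments_to_probe(self, now):
        """Return the sorted segments on which some member has gone silent."""
        silent = {
            segment
            for (_member_id, segment), heard in self.last_heard.items()
            if now - heard > self.silence_threshold
        }
        return sorted(silent)
```

A caller would record every received report and periodically ask which segments need probing; a segment appears in the result only after a member has been silent on it for longer than the assumed threshold.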
  • 8. CCP Message Format: The Cluster Control Protocol payload is made up of a general heading and one of a series of message types, each having its own unique purpose, format, and content. General Heading: This portion contains information necessary for the processing of the encapsulated message type, the most important of which is: Cluster ID - unique identifier shared amongst all members of a given cluster; Protocol Version - version and Feature Pack revision; Source Interface - transmitting interface number as recognized by the OS kernel; Source Machine ID - member identification according to configured priority, calculated as ID = priority - 1; Policy ID - last two bytes of the MD4 Policy ID. Message Types: Below is a complete listing of the possible CCP message types with descriptions: FWHA_MY_STATE - report source machine's state; FWHA_QUERY_STATE - query other machine's state; FWHA_IF_PROBE_REQ - interface active check request; FWHA_IF_PROBE_RPLY - interface active check reply; FWHA_IFCONF_REQ - interface configuration request; FWHA_IFCONF_REPLY - interface configuration reply; FWHA_POLICY_CHANGE - Policy ID change request/notification; FWHAP_SYNC - new Sync packet.
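The general heading fields listed above can be modelled as a simple record. This is a sketch for illustration only: the field names mirror the list, but the Python types and the helper function are assumptions, and nothing here reflects the actual byte layout of a CCP packet, which the text does not describe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CcpHeader:
    """Illustrative container for the general heading fields (not a wire format)."""
    cluster_id: int         # shared by all members of one cluster
    protocol_version: int   # version and Feature Pack revision
    source_interface: int   # interface number as seen by the OS kernel
    source_machine_id: int  # derived from the configured priority
    policy_id: int          # last two bytes of the policy ID


def machine_id_from_priority(priority: int) -> int:
    """Apply the rule stated in the text: machine ID = priority - 1."""
    if priority < 1:
        raise ValueError("priority is assumed to start at 1")
    return priority - 1
```

Under this rule the highest priority member (priority 1) gets machine ID 0, which matches the source MAC ending in `fe:0` shown for the primary in the transmission example.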
  • 9. CCP Transmission. Non-Secured Interfaces: For interfaces not defined as secured (non-synchronization interfaces), CCP transmits its packets by default with layer two multicast. The addressable fields are as follows: Source MAC - 00:00:00:00:fe:<Source Machine ID>; Source IP - 0.0.0.0; Destination MAC - 01:00:5e:<cluster IP bits 9-32, i.e. its last three octets>; Destination IP - network broadcast address. As an example, let's assume a scenario in which New Mode is being used. On a given segment, the cluster IP is 10.3.220.103/27. CCP packets sent by the highest priority machine will look like this: 0:0:0:0:fe:0 1:0:5e:3:dc:67 ip 78: 0.0.0.0 > 10.3.220.96 Here, 1:0:5e:3:dc:67 corresponds to the destination MAC, and indicates that the OUI is multicast (1:0:5e:), with the rest corresponding to the last three octets of the cluster address. The last octet of the source address, 0:0:0:0:fe:0, indicates the Machine ID of the transmitting member, in this case the primary. For a second or third member, this source address will reflect the member's priority, such as: 0:0:0:0:fe:1 0:0:0:0:fe:2 Using this design, both the destination MAC and destination IP address will change per IP segment, according to both the cluster and network address. Secured Interfaces: For interfaces defined as secured (synchronization interfaces), CCP transmits by default as follows: Source MAC - 00:00:00:00:fe:<Source Machine ID>; Source IP - 0.0.0.0; Destination MAC - ff:ff:ff:ff:ff:ff (all hosts broadcast); Destination IP - network broadcast address. Port Usage: CCP uses UDP as the transmission protocol, with both the source and destination port set to 8116. This is true irrespective of the interface type.
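The addressing scheme for non-secured interfaces can be reproduced with a few lines of Python using the standard `ipaddress` module. This is a sketch derived from the worked example above, not from Check Point code; the function names are hypothetical. Note that the destination IP in the captured example (10.3.220.96) is the first address of the /27 segment, so the sketch returns that address in order to match the capture.

```python
import ipaddress


def ccp_source_mac(machine_id: int) -> str:
    """Source MAC: 00:00:00:00:fe:<machine ID>."""
    return f"00:00:00:00:fe:{machine_id:02x}"


def ccp_multicast_dest_mac(cluster_ip: str) -> str:
    """Destination MAC: 01:00:5e followed by the last three octets of the cluster IP."""
    octets = ipaddress.IPv4Address(cluster_ip).packed
    return "01:00:5e:" + ":".join(f"{octet:02x}" for octet in octets[1:])


def ccp_dest_ip(cluster_ip_with_prefix: str) -> str:
    """Destination IP as seen in the example capture for the given segment."""
    interface = ipaddress.ip_interface(cluster_ip_with_prefix)
    return str(interface.network.network_address)


if __name__ == "__main__":
    print(ccp_source_mac(0))
    print(ccp_multicast_dest_mac("10.3.220.103"))
    print(ccp_dest_ip("10.3.220.103/27"))
```

The functions zero-pad each octet, so the example's `1:0:5e:3:dc:67` is rendered as `01:00:5e:03:dc:67`; the two notations describe the same address.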
  • 10. ClusterXL Decision Logic. Topology: The following section will explore the logic utilized by ClusterXL. We will do so by looking at several common failures, and how ClusterXL responds to such scenarios. A separate section will be dedicated to both New Mode High Availability and Load Sharing configurations. Note: The following examples are given in general terms, and do not represent a per packet analysis. The topology represented by Figure 1.1 will be assumed. Figure 1.1. Topology Legend: Dallab_Cluster - ClusterXL cluster; P1_Primary - Managing CMA; Net_10.3.220.96 - External segment; Net_10.2.220.96 - Admin network; Net_10.1.220.96 - Corporate network; Net_192.168.250.8 - Sync (Secured) network.
  • 11. High Availability (New Mode). Interface Failure: This scenario assumes two cluster members, in which the external interface of the primary has failed. 1. At the point of failure, the primary will recognize that no CCP messages have been heard on the failed interface. As such, it will announce via FWHA_MY_STATE on all other segments that there may be an issue in the inbound direction with one of its interfaces. 2. The primary will also note that no CCP responses have been received on the failed interface. This causes the primary to then announce on all other segments via FWHA_MY_STATE that the outbound direction for one of its interfaces is in question as well. 3. At the same time as the above events, the secondary will recognize that no CCP packets have been received, and begins sending FWHA_IF_PROBE_REQ messages on the affected segment. In addition, the secondary will attempt ARP requests to hosts belonging to the affected segment, and will begin pinging those hosts which respond. This is done in an attempt to diagnose which member has the problem. The pings will continue as long as the member cannot identify by other means (i.e. CCP packets) that the interface is alive. This will happen when there are N cluster members, and N-1 of them are down. When more than two members are present, such pings will only be issued if all other cluster members do not respond to CCP probing. 4. Since no FWHA_IF_PROBE_RPLY message is received as a response, but the ping requests are being answered, the secondary concludes that its own interfaces are up and working, and that the interface of the primary has failed. Therefore, it announces via FWHA_MY_STATE that all of its own interfaces are operational. 5. With this report from the secondary, the primary concludes the issue is with its own interface, and moves to the "Down/Dead" status. 6. The secondary issues gratuitous ARPs for both the physical and cluster address per IP segment, and moves to the "Active/Active-Attention" state.
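The conclusion the secondary draws in steps 3 and 4 can be summarised as a small decision function. This is a deliberately simplified sketch with hypothetical names: the first two branches follow the steps above, while the final branch (no probe reply and no ping answers) is an assumption, since the text does not spell out that case.

```python
def diagnose_segment(probe_reply_received: bool, ping_answered: bool) -> str:
    """Summarise what a member concludes about a segment that has gone quiet."""
    if probe_reply_received:
        # Another member answered the CCP probe, so its interface is alive.
        return "peer interface alive"
    if ping_answered:
        # Hosts on the segment answer, so the local interface works and the
        # silent peer is the one with the failed interface (step 4).
        return "peer interface failed"
    # Assumption: with no replies at all, the local interface is the suspect.
    return "local interface suspect"
```

In the scenario above the secondary sees no probe reply but does get ping answers, so it lands in the second branch and announces that all of its own interfaces are operational.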
  • 12. Primary Reboot: 1. As the primary goes down, it changes its state to "Down/Dead", and announces this as part of FWHA_MY_STATE. 2. This triggers the secondary to prepare itself to become the active member. It does so by sending a series of gratuitous ARPs for both its physical IP and cluster IP for each clustered segment. This will update all necessary hosts/routers on each segment with the relevant updated MAC address information. 3. The secondary now moves to the "Active/Active-Attention" state, and assumes responsibility for processing all connections. 4. Though the primary is now considered "Down/Dead", it will still be able to send/receive CCP packets until its interfaces are brought down. Once this occurs, the secondary, which is now in the "Active/Active-Attention" state, will take notice of the fact that no CCP packets are being received. 5. The secondary will do several things in an effort to ascertain why no CCP packets are being received. First, it sends FWHA_IF_PROBE_REQ packets on all segments in which no CCP packets have been heard. This is to elicit a response from any member capable of responding. The secondary will ARP on each segment for IPs belonging to that segment, and ping those hosts which respond. The pings will continue as long as the member cannot identify by other means (i.e. CCP packets) that the interface is alive. This will happen when there are N cluster members, and N-1 of them are down. When more than two members are present, such pings will only be issued if all other cluster members do not respond to CCP probing. 6. Once the primary has rebooted, but before the policy is loaded, it will begin sending FWHA_IFCONF_REPLY packets regularly. It does so without knowing what cluster it belongs to, so a random ID is used. 7. After the primary learns the cluster ID, it begins announcing FWHA_MY_STATE. The primary at this stage announces itself as "Down/Dead". 8. The primary fetches the policy from another cluster member if possible, otherwise from the management server. 9. The primary now initiates full synchronization on the secured interface via the FW1 protocol. 10. Once synchronization is complete, the primary moves to the "Ready" state. 11. The secondary acknowledges this by moving to the "Standby" state. 12. Once this state has been acknowledged, the primary issues gratuitous ARPs for both the physical and cluster IP for each segment, and now moves to the "Active/Active-Attention" state.
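The state changes in the reboot sequence above can be replayed as an ordered timeline. This sketch only records the transitions named in the numbered steps; the enum, the timeline list, and the helper function are illustrative names, not part of ClusterXL.

```python
from enum import Enum


class State(Enum):
    DOWN = "Down/Dead"
    READY = "Ready"
    STANDBY = "Standby"
    ACTIVE = "Active/Active-Attention"


# (member, new state) pairs in the order the steps describe them.
REBOOT_TIMELINE = [
    ("primary", State.DOWN),       # step 1: primary goes down
    ("secondary", State.ACTIVE),   # step 3: secondary takes over
    ("primary", State.READY),      # step 10: full sync complete
    ("secondary", State.STANDBY),  # step 11: secondary yields
    ("primary", State.ACTIVE),     # step 12: primary resumes
]


def replay(timeline, initial):
    """Apply each transition in order and return the resulting states."""
    states = dict(initial)
    for member, state in timeline:
        states[member] = state
    return states
```

Replaying the timeline from the normal New Mode starting point (primary active, secondary standby) ends in the same arrangement, which reflects that the primary reclaims the active role once it has synchronized.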
  • 13. Registered Device Failure: This scenario assumes two cluster members, in which the fwd daemon has failed on the primary. 1. Once the fwd daemon has died, this is detected by ClusterXL as the device is no longer reporting state. The primary changes its state to "Down/Dead", and announces this as part of FWHA_MY_STATE. 2. This triggers the secondary to prepare itself to become the active member. It does so by sending a series of gratuitous ARPs for both the physical and cluster IP for each segment. This will update all necessary hosts/routers on each segment with the relevant updated MAC address information. 3. The secondary now moves to the "Active/Active-Attention" state, and assumes responsibility for processing all connections. In this case, the primary is still able to send CCP hello packets, and will continue to do so. Because of this, the secondary will not make any attempts to diagnose interface related issues, such as the pinging of hosts. This differs from an interface failure, where CCP messages would not be received, which would trigger such a diagnosis using our topology.
  • 14. Dual Failure: This scenario assumes a dual failure by both cluster members of the secured (synchronization) interface connected via a crossover link. 1. Since it is assumed that the secured interfaces are connected via a crossover link, the failure of one interface will bring down the line protocol of the other, resulting in a dual failure. Once this occurs, both members will become aware of this fact via CCP. Both members will announce as part of FWHA_MY_STATE that N-1 interfaces are up. 2. Since both members have suffered the loss of a single interface, a decision must be made as to what action to take next. Bringing both members down will result in a total failure, but some level of disturbance has already occurred. The solution is for the highest priority member to remain in the "Active/Active-Attention" state, and for the secondary to report itself as "Down/Dead". 3. The necessary state changes are made, and announced as part of FWHA_MY_STATE. 4. Upon recovery of the secured link, the secondary will resume the "Standby" status.
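The tie-break in step 2 can be expressed compactly. This is a sketch under the assumption that every member passed in has lost the same number of interfaces; the function name is hypothetical. It relies on the rule that machine ID = priority - 1, so the lowest machine ID is the highest priority member.

```python
def resolve_equal_failure(machine_ids):
    """Pick states for members that report the same number of working interfaces.

    The highest priority member (lowest machine ID) stays active and
    every other member reports itself as down.
    """
    if not machine_ids:
        raise ValueError("at least one member is required")
    active = min(machine_ids)
    return {
        machine_id: "Active/Active-Attention" if machine_id == active else "Down/Dead"
        for machine_id in machine_ids
    }
```

For the two-member crossover scenario above, the primary (machine ID 0) stays active and the secondary (machine ID 1) reports "Down/Dead" until the secured link recovers.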
  • 15. Load Sharing: The events carried out by CCP during various failures in Load Sharing mode closely resemble those covered thus far in the previous New Mode section. For this reason, a complete analysis will not be given. However, there are some important differences which should be noted. 1. As opposed to New Mode, all members of a Load Sharing cluster will remain in the "Active/Active-Attention" state during normal operation. 2. Upon the failure of a member, that member's state will be changed to "Down/Dead", while all other members will remain in "Active/Active-Attention", and continue to process connections. 3. In New Mode, for the purposes of packet forwarding, each cluster address is associated with the corresponding physical MAC address of the active member. For this reason, it is necessary to issue gratuitous ARPs during a failure. This is not to be confused with the multicast MAC used by CCP for message transmission. However, in Load Sharing mode, the multicast MAC used per segment by CCP is also used as the MAC address for the purposes of packet forwarding. This is necessary to ensure that each cluster member receives every packet. Therefore, there will be no issuance of gratuitous ARPs for any cluster address during a failure. 4. Although CCP will advertise the configured priority of the sending cluster member, these priority labels do not dictate a level of seniority in Load Sharing during normal operation as they do in New Mode configurations.
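Point 3 above comes down to which MAC address the cluster IP resolves to in each mode. The sketch below captures that difference; the mode strings and function names are hypothetical labels chosen for illustration.

```python
def cluster_ip_mac(mode: str, active_member_mac: str, segment_multicast_mac: str) -> str:
    """Return the MAC that hosts on the segment associate with the cluster IP."""
    if mode == "new_mode":
        # New Mode: the cluster IP maps to the active member's physical MAC.
        return active_member_mac
    if mode == "load_sharing":
        # Load Sharing: the cluster IP maps to the segment's multicast MAC.
        return segment_multicast_mac
    raise ValueError(f"unknown mode: {mode}")


def needs_gratuitous_arp_on_failover(mode: str) -> bool:
    """Gratuitous ARPs are only needed when the cluster IP's MAC changes."""
    if mode not in ("new_mode", "load_sharing"):
        raise ValueError(f"unknown mode: {mode}")
    return mode == "new_mode"
```

Because the multicast MAC does not change when a Load Sharing member fails, neighbouring hosts and routers keep a valid ARP entry and no gratuitous ARP is required.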