High Availability Part 2
Expanding HA with Multi-WAN, Converting Existing Firewalls
July 2016 Hangout
Jim Pingle
About this Hangout
● Project News
● Brief HA Review
● HA with Multi-WAN Overview
● Adding a WAN to an existing HA cluster
● Converting an existing system to HA
● HA Testing and Troubleshooting review
Project News
● Chris Buechler is leaving the project
– https://blog.pfsense.org/?p=2095
● 2.3.2 is out!
– Bug fixes, a couple small features
– Traffic totals package, ntopng
– https://blog.pfsense.org/?p=2108
● pfSense License changed to Apache License 2.0
– BSD compatible, clearer on copyright, patent, and trademark
– https://blog.pfsense.org/?p=2103
● Official pfSense Facebook Page
– https://facebook.com/pfsense
● New Official pfSense Facebook Group
– https://www.facebook.com/groups/pfsense.official
Cluster Overview
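[Diagram: single-WAN HA cluster. Internet, WAN Switch, Primary and Secondary nodes joined by a Sync Interface, LAN Switch]
● WAN: Primary 198.51.100.12, Secondary 198.51.100.13, Shared WAN CARP IP 198.51.100.11
● LAN: Primary 10.11.0.2, Secondary 10.11.0.3, Shared LAN CARP IP 10.11.0.1
● Sync: Primary 10.11.2.2, Secondary 10.11.2.3
[Diagram: Multi-WAN HA cluster. ISP 1 with WAN1 Switch, ISP 2 with WAN2 Switch, Primary and Secondary nodes joined by a Sync Interface, LAN Switch, DMZ Switch]
● WAN1: Primary 198.51.100.12, Secondary 198.51.100.13, Shared WAN1 CARP VIP 198.51.100.11
● WAN2: Primary 203.0.113.12, Secondary 203.0.113.13, Shared WAN2 CARP VIP 203.0.113.11
● LAN: Primary 10.11.0.2, Secondary 10.11.0.3, Shared LAN CARP IP 10.11.0.1
● DMZ: Primary 10.11.1.2, Secondary 10.11.1.3, Shared DMZ CARP IP 10.11.1.1
● Sync: Primary 10.11.2.2, Secondary 10.11.2.3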
HA Review
● Two units: Primary node, Secondary node
– Working together to provide active/passive failover
● IP Address Redundancy (CARP)
– Heartbeats transmitted on interfaces with CARP VIPs (Not sync!)
– Traffic to cluster should route to CARP VIPs (Use for gateway, DNS, VPN etc)
– Traffic from cluster should originate from CARP VIPs (Outbound NAT)
● Configuration Sync (XMLRPC)
– Communicates via the Sync interface
– Copies settings from primary to secondary on save
● State Sync (pfsync)
– Communicates via the Sync interface
– State inserts, deletes, etc exchanged between nodes
– When primary fails, connections continue to flow through secondary
● Switches must properly handle multicast traffic
HA for Multi-WAN
● Both WANs must be connected to both units!
– There is no way to fail from one node to another based
on gateway status
– Connecting each WAN to only one node is not truly redundant:
a failure of the primary node and the secondary WAN would
result in a total outage, as would a failure of the primary
WAN and the secondary node.
● Requires a proper switch in front of each WAN
– CPE integrated switches can have issues with multicast
in general, or CARP traffic specifically
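The cabling requirement on this slide can be illustrated with a small Python model. This is only a sketch under stated assumptions: the function name `cluster_online` and the node and WAN labels are hypothetical, and the model assumes the highest-priority surviving node is always master, because failover is not triggered by gateway status.

```python
def cluster_online(cabling, failed_nodes=(), failed_wans=()):
    """Return True if the acting master still has at least one working WAN.

    cabling maps node name -> set of WANs cabled to that node, listed in
    CARP priority order (primary first). Hypothetical model, not pfSense code.
    """
    # The master is the first node in priority order that is still up.
    master = next((node for node in cabling if node not in failed_nodes), None)
    if master is None:
        return False  # every node is down
    # Gateway status never moves mastership, so only the master's WANs count.
    return any(wan not in failed_wans for wan in cabling[master])


# Each WAN cabled to one node only, versus both WANs cabled to both nodes.
split = {"primary": {"wan1"}, "secondary": {"wan2"}}
full = {"primary": {"wan1", "wan2"}, "secondary": {"wan1", "wan2"}}

print(cluster_online(split, failed_nodes={"primary"}, failed_wans={"wan2"}))
print(cluster_online(full, failed_nodes={"primary"}, failed_wans={"wan2"}))
```

With the split cabling the first call reports an outage, while the fully cabled cluster in the second call stays online.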
HA for Multi-WAN
● Both WANs must have static IP addresses
– A /29 or larger is required
● Each node requires an IP address in the WAN subnet, plus a CARP
VIP
● Technically possible to do with a static /30 but not recommended
– DHCP and PPPoE WANs cannot be used directly for a proper
HA setup
● If the DHCP or PPPoE is handled by a separate device ahead of the
HA cluster, and pfSense is presented with a /29 or larger, then it can
work
● Not recommended if it involves a layer of NAT, but for an additional
WAN the extra redundancy may outweigh the negative factors
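The address arithmetic behind the /29 requirement can be sketched in Python. The function name `wan_subnet_fits_ha` is hypothetical, and the count assumes one address for the upstream ISP gateway in addition to the per-node addresses and the CARP VIP named on the slide.

```python
import ipaddress


def wan_subnet_fits_ha(cidr, nodes=2):
    """Return True if an IPv4 subnet can hold every node, one CARP VIP,
    and (assumption) one upstream gateway address."""
    net = ipaddress.ip_network(cidr, strict=False)
    usable = net.num_addresses - 2  # exclude network and broadcast addresses
    needed = nodes + 1 + 1          # node addresses + CARP VIP + upstream gateway
    return usable >= needed


print(wan_subnet_fits_ha("198.51.100.8/29"))  # 6 usable addresses, 4 needed
print(wan_subnet_fits_ha("198.51.100.8/30"))  # 2 usable addresses, 4 needed
```

A /30 fails this count, which is why the slide treats it as technically possible but not a proper setup.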
HA for Multi-WAN
● Gateway groups, policy routing, etc are mostly the same as single unit
– Choose the gateway on the WAN interface settings
– Create gateway group(s) using the gateways
● Pick the CARP VIP for each gateway in a group
– Setup DNS
– Use gateway groups in rules to configure failover or load balancing
● Add rules with no gateway set to pass traffic to local/VPN
networks
– Place these rules at the top of the local interface tabs
● Outbound NAT required for both WANs, using CARP VIPs
– If converting an existing HA setup to Multi-WAN, the outbound NAT rules must
be added manually
Adding a WAN to an HA Cluster
● Add WANx NIC to both, if needed
– If adding a physical NIC, power off secondary, add hardware, power on,
then do primary
– If adding a VLAN, nothing special, add the tag to both
● Assign WANx NIC on both in the same order
● Enable the interface, name it WANx, configure the static addresses
● Add a gateway and ensure it’s selected on Interfaces > WANx on
both units
● Add a CARP VIP for WANx to the primary, will auto sync to
secondary
● Add outbound NAT rules for WANx
Adding a WAN to an HA Cluster
● Create gateway groups, be sure to select the CARP VIPs for
each WAN
● Add firewall rules to WANx
● Add policy routing to LAN/DMZ rules as needed
● Check gateway/group status on both
● Check that rules, etc, synchronized
● If necessary, convert existing daemons and settings for use
with Multi-WAN
– VPNs, port forwards, etc.
– See the previous hangout for details
Converting to CARP
● Prepare for the conversion
● Make adjustments to the current firewall/primary
● Prepare the secondary
● Setup sync interface
● Configure pfsync
● Configure XMLRPC
● Setup DHCP
Converting to CARP – Initial Tasks
● Several tasks, starting with adjustments to current firewall
– Needs planning and maintenance window
– Will cause an outage
– See the previous hangout and book for info on planning address usage
● Rename current firewall to indicate its status as part of an HA cluster
– Ex: fw-a.example.com, fw-1.example.com, fw-pri.example.com
● Move current firewall (primary) to new IP addresses
● Add old IP addresses as CARP VIPs
● Add interface for sync traffic
– Add and/or assign NIC
– Configure it with a new sync network, static address, firewall rules
Converting to CARP – Initial Tasks
● Configure DHCP to give out CARP VIP for
gateway and DNS, if in use
– Failover IP address will be needed later but do not
set it yet!
● Configure outbound NAT to utilize the WAN
CARP VIP
● Configure VPNs to use CARP VIPs for their
Interface settings
Prepare the Secondary
● Bring up another pfSense device isolated from the network
– Ideally identical hardware, or at least identical NICs
– Connect a PC/Laptop to its LAN, don’t connect WAN yet
● Setup the basics on the unit
– Give it a relevant hostname, similar to the primary
● Ex: fw-b.example.com, fw-2.example.com, fw-sec.example.com
– Assign all interfaces in identical order to the primary node!
– Enable/Configure Interfaces
● Use the same names as the primary
● Interface IP addresses (WAN, LAN, sync) that do not conflict with the primary or VIPs
– Pass rules on sync and LAN interfaces
● Check that the GUI is running the same port/protocol as the primary
● Set the same admin password as the primary
● Connect the secondary to the network
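Before connecting the secondary, the interface plan can be sanity-checked on paper or with a short script. The Python sketch below is illustrative only: `check_secondary_plan` and the dictionary layout are hypothetical, and it checks just the two points from this slide (identical interface order, no address conflicts with the primary or the VIPs).

```python
def check_secondary_plan(primary, secondary, vips):
    """Return a list of problems found in the secondary node's interface plan.

    Each argument maps interface name -> IP address string. Dictionaries keep
    insertion order, which stands in for the interface assignment order.
    """
    problems = []
    # Interfaces must be assigned in the same order on both nodes.
    if list(primary) != list(secondary):
        problems.append("interface order differs from primary")
    # Every secondary address must be unused by the primary and the VIPs.
    taken = set(primary.values()) | set(vips.values())
    for ifname, addr in secondary.items():
        if addr in taken:
            problems.append(f"{ifname} address {addr} is already in use")
    return problems


primary = {"wan": "198.51.100.12", "lan": "10.11.0.2", "sync": "10.11.2.2"}
vips = {"wan": "198.51.100.11", "lan": "10.11.0.1"}
secondary = {"wan": "198.51.100.13", "lan": "10.11.0.3", "sync": "10.11.2.3"}

print(check_secondary_plan(primary, secondary, vips))
```

The sample addresses are the ones from the Cluster Overview diagram; an empty list means no problems were found.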
Setup Sync Interface
● Interface should already be assigned, enabled,
numbered by this point
● Add firewall rules for sync
– On Primary:
● Add rule to pass TCP/443 to the Sync address for GUI
● Add rule to pass pfsync from Sync net to any.
● Optionally add a rule to pass ICMP echo to/from Sync net
– On Secondary:
● Add rule to pass any proto from Sync net to any
● Rule is different so it's obvious when it has been replaced by
config sync!
Configure pfsync
● System > High Avail. Sync
● Enable on BOTH nodes
● Check Synchronize States
● Set Synchronize Interface to Sync
● Set the pfsync Synchronize Peer IP to the IP address of the
other node
– Ex: On the Primary, enter the IP address of the secondary, 172.16.1.3,
and vice versa.
– Technically this setting is optional. Without it set, state sync is sent via
multicast rather than unicast. With only two nodes, unicast can be more
reliable
● Click Save
Configure XMLRPC
● System > High Avail. Sync
● Enable only on the Primary node!
● Set Synchronize Config to IP to the secondary node's Sync interface IP (e.g.
172.16.1.3)
● Set Remote System Username to admin
– Must use admin, no other user will work
● Set Remote System Password to the admin password
● Check the boxes for each area to synchronize
● Click Save
● CARP VIPs, rules, NAT, and other changes will now sync to the secondary
● On the secondary, check and see if the rule changes on the Sync interface carried
over
● From here on, do not make any changes on the secondary in an area that will
sync
Setup DHCP
● Services > DHCP Server, LAN tab on Primary
● Set the Failover Peer IP to the actual LAN IP address
of the secondary node, e.g. 192.168.1.3
● Click Save
● Repeat for additional local interfaces if necessary
● Gateway must be the CARP VIP, DNS if using the
firewall for DNS also
● Status > DHCP Leases, check pool status, should be
“normal”/“normal”
Testing
● Verify that a client on the LAN can pass through the cluster
● Verify XMLRPC by checking if a setting syncs, and via Status > Filter Reload, Force
Config Sync
● Verify CARP by checking Status > CARP
● Verify state sync by checking pfsync nodes on Status > CARP and contents of
Diagnostics > States
● Testing Failover:
– Status > CARP, disable CARP on primary
– Check status on secondary, should now be MASTER
– Test LAN client connectivity, DHCP, etc
– Enable CARP on Primary
– Retest connectivity
● Downloading a file, streaming audio, or streaming video will most likely continue
uninterrupted. VoIP-based phone calls may have a slight disruption as they are not
buffered like the others.
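One way to quantify the disruption during a failover test is to probe a service through the cluster at a fixed interval and measure the longest gap. The Python sketch below is a hypothetical helper, not a pfSense tool: `tcp_probe` and `longest_outage` are illustrative names, and the target host and port are up to the tester.

```python
import socket


def tcp_probe(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def longest_outage(samples):
    """Return the longest run of failed probes, in seconds.

    samples is a time-ordered list of (timestamp, ok) tuples. An outage runs
    from the first failed probe to the next successful one, or to the last
    sample if connectivity never returned.
    """
    longest = 0.0
    down_since = None
    for ts, ok in samples:
        if not ok and down_since is None:
            down_since = ts                      # outage starts
        elif ok and down_since is not None:
            longest = max(longest, ts - down_since)
            down_since = None                    # outage ends
    if down_since is not None:
        longest = max(longest, samples[-1][0] - down_since)
    return longest


# Example with recorded results: probes at one-second intervals.
samples = [(0.0, True), (1.0, False), (2.0, False), (3.0, True), (4.0, True)]
print(longest_outage(samples))
```

In practice the samples would be collected by calling `tcp_probe` in a loop from a LAN client while CARP is disabled and re-enabled on the primary.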
Troubleshooting
● Review the configuration
● Check CARP status, if VIP is INIT, check interface link, edit/save/apply VIP
● Check for conflicting VHIDs
● Check subnet mask on CARP VIPs
● Switch/L2 issues
– Ensure boxes are on the correct switch/VLAN/L2
– Try to ping between the nodes on the affected interface
– Ensure switches are properly trunking, if applicable
– Try another switch (especially if using a modem/CPE switch)
– Disable IGMP snooping, broadcast/multicast storm control, etc.
● Check system logs, firewall logs, notices
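The VHID and subnet mask checks can be sketched as a short Python script over a hand-written list of VIPs. This is a sketch under assumptions: `check_vips` and the record layout are hypothetical, it only sees VIPs listed in its input (not VHIDs used by other devices on the same segment), and it assumes a CARP VIP should carry the same subnet mask as its parent interface.

```python
import ipaddress
from collections import Counter


def check_vips(vips, interfaces):
    """Return a list of problems: duplicate VHIDs per interface and VIPs
    whose network does not match the parent interface's network.

    vips: list of dicts with "interface", "vhid", and "address" (CIDR string).
    interfaces: dict of interface name -> interface address (CIDR string).
    """
    problems = []
    # A VHID must be unique on each interface (layer 2 segment).
    counts = Counter((v["interface"], v["vhid"]) for v in vips)
    for (ifname, vhid), n in sorted(counts.items()):
        if n > 1:
            problems.append(f"VHID {vhid} is used {n} times on {ifname}")
    # The VIP should sit in the same network, with the same mask, as its interface.
    for v in vips:
        vip = ipaddress.ip_interface(v["address"])
        parent = ipaddress.ip_interface(interfaces[v["interface"]])
        if vip.network != parent.network:
            problems.append(
                f"{v['address']} does not match {v['interface']} network {parent.network}"
            )
    return problems


interfaces = {"wan": "198.51.100.12/24", "lan": "10.11.0.2/24"}
vips = [
    {"interface": "wan", "vhid": 1, "address": "198.51.100.11/24"},
    {"interface": "lan", "vhid": 2, "address": "10.11.0.1/32"},  # wrong mask
]
for problem in check_vips(vips, interfaces):
    print(problem)
```

The /24 masks in the example are assumed for illustration; the slides do not state the prefix lengths.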
Upgrading
● Review changelog/blog/upgrade guide
● Take a backup
– Cannot stress this enough
● Upgrade secondary
● Test secondary
● Switch CARP to maintenance mode on primary
● Upgrade primary
● Exit maintenance mode on primary
● Test again
Conclusion
● More detail on the specifics of Multi-WAN can
be found in the March 2016 Hangout
● More detail on the basics of CARP can be
found in the June 2015 Hangout
● Questions?
● Ideas for hangout topics? Post on forum,
comment on the blog posts, Reddit, etc
