High Availability on pfSense 2.4 - pfSense Hangout March 2017

  1. High Availability Using CARP, XMLRPC, and pfsync On pfSense 2.3/2.4 March 2017 Hangout Jim Pingle
  2. Project News ● pfSense 2.3.3-p1 is out! – A few beneficial improvements, security updates (OpenSSL, cURL) – https://www.netgate.com/blog/pfsense-2-3-3-p1-release-now-available.html ● pfSense Blog moved to Netgate: – https://www.netgate.com/blog/ – Primary reason was so we could get rid of WordPress, due to security concerns – Now uses a static site generated by Jekyll ● Coreboot (“BIOS”) upgrade for many SG-device owners (2220, 2440, 4860, 8860, XG-2758) – If you purchased one of these devices, you should have received an e-mail about the update – Workaround for Intel C2000 Errata AVR.58 ● Disables SERIRQ to prevent indeterminate interrupt behavior for systems that do not have external pull up resistor on SERIRQ PIN. – A package is now available to handle it automatically (install via System > Package Manager) ● Training in Europe – https://netgate.com/training/ – Will include a preview of our upcoming remote device management platform – Paris, FR - 21-22 Sep, 2017 – London, UK - 27-28 Sep, 2017 – Frankfurt, DE - 4-5 Oct, 2017 – Saint Petersburg, RU - 11-12 Oct, 2017 ● Documentation for Let’s Encrypt ACME package – https://doc.pfsense.org/index.php/ACME_package
  3. About this Hangout ● Components of a High Availability Cluster ● Prerequisites ● Configuration of a cluster from default config ● Testing ● Troubleshooting ● Upgrading
  4. Cluster Overview (diagram) ● Actual layout: Internet > WAN Switch > Primary and Secondary nodes > LAN Switch, with a Sync Interface linking the two nodes – Primary: WAN 198.51.100.201 / 2001:db8::201, LAN 192.168.1.2 / 2001:db8:1::2, Sync 172.16.1.2 / 2001:db8:1:1::2 – Secondary: WAN 198.51.100.202 / 2001:db8::202, LAN 192.168.1.3 / 2001:db8:1::3, Sync 172.16.1.3 / 2001:db8:1:1::3 – Shared WAN CARP IP: 198.51.100.200 / 2001:db8::200 – Shared LAN CARP IP: 192.168.1.1 / 2001:db8:1::1 ● Logical layout: Internet > WAN Switch > Cluster > LAN Switch – WAN HA IP Address: 198.51.100.200 / 2001:db8::200 – LAN HA IP Address: 192.168.1.1 / 2001:db8:1::1
  5. HA Components ● Logically, the two nodes become a single unit from the perspective of the network ● IP Address Redundancy (CARP) – Traffic to/from cluster should use VIP addresses ● Configuration Sync (XMLRPC) – Keeps the node configurations similar ● State Sync (pfsync) – Shares state information between all nodes
  6. IP Address Redundancy (CARP) ● CARP VIPs are shared between cluster nodes ● Works similarly to VRRP and can conflict with VRRP/HSRP ● Heartbeats transmitted on all interfaces containing CARP VIPs (NOT SYNC IF!) – Approx. 1 per second (base) + fraction (1/256th) of a sec (skew) – If a secondary node stops receiving heartbeats or they are too slow, it will take over as master – Skew adds time/slowness, secondary must use a higher skew (e.g. 100) than the primary (e.g. 0) – Active/Passive only, no Active/Active ● Traffic to cluster must route to CARP VIP address(es) – Routed inbound traffic, VPNs, port forwards, local gateway, DNS – Exceptions ● Only access firewall GUI/SSH by interface IP addresses, not VIP! ● Monitoring systems can check each node individually ● Traffic from cluster should originate from CARP VIPs – Outbound NAT, VPNs ● If an interface containing a VIP loses link, that node will automatically demote itself (temporarily adds 240 to the skew) ● On pfSense, preemptive failover is enabled by default so if an interface triggers a demotion, all VIPs are demoted to trigger a complete failover
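The heartbeat timing on this slide (base plus skew/256 of a second) can be sketched in Python. This is only an illustration: the function names and the `DEMOTION_PENALTY` constant are made up here, the numbers come from the slide, and the "shortest interval wins" rule is a simplification of the real CARP election.

```python
from fractions import Fraction

# Skew added when a node demotes itself, per the slide (illustrative constant).
DEMOTION_PENALTY = 240


def advert_interval(base: int, skew: int) -> Fraction:
    """Seconds between heartbeats: base + skew/256."""
    if base < 1 or skew < 0:
        raise ValueError("base must be >= 1 and skew must be >= 0")
    return base + Fraction(skew, 256)


def preferred_master(nodes: dict) -> str:
    """Return the node name with the shortest advertisement interval.

    Simplifying assumption: the node that advertises fastest holds MASTER.
    `nodes` maps a node name to a (base, skew) tuple.
    """
    return min(nodes, key=lambda name: advert_interval(*nodes[name]))


if __name__ == "__main__":
    healthy = {"primary": (1, 0), "secondary": (1, 100)}
    print(float(advert_interval(1, 100)), preferred_master(healthy))

    # Primary loses link on an interface with a VIP and demotes itself.
    demoted = {"primary": (1, 0 + DEMOTION_PENALTY), "secondary": (1, 100)}
    print(preferred_master(demoted))
```

With the slide's example values, the secondary (skew 100) advertises more slowly than a healthy primary (skew 0), but faster than a primary carrying the 240 demotion penalty.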
  7. Configuration Sync (XMLRPC) ● Communicates via the Sync interface ● Copies some settings from primary to secondary on save ● Does not sync interfaces or System > Advanced, System > General, or most packages. ● Will sync rules, NAT, aliases, VPNs, many other areas ● Not strictly required for HA, but makes the job much easier
  8. State Sync (pfsync) ● Communicates via the Sync interface ● State inserts, deletes, and updates exchanged between nodes ● State table on both nodes should be identical (or nearly so) ● If-bound states mean physical interface assignments must be identical (or LAGGs to mask differences) ● When primary fails, connections continue to flow through secondary since the state is already present ● Requires use of CARP VIPs with NAT, no traffic direct to/from node ● HA can work without it, but connections will be disrupted during failover ● Not currently compatible with Limiters
  9. Assumptions ● Two firewalls running pfSense software – Could use more, but that isn't covered here and does not offer a significant advantage ● Firewalls are at (or near) a default configuration – Conversion to HA from an existing install is possible, but can be tricky. See the July 2016 hangout for details ● Firewalls have identical interfaces assigned in identical order ● Firewalls have a non-conflicting configuration – Different IP addresses in the same networks – DHCP off on secondary until it is properly configured for HA
  10. Pick a Sync Interface ● One interface will interconnect between units for XMLRPC and pfsync traffic ● Name it the “Sync” interface ● Do not name it a “CARP interface”, no CARP/heartbeat traffic on it, it does not factor into failover directly! ● Can consume a significant amount of bandwidth for state synchronization, especially in environments with lots of state churn ● Technically optional but highly recommended – If no physical interface is available, use a VLAN – If no VLAN is possible, it could share a secure internal interface, but that can still be dangerous/insecure ● pfsync has no authentication, any device on the segment could insert state data
  11. Interface Assignments ● Nodes must have the same number of interfaces and they must be assigned in identical order. ● The order of interfaces on Interfaces > Assignments must be identical! ● For state synchronization to work, interfaces must be the same type as well (e.g. igbX, emX, etc) – If that is not possible, interfaces could be added to LAGG instances so the assigned interfaces can match (e.g. lagg0 on both, rather than igb0 and em0) ● If the interface order does not match… – Areas such as firewall rules will appear to sync to the “wrong” interface on other nodes – Rules or other settings might “disappear” or otherwise break/misbehave (really ending up on the wrong interface)
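A quick way to reason about the "identical order, same type" rule is to compare the assignment lists of both nodes position by position. The sketch below assumes each node's assignments are available as an ordered list of (assigned name, device) pairs. That data layout, the function names, and the "strip trailing digits" driver guess are all assumptions for illustration and not how pfSense stores or checks anything.

```python
def driver_of(device: str) -> str:
    """Guess the driver name by stripping the trailing unit number (igb0 -> igb).

    Simplification: does not handle VLAN-style names such as igb0.100.
    """
    return device.rstrip("0123456789")


def assignment_mismatches(primary: list, secondary: list) -> list:
    """Return human-readable problems between two ordered assignment lists."""
    problems = []
    if len(primary) != len(secondary):
        problems.append(
            f"interface count differs: {len(primary)} vs {len(secondary)}"
        )
    for pos, ((p_name, p_dev), (s_name, s_dev)) in enumerate(zip(primary, secondary)):
        if p_name != s_name:
            problems.append(f"position {pos}: {p_name} vs {s_name}")
        elif driver_of(p_dev) != driver_of(s_dev):
            problems.append(
                f"{p_name}: {p_dev} vs {s_dev} (different type, consider a LAGG)"
            )
    return problems


if __name__ == "__main__":
    pri = [("wan", "igb0"), ("lan", "igb1"), ("opt1", "igb2")]
    sec = [("wan", "em0"), ("lan", "igb1"), ("opt1", "igb2")]
    for problem in assignment_mismatches(pri, sec):
        print(problem)
```

If both nodes used `lagg0` for WAN instead, the same check would pass, which is the workaround the slide describes.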
  12. IP Address Requirements ● CARP requires a static IP address WAN for full functionality – DHCP or PPPoE WAN may work in some cases, but not seamless failover – For IPv6, static addressing is a hard requirement; DHCPv6 is not feasible ● Ideally, three IP addresses per subnet per address family (except Sync) – One IP address of each family type per node (e.g. one IPv4, one IPv6), plus at least one CARP VIP of each family for each interface – Each WAN should be a /29 or larger for IPv4, a standard IPv6 /64 is sufficient, but no smaller than a /126 – Sync interface has no CARP and thus does not need an additional IP address, can be IPv4 only ● Single IPv4 address CARP is possible but not generally recommended – For WANs, it means only the active master may communicate out for gateway monitoring, updates, package installs, etc. ● OK for secondary or additional WANs so long as the firewalls do not need to reach outbound individually on that circuit ● If only an IPv4 /30 is possible on WAN, feel free to use it (with the above caveats) – LANs generally do not have IP address shortages so it can work there, but could break DHCP failover ● All nodes MUST connect to all WANs identically, it is not feasible to have the primary on one ISP and the secondary on a different ISP – See the July 2016 hangout for Multi-WAN HA ● Reminder: IPv6 clusters must have separate routed subnets for each interface – Local subnets can all be under one larger subnet (e.g. LAN, DMZ, etc all under a /60 routed in via WAN)
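The "/29 or larger" recommendation follows from simple address counting, sketched below with Python's standard `ipaddress` module. The function names are made up for this example, and counting one extra address for the upstream gateway on WAN is an assumption added here. The slide itself only states three addresses per subnet.

```python
import ipaddress


def usable_ipv4_hosts(cidr: str) -> int:
    """Usable host addresses in an IPv4 subnet (network/broadcast excluded)."""
    net = ipaddress.IPv4Network(cidr, strict=False)
    if net.prefixlen >= 31:
        return net.num_addresses
    return net.num_addresses - 2


def fits_ha(cidr: str, nodes: int = 2, vips: int = 1, other_hosts: int = 1) -> bool:
    """True if the subnet holds one address per node, the CARP VIP(s), and
    any other required hosts (assumed here: one upstream gateway on WAN)."""
    return usable_ipv4_hosts(cidr) >= nodes + vips + other_hosts


if __name__ == "__main__":
    for subnet in ("198.51.100.0/30", "198.51.100.0/29", "198.51.100.0/24"):
        print(subnet, usable_ipv4_hosts(subnet), fits_ha(subnet))
```

A /30 cannot hold two node addresses, a VIP, and a gateway, which is why it only works in the single-address CARP arrangement with the caveats listed on the slide.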
  13. Check VHID/VRID Usage ● CARP/VRRP/HSRP use similar mechanisms ● MAC address of CARP VIP is determined by VHID – 00:00:5e:00:01:<VHID in hex> – See https://docs.google.com/spreadsheets/d/17CqR6iAAXHXfU0h4uatzuY0w7k0ijy22dES6joLdTL0/edit?usp=sharing ● Overlapping IDs will cause MAC conflicts among other issues ● To find if any are in use... – Diag > Packet Capture, set Protocol to CARP – Capture for several seconds minimum – If any packets are observed, check VHID/VRID (may need to load capture in Wireshark) – Note the used ID(s) and avoid using them with CARP VIPs
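The VHID-to-MAC mapping on this slide is easy to reproduce when checking a packet capture. A minimal sketch follows. The function names are hypothetical, and the 1 to 255 range check is an assumption based on the VHID occupying the single last octet of the MAC.

```python
def carp_vip_mac(vhid: int) -> str:
    """Return the virtual MAC for a VHID: 00:00:5e:00:01:<VHID in hex>."""
    if not 1 <= vhid <= 255:
        raise ValueError("VHID is assumed to be in the range 1-255")
    return f"00:00:5e:00:01:{vhid:02x}"


def conflicting_ids(observed: set, planned: set) -> set:
    """IDs already seen on the segment that the new CARP VIPs would reuse."""
    return observed & planned


if __name__ == "__main__":
    print(carp_vip_mac(200))
    # IDs noted from a packet capture vs. the VHIDs planned for this cluster.
    print(conflicting_ids({1, 10}, {1, 200}))
```

Any ID returned by `conflicting_ids` should be avoided, since both devices would claim the same virtual MAC address.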
  14. Basic Setup Pre-requisites ● Give each node a unique hostname – Ex: fw-a/fw-b, fw-pri/fw-sec, fw-1/fw-2 ● Adjust IP addresses on each node so they do not directly conflict – Ex: Primary LAN to 192.168.1.2, Secondary to .3 ● DHCP Must be disabled on the secondary until it is configured for HA ● GUI must be running same protocol and port on both nodes – Ex: HTTPS on port 443 ● The sync account (e.g. admin) cannot be disabled and must have the same password on both nodes – On 2.4, any user can be used for synchronization, provided it has the “System - HA node sync” privilege ● Both nodes must have a static IP address WAN configured in the same WAN subnet with a proper gateway and so on ● Both nodes must have DNS configured properly either using the DNS Resolver with forwarding disabled, or by having DNS servers set under System > General Setup
  15. Switch Setup ● CARP uses multicast, so the switch cannot block, filter, limit, or otherwise interfere with multicast – IGMP snooping on some switches can conflict and may need to be disabled ● Nearly all CARP status problems, such as dual master scenarios, are due to switch or other layer 2 issues ● The switch must, at least: – Allow Multicast traffic to be sent and received by the firewall without interference on ports using CARP VIPs. – Allow traffic to be sent and received by the firewall using multiple MAC addresses – Allow the CARP VIP MAC address to move between ports ● Virtual/Hypervisor switches often have issues with one or more of the above and require adjustments such as enabling Promiscuous Mode, Forged Transmits, and allowing MAC address changes
  16. Configuration! ● Reminder: Keep secondary disconnected from any network until it has a basic non-conflicting interface configuration – Otherwise it could cause problems with DHCP, IP address conflicts, etc. ● Setup Sync interface ● Configure pfsync ● Configure XMLRPC ● Add CARP VIPs ● Setup Manual Outbound NAT ● Setup DHCP ● Setup DHCPv6 / Router Advertisements ● VPNs, other services ● Adding more Interfaces
  17. Setup Sync Interface ● Interface config – Enable the interface, name it Sync – Set for a static IPv4 address in the chosen sync subnet ● Ex: Primary as 172.16.1.2/24, secondary as .3 – IPv6 is optional here, can be used for XMLRPC sync – Do not check Block Private Networks or Bogons ● Add firewall rules for sync – On Primary: ● Add rule to pass TCP/443 to the Sync address for GUI ● Add rule to pass pfsync from Sync net to any. ● Optionally add a rule to pass ICMP echoreq to/from Sync net – On Secondary: ● Add rule to pass any protocol from Sync net to any ● Rule is different so it's obvious when it has been replaced by XMLRPC sync!
  18. Configure pfsync ● Enable on BOTH nodes! ● Navigate to System > High Avail Sync ● State Synchronization Settings (pfsync) section ● Check Synchronize States ● Set Synchronize Interface to Sync ● Set the pfsync Synchronize Peer IP to the IP address of the other node – Ex: On the Primary, enter the IP address of the secondary, 172.16.1.3, and vice versa. – Technically this setting is optional. Without it set, state sync is sent via multicast rather than unicast. With only two nodes, unicast is more reliable ● Click Save
  19. Configure XMLRPC ● Enable only on the Primary node! ● Navigate to System > High Avail Sync ● Configuration Synchronization Settings (XMLRPC Sync) section ● Set Synchronize Config to IP to the Sync interface IP address on the secondary node (e.g. 172.16.1.3) ● Set Remote System Username to admin – On 2.4+, any user will work so long as they are admin or have the “System - HA node sync” privilege ● Note that this account MUST exist on the secondary for sync to function! It is easier to use admin for now and change it after. – On 2.3.x and earlier, admin is the only user that will work ● Set Remote System Password to the sync account password ● Check the boxes for each area to synchronize ● Click Save ● On the secondary, check and see if the rule changes on the Sync interface carried over ● From here on, do not make any changes on the secondary in an area that will sync!
  20. Add CARP VIPs ● Navigate to Firewall > Virtual IPs on Primary node ● Add a CARP type VIP for each interface except Sync ● Skew on primary should be 0/1, secondary will end up higher via XMLRPC sync which adjusts the skew when copying ● Example: – WAN CARP VIP 198.51.100.200/24, VHID Group 200, random password/confirm, Base=1, Skew=0 – LAN CARP VIP 192.168.1.1/24, VHID=1, random password/confirm, Base=1, Skew=0 – Subnet mask must match the interface subnet! ● If a VIP is sensitive to latency (e.g. Secondary is in another building), try moving Base up by 1 until stability is achieved ● Check Status > CARP – If CARP shows disabled on either node, enable it – Primary should show MASTER, Secondary BACKUP ● Repeat for more VIPs as needed ● If there will be many VIPs in a single interface + subnet, use IP alias VIPs w/CARP VIP as their parent interface – Reduces CARP advertisements, and they switch as a group instead of individually ● Do NOT add CARP VIPs to interfaces that will be down/disabled! This will cause the firewall to demote itself, believing it has a problem.
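The "subnet mask must match the interface subnet" rule is a common source of mistakes (e.g. entering /32 on a CARP VIP). A small sanity check, using only the standard `ipaddress` module, might look like this. The function name is made up for the example and the addresses are the ones from the slides.

```python
import ipaddress


def vip_matches_interface(vip: str, interface: str) -> bool:
    """True if the CARP VIP sits in exactly the same subnet (same mask) as the
    interface address and does not duplicate the interface address itself."""
    vip_if = ipaddress.ip_interface(vip)
    node_if = ipaddress.ip_interface(interface)
    return vip_if.network == node_if.network and vip_if.ip != node_if.ip


if __name__ == "__main__":
    print(vip_matches_interface("192.168.1.1/24", "192.168.1.2/24"))
    print(vip_matches_interface("192.168.1.1/32", "192.168.1.2/24"))
    print(vip_matches_interface("198.51.100.200/24", "198.51.100.201/24"))
```

The second call fails the check because a /32 VIP does not share the /24 network of the LAN interface, even though the address itself falls inside it.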
  21. Setup NAT ● Navigate to Firewall > NAT, Outbound tab on Primary node ● Change Mode to Manual or Hybrid ● In hybrid mode: – Add new rules to translate from LAN(s) source – Set the Translation to the CARP WAN VIP ● In manual mode: – Edit each rule for a local interface (e.g. LAN) – Set the Translation to the CARP WAN VIP ● DO NOT SET A SOURCE OF “ANY” on the NAT rules! ● An RFC1918 alias helps here for source (192.168.0.0/16, 172.16.0.0/12, 10.0.0.0/8) ● After adding/editing all rules, click Apply Changes ● If/when additional interfaces are added in the future, rules must be added manually! ● Add port forwards, 1:1 NAT if needed. May need more WAN VIPs if using multiple IP addresses
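To illustrate the RFC1918 alias idea, the sketch below checks whether a proposed outbound NAT source is a private IPv4 network covered by the three ranges listed on the slide, and rejects a source of "any". The function and constant names are hypothetical. pfSense itself does not expose such a check; this is just a way to reason about rule sources before entering them.

```python
import ipaddress

# The three private ranges named on the slide.
RFC1918 = [
    ipaddress.ip_network("192.168.0.0/16"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("10.0.0.0/8"),
]


def covered_by_rfc1918_alias(source: str) -> bool:
    """True if `source` is an IPv4 network fully inside one RFC1918 range."""
    if source.strip().lower() == "any":
        return False  # the slide warns against a source of "any"
    net = ipaddress.ip_network(source, strict=False)
    if net.version != 4:
        return False
    return any(net.subnet_of(private) for private in RFC1918)


if __name__ == "__main__":
    for src in ("192.168.1.0/24", "172.16.1.0/24", "198.51.100.0/24", "any"):
        print(src, covered_by_rfc1918_alias(src))
```

Sources that pass this check would be matched by a single outbound NAT rule that uses the alias as its source and the CARP WAN VIP as its translation address.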
  22. Setup DHCP (IPv4) ● Navigate to Services > DHCP Server, LAN tab on Primary ● Set DNS Server to the LAN CARP VIP ● Set Gateway to the LAN CARP VIP ● Set the Failover Peer IP to the actual LAN IP address of the secondary node, e.g. 192.168.1.3 ● Click Save ● Repeat for additional local interfaces if necessary ● Gateway must be the CARP VIP, DNS if using the firewall for DNS also ● Navigate to Status > DHCP Leases, check pool status, should be “normal”/“normal”
  23. Setup DHCPv6 / RA ● DHCPv6 has no concept of failover, so setup is tricky – DHCPv6 & RA settings will not sync due to this – There is no formal spec/RFC yet, only a draft with no implementation ● Navigate to Services > DHCPv6 Server & RA on Primary node ● Two main options: – Set RA to Unmanaged (SLAAC) and let clients determine their own addresses – Set RA to Managed + DHCPv6 independently using separate local pools ● e.g. Pri: x:x:x:x::1:0000-x:x:x:x::1:FFFF / Sec: x:x:x:x::2:0000-x:x:x:x::2:FFFF ● Gateway is handled by router advertisements, two choices there as well – On both, bind to CARP VIP, use Normal router priority ● radvd will start/stop with CARP status (preferred method) – Bind to LAN, set primary to High priority, set secondary to Low ● Set DNS to LAN CARP VIP in RA and DHCPv6 (if used)
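The separate-pool layout for Managed RA + DHCPv6 (x:x:x:x::1:0000 to ::1:FFFF on the primary, ::2:0000 to ::2:FFFF on the secondary) can be generated from the LAN prefix. This is a sketch under assumptions: the function name is invented, the node numbering (1 = primary, 2 = secondary) mirrors the slide's example, and a /64 prefix is required.

```python
import ipaddress


def dhcpv6_pool(prefix: str, node_index: int):
    """Return (start, end) of a per-node pool inside a /64.

    Node N gets <prefix>::N:0000 through <prefix>::N:ffff, so the pools of
    different nodes never overlap.
    """
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("this sketch assumes a /64 prefix")
    if not 1 <= node_index <= 0xFFFF:
        raise ValueError("node_index must fit in one 16-bit group")
    start = net.network_address + (node_index << 16)
    end = start + 0xFFFF
    return start, end


if __name__ == "__main__":
    for name, index in (("primary", 1), ("secondary", 2)):
        start, end = dhcpv6_pool("2001:db8:1::/64", index)
        print(name, start, end)
```

Each range would be entered by hand on its own node, since the slide notes that DHCPv6 and RA settings do not sync.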
  24. VPNs, Additional Interfaces ● If the firewalls are using the default DNS (DNS Resolver, Unbound) you must visit Services > DNS Resolver and press Save at least once, otherwise local clients cannot use the CARP VIP for DNS resolution. ● For VPNs and other local services, if they require binding to only one IP address, set the Interface to a CARP VIP (e.g. IPsec, OpenVPN) ● Support in packages varies but some have support for CARP VIPs, XMLRPC, or CARP status detection ● ACME package – Install it only on the primary – Make a cert with SAN entries for both hosts individually and the CARP VIP all in a single cert – Cert will sync to secondary automatically via XMLRPC since certificates already sync ● When adding a new local interface: – Assign the interface on both nodes identically – Enable the interface on both nodes, using different IP addresses within the same subnet – Add a CARP VIP inside the new subnet (Primary node only) – Add firewall rules (Primary node only) – Add Manual Outbound NAT for a source of the new subnet, utilizing the CARP VIP for translation (Primary node only) – Configure the DHCP/DHCPv6/RA server for the new subnet, utilizing the CARP VIP for DNS and Gateway roles (Optional, Primary node only)
  25. Testing ● Verify that a client on the LAN can pass through the cluster (ping / browse to Internet host) ● Verify XMLRPC by checking if a setting syncs, and via Status > Filter Reload, Force Config Sync ● Verify CARP by checking Status > CARP – If any VIPs show as INIT, then an interface is down. Fix the interface or remove the VIP. – Don’t be tempted by the “Reset Demotion Factor” button, it’s not a permanent fix. ● Verify state sync by checking pfsync nodes on Status > CARP and contents of Diag > States ● Testing Failover: – Status > CARP, enter maintenance mode or disable CARP on primary – Check status on secondary, should now be MASTER – Test LAN client connectivity, DHCP, etc – Enable CARP on Primary – Retest connectivity ● Downloading a file, streaming audio, or streaming video will most likely continue uninterrupted. VoIP-based phone calls may have a slight disruption as they are not buffered like the others.
  26. Troubleshooting ● Review the config ● Check CARP status, if VIP is INIT, check interface link, edit/save/apply VIP ● Check for conflicting VHIDs ● Check subnet mask on CARP VIPs ● Switch/L2 issues – Ensure boxes are on the correct switch/VLAN/L2 – Try to ping between the nodes on the affected interface – Ensure switches are properly trunking, if applicable – Try another switch (especially if using a modem/CPE switch) – Disable IGMP snooping, broadcast/multicast storm control, etc. – For vswitches, check promiscuous mode, forged transmits, MAC changes ● Check system logs, firewall logs, notices
  27. Upgrading ● Review changelog/blog/upgrade guide – https://doc.pfsense.org/index.php/Upgrade_Guide – https://doc.pfsense.org/index.php/Redundant_Firewalls_Upgrade_Guide ● Take a backup – Cannot stress this enough ● If the cluster is running an older (2.2.x or before) version, disable XMLRPC on primary ● Upgrade secondary ● Check secondary – Check logs, status pages, and so on to ensure everything looks OK ● Switch CARP to maintenance mode on primary (sticks across reboots) ● Test secondary, ensure proper connectivity and service function – If something fails, simply switch out of maintenance mode on the primary then repair the secondary ● Upgrade primary ● Exit maintenance mode on primary ● Test again ● If XMLRPC was disabled, enable it again, then test sync
  28. Conclusion ● July 2016 hangout in the archive has some advanced HA topics: – Multi-WAN with HA – Converting an existing single firewall to HA cluster ● For IPv6 basics, see the July 2015 hangout ● Questions? ● Ideas for hangout topics? Post on forum, comment on the blog posts, Reddit, etc