Managing Open vSwitch Across a Large Heterogeneous Fleet
Open vSwitch (OVS) is one of the more popular ways to provide VM connectivity in OpenStack. Rackspace has been using Open vSwitch in production since late 2011. In this session, we will detail the challenges faced with managing and upgrading Open vSwitch across a large heterogeneous fleet. Finally, we will share some of the tools we have created to monitor OVS availability and performance.
1.
Managing Open vSwitch
Across a large heterogeneous fleet
Andy Hill @andyhky
Systems Engineer, Rackspace
Joel Preas @joelintheory
Systems Engineer, Rackspace
2.
Some Definitions
Heterogeneous:
• Several different hardware manufacturers
• Several XenServer major versions (sometimes on varying kernels)
• Five hardware profiles
Large Fleet:
• Six production public clouds
• Six internal private clouds
• Various non-production environments
• Tens of thousands of hosts
• Hundreds of thousands of instances
4.
History
• Rackspace has used Open vSwitch since the pre-1.0 days
• Behind most of First Generation Cloud Servers (Slicehost)
• Powers 100% of Next Generation Cloud Servers
• Upgraded OVS on Next Gen hypervisors 9 times over 2 years
5.
Upgrade Open vSwitch
If you get nothing else from this talk, upgrade OVS!
6.
Why upgrade?
Reasons we upgraded:
• Performance
• Less impactful upgrades
• NSX Controller version requirements
• Nasty regression in 2.1 [96be8de]
http://bit.do/OVS21Regression
• Performance
7.
Performance
• Broadcast domain sizing
• Special care in ingress broadcast flows
• Craft flows that explicitly allow broadcast traffic destined for the host (see the sketch below)
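One way to express that idea is with explicit OpenFlow rules. A minimal sketch with ovs-ofctl, not our production flows; the bridge name, ingress port, and instance IP are placeholders:

# Illustrative only: permit ARP broadcasts aimed at a local instance's IP and
# drop all other broadcast frames arriving on the ingress port.
ovs-ofctl add-flow xapi1 "priority=200,in_port=1,arp,arp_tpa=10.1.2.3,actions=normal"
ovs-ofctl add-flow xapi1 "priority=100,in_port=1,dl_dst=ff:ff:ff:ff:ff:ff,actions=drop"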
8.
Performance
• The Dark Ages (< 1.11)
• Megaflows (>= 1.11)
• Ludicrous Speed (>= 2.1)
9.
The Dark Ages (< 1.11)
• Flow-eviction-threshold = 2000
• Single threaded
• 12 point match for datapath flow
• 8 upcall paths for datapath misses
• Userspace hit per bridge (2x the lookups)
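A few spot checks that show whether a pre-1.11 host is suffering here. This is a sketch; the bridge/datapath names are placeholders, and the exact location of the flow-eviction-threshold key varied by release (assumed here to be per-bridge other-config):

# List datapaths with their flow counts and hit/missed/lost upcall counters.
ovs-dpctl show
# Count flows in a specific datapath (name taken from the show output).
ovs-dpctl dump-flows xapi1 | wc -l
# Check whether a flow-eviction-threshold override is set on the bridge.
ovs-vsctl list bridge xapi1 | grep -i flow-eviction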
10.
Megaflows (1.11+)
• Wildcard matching on datapath
• Less likely to hit flow-eviction-threshold
• Some workloads still had issues
• Most cases datapath flows cut in half or better
11.
Ludicrous speed (2.1+)
• RIP flow-eviction-threshold
• 200000 datapath flows (configurable)
• In the wild, we have seen over 72K datapath flows / 260K pps
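On 2.1+, the knob behind "configurable" is the flow-limit entry in the Open_vSwitch table's other_config (200000 by default). A minimal sketch of inspecting and raising it, assuming the documented key name:

# Show current other_config and raise the datapath flow limit.
ovs-vsctl get Open_vSwitch . other_config
ovs-vsctl set Open_vSwitch . other_config:flow-limit=250000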
13.
Mission Accomplished!
We moved the bottleneck!
New bottlenecks:
● Guest OS kernel configuration
● Xen Netback/Netfront Driver
14.
Upgrade, Upgrade, Upgrade
If you package Open vSwitch, don’t leave your customers in The Dark Ages
Open vSwitch 2.3 is LTS
15.
Upgrade process
• Ansible Driven (async - watch your SSH timeouts)
• /etc/init.d/openvswitch force-reload-kmod
• bonded: <= 30 sec of data plane impact
• non-bonded: <= 5 sec of data plane impact
http://bit.do/ovsupgrade
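A minimal ad-hoc sketch of that pattern (host group and timeout values are placeholders): running the reload as an async background task keeps a slow kernel-module reload from tripping the SSH timeout.

# Run force-reload-kmod in the background (-B seconds) and poll every -P seconds.
ansible hypervisors -m shell -B 600 -P 15 \
  -a "/etc/init.d/openvswitch force-reload-kmod"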
16.
Bridge Fail Modes
Secure vs. Normal bridge fail mode
‘Normal’ acts as a learning L2 switch (the default); we override it to ‘Secure’
• Critical XenServer bug with Windows causing full host reboots (CTX140814)
• Bridge fail mode change is a datapath-impacting event
• Fail modes do not persist across reboots in XenServer unless set in bridge other-config
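For reference, a minimal sketch of checking and changing the fail mode with ovs-vsctl (bridge name is a placeholder); remember that the change itself touches the datapath, and on XenServer it still has to be mirrored into other-config to survive a reboot:

# Inspect and set the bridge fail mode.
ovs-vsctl get-fail-mode xapi1
ovs-vsctl set-fail-mode xapi1 secure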
17.
Patch Ports and Bridge Fail Modes
• Misconfigured patch ports + ‘Normal’ Bridge Fail mode
• Patch ports do not persist across reboots; set up from cron.reboot since no hypervisor hook is available (sketch below)
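A hedged sketch of wiring a patch-port pair between two bridges (bridge and port names are placeholders); this is the kind of command that has to be replayed from cron.reboot:

# Create a patch-port pair connecting xapi1 and xapi2.
ovs-vsctl add-port xapi1 patch-to-xapi2 -- \
  set interface patch-to-xapi2 type=patch options:peer=patch-to-xapi1
ovs-vsctl add-port xapi2 patch-to-xapi1 -- \
  set interface patch-to-xapi1 type=patch options:peer=patch-to-xapi2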
18.
Bridge Migration
OVS Upgrades required all bridges to be secured
1. Create new bridge
2. Move VIFs from old bridge to new bridge (loss of a couple of packets; see the sketch after this list)
3. Upgrade OVS
4. Ensure bridge fail mode change persists across reboot
5. Clean up
Entire process orchestrated with Ansible
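Step 2 reduces to one ovs-vsctl transaction per VIF, roughly like this sketch (bridge and VIF names are placeholders):

# Move a VIF from the old bridge to the new bridge in a single transaction.
ovs-vsctl -- del-port xapi0 vif5.0 -- add-port xapi1 vif5.0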
19.
Kernel Modules
Running Kernel | OVS Kernel Module | Staged Kernel | Reboot Outcome
vABC           | OVSvABC           | None          | Everything’s Fine
vABC           | OVSvABC           | vDEF          | No Networking
vABC           | OVSvABC, OVSvDEF  | vDEF          | Everything’s Fine
20.
Kernel Modules
• Ensure proper OVS kernel modules are in place
• Kernel Upgrade = OVS kernel module upgrade
• More packaging work to do for a heterogeneous environment
• Failure to do so can force a trip to a Java console
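A minimal sketch of the kind of pre-flight check that keeps a host out of the "No Networking" row above (paths are typical, not guaranteed for every platform): confirm an openvswitch kernel module is staged for every installed kernel before rebooting.

# Flag any installed kernel that has no openvswitch module staged.
for k in /lib/modules/*; do
  if ! find "$k" -name 'openvswitch*.ko*' | grep -q .; then
    echo "MISSING openvswitch module for kernel $(basename "$k")"
  fi
done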
21.
Other Challenges with OVS
• Tied to old version because $REASON
• VLAN Splinters/ovs-vlan-bug-workaround
• Hypervisor Integration
• Platforms: LXC, KVM, XenServer 5.5 and beyond
22.
Measuring OVS
PavlOVS sends these metrics to StatsD/graphite:
• Per bridge byte_count, packet_count, flow_count
• Instance count
• OVS CPU utilization
• Aggregate datapath flow_count, missed, hit, lost rates
These are aggregated per region->cell->host
Useful for DDoS detection (Graphite highestCurrent())
Scaling issues with Graphite/StatsD
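PavlOVS itself is not shown here, but a minimal sketch of where a couple of those numbers come from and how a gauge could be pushed to StatsD (bridge name, metric prefix, and StatsD endpoint are placeholders):

# Per-bridge flow count from dump-aggregate, shipped to StatsD as a gauge.
flows=$(ovs-ofctl dump-aggregate xapi1 | grep -o 'flow_count=[0-9]*' | cut -d= -f2)
echo "ovs.xapi1.flow_count:${flows}|g" | nc -u -w1 statsd.example.com 8125
# Aggregate datapath hit/missed/lost counters come from:
ovs-dpctl show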
23.
OVS in Compute Host Lifecycle
Ovsulate - Ansible module that checks hosts into the NVP/NSX controllers. It can fail if routes are bad or the host certificate changes, i.e. when a host is re-kicked. We first made sure it failed explicitly, and later added logic to delete the existing registration on provisioning.
24.
Monitoring OVS
Connectivity to SDN controller
• ovs-vsctl find manager is_connected=false
• ovs-vsctl find controller is_connected=false
SDN integration process (ovs-xapi-sync)
• pgrep -f ovs-xapi-sync
Routes
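Those checks compose into a small nagios-style probe; a hedged sketch (exit codes and message wording are illustrative, not part of OVS):

# Alert if any Manager/Controller record is disconnected or ovs-xapi-sync died.
if ovs-vsctl find manager is_connected=false | grep -q . \
   || ovs-vsctl find controller is_connected=false | grep -q . \
   || ! pgrep -f ovs-xapi-sync >/dev/null; then
  echo "CRITICAL: OVS SDN connectivity problem"
  exit 2
fi
echo "OK"
exit 0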
28.
Monitoring OVS
XSA-108 - AKA Rebootpocalypse 2014
• Incorrect kmods on reboot may require OOB access to fix!
• Had monitoring in place
• Pre-flight check for KMods just in case