OVN DBs HA with scale test
Aliasgar Ginwala (aginwala@ebay.com)
IRC: aginwala
What components can be improved with scale test?
● OVN-Controller on computes/GWs – ongoing discussions and WIP upstream
● OVS-vSwitchd on computes/GWs – performance improved with help of community.
● OVN-Northd on central nodes – ongoing discussions and WIP upstream
Why scale test?
● To see how OVN behaves when deployed at scale.
● Ensure that an entire availability zone of a big cloud deployment can be simulated correctly.
● Find bugs as early as possible to improve OVN.
What to use for scale test?
● OVN Scale test
○ When something fails, performs slowly or doesn't scale, it is really hard to answer the "what", "why" and "where" questions without a solid scalability testing framework.
○ OpenStack Rally is a very convenient benchmarking tool, so OVN scale test leverages it.
○ It is a plugin of OpenStack Rally.
○ It is open source and maintained under the same base project, Open vSwitch.
○ It is intended to provide the community with an OVN control plane scalability test tool that is capable of performing specific, complicated and reproducible test cases on simulated scenarios.
○ Rally must be installed, as the workflow is similar to Rally's.
○ Upstream scale test repo @ https://github.com/openvswitch/ovn-scale-test
○ User guide @ http://ovn-scale-test.readthedocs.org/en/latest/
Rally OVS
● To run OVN scale test, you don’t need OpenStack installed - instead you just need rally installed.
● Main keywords :
○ Deployment = any cloud deployment consisting of all network and compute components.
○ Task = Any CRUD operations on compute, farm and network components like lports, lswitches, lrouters, etc.
○ Farm = collection of sandboxes
○ Sandbox = a chassis (hypervisor/compute node/ovs sandbox)
Base counters considered for an availability zone
● 8 lrouters
● 5 lswitches per lrouter
● 250 lports per lswitch
● Total lports: 10k
● Total chassis: 1k
● Total BMs that host chassis: 20
● Total control plane nodes: 3
● 10 lports (VMs) per chassis
● OS: Ubuntu 16.04 with 4.4 kernel
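The counters above are mutually consistent, which a quick calculation confirms. This is only a minimal sketch; the variable names are illustrative and are not part of OVN or rally-ovs.

```python
# Base counters for one simulated availability zone (values from the list above).
lrouters = 8
lswitches_per_lrouter = 5
lports_per_lswitch = 250
chassis = 1000
bare_metal_hosts = 20
lports_per_chassis = 10

total_lswitches = lrouters * lswitches_per_lrouter
total_lports = total_lswitches * lports_per_lswitch
# Each bare-metal host (farm) carries an equal share of the simulated chassis.
chassis_per_host = chassis // bare_metal_hosts

print(total_lswitches)   # 40
print(total_lports)      # 10000
print(chassis_per_host)  # 50
# The port total also matches 10 lports on each of the 1k chassis.
print(chassis * lports_per_chassis == total_lports)  # True
```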
OVSDB service models
● OVSDB supports three service models for databases:
○ Standalone
○ Active-Backup
○ Clustered
● The service models provide different compromises among consistency, availability, and partition tolerance.
● They also differ in the number of servers required and in terms of performance.
● The standalone and active-backup database service models share one on-disk format, and clustered databases use a different format [1].
[1] https://github.com/openvswitch/ovs/blob/80c42f7f218fedd5841aa62d7e9774fc1f9e9b32/Documentation/ref/ovsdb.7.rst
OVN DBs Active-standby using pacemaker
[Diagram: three central nodes (Node1, Node2, Node3) in a Pacemaker cluster, each running NB, Northd and SB, with one node active and the others standby. The CMS (Neutron) and the hypervisors (HV) each reach the active node through an LB VIP.]
Alternatively, this LB VIP can be replaced by:
● Option 2: BGP advertising the VIP on each node
● Option 3: put all 3 nodes on the same rack and use pacemaker to manage the VIP too.
Start OVN DBs using pacemaker
● Let pacemaker manage the VIP resource.
● Using LB VIP:
○ set listen_on_master_ip_only=no
○ The active node then listens on 0.0.0.0 so that the LB VIP can connect on the respective SB and NB DB ports
pcs resource create ip-192.168.220.108 ocf:heartbeat:IPaddr2 ip=192.168.220.108 op monitor interval=30s
pcs resource create ovndb_servers ocf:ovn:ovndb-servers manage_northd=yes master_ip=192.168.220.108 \
    nb_master_port=6641 sb_master_port=6640 --master
pcs resource meta ovndb_servers-master notify=true
pcs constraint order start ip-192.168.220.108 then promote ovndb_servers-master
pcs constraint colocation add ip-192.168.220.108 with master ovndb_servers-master
OVN DBs – Raft Clustering
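[Diagram: three central nodes (Node1, Node2, Node3), each running NB, Northd and SB. The databases are clustered, with one node acting as cluster leader. The CMS (Neutron) and the hypervisors (HV) each connect through an LB VIP.]
Northd uses an OVSDB named lock to ensure that only one instance is active; the others are standby.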
Starting OVN DBs using clustering
● For LB VIP:
○ Set connection table to listen on 0.0.0.0 on all nodes
● For chassis:
○ Point it either to the VIP, e.g. tcp:<vip_ip>:6642
○ Or to all central node IPs, e.g. "tcp:192.168.220.101:6642,tcp:192.168.220.102:6642,tcp:192.168.220.103:6642"
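The comma-separated remote string for the chassis can be assembled from the list of central node IPs. The sketch below is illustrative: `build_ovn_remote` is a hypothetical helper, and the printed `ovs-vsctl` line assumes the chassis reads its SB DB address from the `external_ids:ovn-remote` setting, as ovn-controller normally does.

```python
def build_ovn_remote(node_ips, port=6642, proto="tcp"):
    """Join central node addresses into one comma-separated remote string."""
    if not node_ips:
        raise ValueError("at least one central node IP is required")
    return ",".join(f"{proto}:{ip}:{port}" for ip in node_ips)


# Central node IPs taken from the example above.
central_nodes = ["192.168.220.101", "192.168.220.102", "192.168.220.103"]
remote = build_ovn_remote(central_nodes)
print(remote)
# Command that would apply the setting on a chassis (printed only, not executed).
print(f'ovs-vsctl set open_vswitch . external_ids:ovn-remote="{remote}"')
```

Passing a single-element list with the VIP address yields the first variant (one `tcp:<vip_ip>:6642` entry).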
How to set up the scale test env?
• Create a deployment, which installs the necessary packages/binaries on a BM
– rally-ovs deployment create --file ovn-multihost.json --name ovn-overlay
{
  "type": "OvnMultihostEngine",
  "controller": {
    "type": "OvnSandboxControllerEngine",
    "deployment_name": "ovn-new-controller-node",
    "ovs_repo": "https://github.com/openvswitch/ovs.git",
    "ovs_branch": "branch-2.9",
    "ovs_user": "root",
    "net_dev": "eth0",
    "controller_cidr": "192.168.10.10/16",
    "provider": {
      "type": "OvsSandboxProvider",
      "credentials": [
        {
          "host": "10.x.x.x",
          "user": "root"
        }
      ]
    }
  },
  "nodes": [
    {
      "type": "OvnSandboxFarmEngine",
      "deployment_name": "ovn-farm-node-31",
      "ovs_repo": "https://github.com/openvswitch/ovs.git",
      "ovs_branch": "branch-2.9",
      "ovs_user": "root",
      "provider": {
        "type": "OvsSandboxProvider",
        "credentials": [
          {
            "host": "10.x.x.x",
            "user": "root"
          }
        ]
      }
    }
  ]
}
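With 20 BMs acting as farms, the "nodes" list gets repetitive to write by hand. The following sketch generates one farm entry per host, mirroring the fields of the deployment file above. The `farm_node` helper and the `10.0.0.x` host addresses are hypothetical placeholders, not values from the actual test bed.

```python
import json


def farm_node(index, host, branch="branch-2.9"):
    """Return one OvnSandboxFarmEngine entry with the fields shown above."""
    return {
        "type": "OvnSandboxFarmEngine",
        "deployment_name": f"ovn-farm-node-{index}",
        "ovs_repo": "https://github.com/openvswitch/ovs.git",
        "ovs_branch": branch,
        "ovs_user": "root",
        "provider": {
            "type": "OvsSandboxProvider",
            "credentials": [{"host": host, "user": "root"}],
        },
    }


# Hypothetical farm host addresses; replace with the real BM IPs.
farm_hosts = [f"10.0.0.{i}" for i in range(1, 21)]
nodes = [farm_node(i, host) for i, host in enumerate(farm_hosts, start=1)]

print(len(nodes))
print(json.dumps(nodes[0], indent=2))
```

The resulting list can be placed under the "nodes" key of the deployment file before running `rally-ovs deployment create`.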
[Diagram: the Rally-ovs host connects over ssh, through the TOR switch, to the OVN central node and to OVN Farm1 through OVN Farm20.]
How to set up the scale test env?
• Running the create_sandbox task is equivalent to converting the BM into a compute node with OVS installed.
• rally-ovs task start create_sandbox.farm1.json
{
  "version": 2,
  "title": "Create sandbox",
  "description": "Creates 50 sandboxes on each farm",
  "tags": ["ovn", "sandbox"],
  "subtasks": [
    {
      "title": "Create sandbox on farm 1",
      "group": "ovn",
      "description": "",
      "tags": ["ovn", "sandbox"],
      "run_in_parallel": false,
      "workloads": [
        {
          "name": "OvnSandbox.create_sandbox",
          "args": {
            "sandbox_create_args": {
              "farm": "ovn-farm-node-1",
              "amount": 50,
              "batch": 10,
              "start_cidr": "192.230.64.0/16",
              "net_dev": "eth0",
              "tag": "TOR1"
            }
          },
          "runner": {
            "type": "constant",
            "concurrency": 4,
            "times": 1,
            "max_cpu_count": 4
          },
          "context": {
            "ovn_multihost": {
              "controller": "ovn-new-controller-node"
            }
          }
        }
      ]
    }
  ]
}
[Diagram: Rally-ovs connects over ssh, through the TOR switch, to the OVN central node and to OVN Farm1, which hosts the sandboxes HV1, HV2, ... HV50.]
How to set up the scale test env?
• Finally, create lrouters, lswitches and lports, and bind the lports to the chassis
• rally-ovs task start create_routers_bind_ports.json
{
  "OvnNetwork.create_routers_bind_ports": [
    {
      "runner": {
        "type": "serial",
        "times": 1
      },
      "args": {
        "port_create_args": {
          "batch": 100
        },
        "router_create_args": {
          "amount": 8,
          "batch": 1
        },
        "network_create_args": {
          "start_cidr": "172.145.1.0/24",
          "batch": 1
        },
        "networks_per_router": 5,
        "ports_per_network": 250,
        "port_bind_args": {
          "wait_up": true,
          "wait_sync": "none"
        }
      },
      "context": {
        "sandbox": {},
        "ovn_multihost": {
          "controller": "ovn-new-controller-node"
        }
      }
    }
  ]
}
[Diagram: Rally-ovs connects over ssh, through the TOR switch, to the OVN central node and to OVN Farm1; lports (lport1 ... lport20 ... lport500) are bound to the sandboxes HV1, HV2, ... HV50.]
OVN scale test with HA
● OVN scale test by default sets up one active standalone OVN DB.
● Hence, the HA cluster needs to be set up separately.
○ TODO: add support for deploying an HA cluster to OVN scale test, to avoid manual setup.
● To test HA, the chassis must point to the HA node setup. Set the param below in create_sandbox.json to the respective OVN DB HA VIP:
○ "controller_cidr": "192.168.10.10/16",
Scenarios – Active-standby using pacemaker
Scenario | Impact on control plane | Impact on data plane
Standby node reboot | No | No
Active node reboot | Yes (~5+ minutes, as the SB DB runs super hot resyncing the data) | Only newly created VMs/lports, until the SB DB cools down
All active and standby nodes reboot | Yes (a few minutes, depending on how soon the new node is up and data sync is finished) | No*
• * The entire NB DB data got flushed/lost, causing both control and data plane impact.
• * Discussion @ https://mail.openvswitch.org/pipermail/ovs-discuss/2018-August/047161.html
• * A fix was rolled out with help of upstream, and no issues have been reported so far.
• * Commit ecf44dd3b26904edf480ada1c72a22fadb6b1825
OVN DBs HA – Active-backup with pacemaker
● Current status
○ Basic functionality tested
○ Scale testing is ongoing, with findings reported and some major issues fixed with help of upstream.
○ Detailed scale test scenarios were reported to the community on the mailing list: https://mail.openvswitch.org/pipermail/ovs-discuss/2018-September/047405.html
○ Feedback and improvements requested from upstream folks
Scenarios – Clustered DBs
Scenario | Impact on control plane | Impact on data plane
Any active node reboot | No | No
All active nodes reboot | Yes (a few minutes, depending on how soon the new node is up, along with leader selection and data sync completion) | Not fully verified
Raft with scale test summary
● Current status
○ Basic functionality tested.
○ Scale testing is ongoing; problems were found when using rally-ovs (OVN scale test) with around 2k+ lports:
○ db="tcp:192.168.220.101:6641,tcp:192.168.220.102:6641,tcp:192.168.220.103:6641" -- wait-until Logical_Switch_Port lport_061655_SKbDHz up=true -- wait-until Logical_Switch_Port lport_061655_zx9LXe up=true -- wait-until Logical_Switch_Port Last stderr data: 'ovn-nbctl: tcp:192.168.220.101:6641,tcp:192.168.220.102:6641,tcp:192.168.220.103:6641: database connection failed (End of file)\n'.", "traceback": "Traceback (most recent call last):\n File "/ebay/home/aginwala/rally-repo/rally/rally/task/runner.py", line 66, in _run_scenario_once\n
○ Following up with the community to get it fixed soon; discussion @ https://mail.openvswitch.org/pipermail/ovs-dev/2018-May/347260.html
○ Upstream also has a Raft torture test among the test cases in the OVS repo, for testing locally.
Some tunings for both clustered and non clustered setup
• Netfilter TCP params on all central nodes:
– Since the tcp_max_syn_backlog and net.core.somaxconn values are too small, increase them to avoid TCP SYN flood messages in syslog:
• net.ipv4.tcp_max_syn_backlog = 4096
• net.core.somaxconn = 4096
• Pacemaker configurations
– When the SB DB starts on the new active node, it is very busy syncing data to all HVs.
– During this time, pacemaker monitoring can time out, so the timeout value for "op monitor" needs to be large enough to avoid an endless restart/failover loop.
– Hence, configure the pacemaker monitor for resource ovndb-servers: op monitor interval=60s timeout=50s
• Inactivity probe settings on all chassis
– Set the inactivity probe to 3 min, so that the central SB DB does not get overloaded with probe handling, while chassis can still notice the change if a failover happens.
• Upstart settings on all central nodes when using pacemaker:
– Disable the ovn-central and openvswitch-switch upstart jobs. Otherwise, when a node reboots, pacemaker thinks there is already an active pid, and all the nodes act as standalone nodes. The LB also gets confused and sends traffic to the standby node.
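A small check can flag central nodes whose TCP settings are still below the values listed above, and convert the 3 min probe interval into milliseconds, the unit OVSDB inactivity probes are normally configured in. This is a sketch under stated assumptions: `undersized` and `minutes_to_ms` are hypothetical helpers, and the `current` values are made-up sample readings rather than measurements.

```python
RECOMMENDED = {
    "net.ipv4.tcp_max_syn_backlog": 4096,
    "net.core.somaxconn": 4096,
}


def undersized(current, recommended=RECOMMENDED):
    """Return the settings whose current value is missing or below the target."""
    return {
        key: target
        for key, target in recommended.items()
        if current.get(key, 0) < target
    }


def minutes_to_ms(minutes):
    """Convert a probe interval in minutes to milliseconds."""
    return int(minutes * 60 * 1000)


# Sample values, as they might be read with `sysctl -n <key>` on a central node.
current = {"net.ipv4.tcp_max_syn_backlog": 128, "net.core.somaxconn": 4096}
for key, target in undersized(current).items():
    # Print the command that would raise the value (not executed here).
    print(f"sysctl -w {key}={target}")

print(minutes_to_ms(3))
```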
Promising outcome and more to go
• OVS-vSwitchd CPU utilization was running super high on chassis.
• Performance improved by making ofproto faster, and the results are amazing; the test completed in 3+ hours vs 8+ hours:
• Discussion @ https://mail.openvswitch.org/pipermail/ovs-discuss/2018-February/046140.html
• Commit c381bca52f629f3d35f00471dcd10cba1a9a3d99
CPU/Mem stats for active-standby
• Active central node
Component | CPU | Mem
OVN NB DB | 0.12 | 97392000
OVN SB DB | 0.92 | 777028000
OVN Northd | 6.78 | 825836000
• Chassis
Component | CPU | Mem
OVSDB server | 0.02 | 11672000
OVS-vSwitchd | 3.75 | 152812000
OVN-controller | 0.94 | 839188000
Note:
• Mem: RES memory, converted to bytes whether it is reported in MB, GB or TB.
• CPU: total CPU time the task has used since it started.
e.g. if the total CPU time for the current ovn-controller process is 6:26.90 (minutes:seconds.hundredths),
we convert it into an integer number of hundredths of a second with the following formula:
6 * 6000 + 26 * 100 + 90 = 38690
• The result is then converted to a delta (rate per second)
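The time conversion in the note can be written as a short function. This is a minimal sketch: `cpu_time_to_centiseconds` follows the formula given above, while `rate_per_second` is one assumed reading of "delta (speed per second)", namely CPU seconds consumed per wall-clock second between two samples.

```python
def cpu_time_to_centiseconds(value):
    """Convert 'minutes:seconds.hundredths' into hundredths of a second."""
    minutes, rest = value.split(":")
    seconds, hundredths = rest.split(".")
    return int(minutes) * 6000 + int(seconds) * 100 + int(hundredths)


def rate_per_second(prev_cs, curr_cs, interval_s):
    """CPU seconds used per wall-clock second between two samples (assumed definition)."""
    return (curr_cs - prev_cs) / 100 / interval_s


print(cpu_time_to_centiseconds("6:26.90"))  # 38690
```

For example, two samples taken 10 seconds apart that differ by 100 hundredths of a second give a rate of 0.1.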
Stuck?
• Reach out to the OVS community; it is super interactive and responsive.
• For any generic OVS queries/tech discussions, use ovs-discuss@openvswitch.org so that a wide variety of engineers can respond.
Thank You

 
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
Design and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdfDesign and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdf
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 

OVN DBs HA with scale test

  • 1. OVN DBs HA with scale test Aliasgar Ginwala (aginwala@ebay.com) IRC: aginwala
  • 2. What components can be improved with scale test? ● OVN-Controller on computes/GWs – ongoing discussions and WIP upstream ● OVS-vSwitchd on computes/GWs – performance improved with help of community. ● OVN-Northd on central nodes – ongoing discussions and WIP upstream
  • 3. Why scale test? ● To see how OVN behaves when deployed at scale. ● Ensure an entire availability zone can be simulated correctly for big cloud deployments. ● Find bugs as early as possible to improve OVN.
  • 4. What to use for scale test? ● OVN Scale test ○ When something fails, performs slowly or doesn't scale, it's really hard to answer questions about "what", "why" and "where" without a solid scalability testing framework. ○ Since OpenStack Rally is a very convenient benchmarking tool, OVN scale test builds on it. ○ It is a plugin of OpenStack Rally. ○ It's open source and maintained under the same base project, Open vSwitch. ○ Intended to provide the community with an OVN control plane scalability test tool that is capable of performing specific, complicated and reproducible test cases on simulated scenarios. ○ Rally must be installed, as the workflow is also similar to Rally's. ○ Upstream scale test repo @ https://github.com/openvswitch/ovn-scale-test ○ User guide @ http://ovn-scale-test.readthedocs.org/en/latest/
  • 5. Rally OVS ● To run OVN scale test, you don’t need OpenStack installed - instead you just need rally installed. ● Main keywords : ○ Deployment = any cloud deployment consisting of all network and compute components. ○ Task = Any CRUD operations on compute, farm and network components like lports, lswitches, lrouters, etc. ○ Farm = collection of sandboxes ○ Sandbox = a chassis (hypervisor/compute node/ovs sandbox)
  • 6. Base counters considered for an availability zone ● 8 lrouters ● 5 lswitches per lrouter ● 250 lports per lswitch ● Total 10k lports ● Total chassis: 1k ● Total BMs that host chassis: 20 ● Total control plane nodes: 3 ● 10 lports (VMs) per chassis ● OS: Ubuntu 16.04 with 4.4 kernel
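The base counters above are mutually consistent, which a few lines of arithmetic can confirm. The sketch below is only an illustration; the constant and function names are made up for this example and are not part of rally-ovs.

```python
# Base counters copied from the slide; all names here are illustrative only.
LROUTERS = 8
LSWITCHES_PER_LROUTER = 5
LPORTS_PER_LSWITCH = 250
CHASSIS = 1000
BARE_METAL_HOSTS = 20
LPORTS_PER_CHASSIS = 10


def total_lports_from_topology():
    """Logical ports implied by the router/switch/port hierarchy."""
    return LROUTERS * LSWITCHES_PER_LROUTER * LPORTS_PER_LSWITCH


def total_lports_from_chassis():
    """Logical ports implied by binding a fixed number of ports per chassis."""
    return CHASSIS * LPORTS_PER_CHASSIS


def sandboxes_per_host():
    """Simulated chassis (sandboxes) that each bare-metal host has to carry."""
    return CHASSIS // BARE_METAL_HOSTS


if __name__ == "__main__":
    print("lports from topology:", total_lports_from_topology())
    print("lports from chassis: ", total_lports_from_chassis())
    print("sandboxes per BM:    ", sandboxes_per_host())
```

Both ways of counting give the same 10k lports, and 1k chassis spread over 20 BMs works out to 50 sandboxes per BM.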
  • 7. OVSDB service models ● OVSDB supports three service models for databases: ○ Standalone ○ Active-Backup ○ Clustered ● The service models provide different compromises among consistency, availability, and partition tolerance. ● They also differ in the number of servers required and in terms of performance. ● The standalone and active-backup database service models share one on-disk format, and clustered databases use a different format [1]. [1] https://github.com/openvswitch/ovs/blob/80c42f7f218fedd5841aa62d7e9774fc1f9e9b32/Documentation/ref/ovsdb.7.rst
  • 8. OVN DBs Active-standby using pacemaker [Diagram: three central nodes (Node1, Node2, Node3), each running NB, Northd and SB, form a pacemaker cluster with active and standby roles; the CMS (Neutron) and the HVs connect through LB VIPs.] Alternatively, this LB VIP can be replaced by: ● Option 2: BGP advertising the VIP on each node ● Option 3: put all 3 nodes on the same rack and use pacemaker to manage the VIP too.
  • 9. Start OVN DBs using pacemaker
    ● Let pacemaker manage the VIP resource.
    ● Using LB VIP:
      ○ set listen_on_master_ip_only=no
      ○ Active node will listen on 0.0.0.0 so that LB VIP IP can connect on respective sb and nb db ports
    pcs resource create ip-192.168.220.108 ocf:heartbeat:IPaddr2 ip=192.168.220.108 op monitor interval=30s
    pcs resource create ovndb_servers ocf:ovn:ovndb-servers manage_northd=yes master_ip=192.168.220.108 nb_master_port=6641 sb_master_port=6640 --master
    pcs resource meta ovndb_servers-master notify=true
    pcs constraint order start ip-192.168.220.108 then promote ovndb_servers-master
    pcs constraint colocation add ip-192.168.220.108 with master ovndb_servers-master
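  • 10. OVN DBs – Raft Clustering [Diagram: three central nodes (Node1, Node2, Node3), each running NB, Northd and SB; the DBs form a cluster with one cluster leader, Northd instances are marked active or standby, and the CMS (Neutron) and the HVs connect through LB VIPs.] Northd uses OVSDB named lock to ensure only one is active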
  • 11. Starting OVN DBs using clustering ● For LB VIP: ○ Set connection table to listen on 0.0.0.0 on all nodes ● For chassis: ○ Point it to either VIP IP e.g. tcp:<vip_ip>:6642 ○ Or all central node IPs e.g. “tcp:192.168.220.101:6642,tcp:192.168.220.102:6642,tcp:192.168.220.103:6642”
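The comma-separated remote shown above is easy to get wrong by hand when the list of central nodes changes. The helper below is a minimal sketch that builds it from a list of IPs; `build_ovn_remote` is a hypothetical name introduced here, and the default port 6642 is taken from the example on the slide. The resulting string is the kind of value a chassis would typically carry in `external_ids:ovn-remote`.

```python
def build_ovn_remote(central_ips, port=6642, proto="tcp"):
    """Join central node addresses into a comma-separated OVSDB remote string.

    Hypothetical helper for illustration; it only formats the string and does
    not contact any database.
    """
    return ",".join(f"{proto}:{ip}:{port}" for ip in central_ips)


if __name__ == "__main__":
    nodes = ["192.168.220.101", "192.168.220.102", "192.168.220.103"]
    # All three clustered SB DB servers, as in the slide's second option.
    print(build_ovn_remote(nodes))
    # A single VIP works the same way (first option); 192.168.220.108 is the
    # VIP address used in the pacemaker example.
    print(build_ovn_remote(["192.168.220.108"]))
```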
  • 12. How to set up scale test env?
    • Create a deployment, which installs the necessary packages/binaries on a BM
      – rally-ovs deployment create --file ovn-multihost.json --name ovn-overlay
    {
        "type": "OvnMultihostEngine",
        "controller": {
            "type": "OvnSandboxControllerEngine",
            "deployment_name": "ovn-new-controller-node",
            "ovs_repo": "https://github.com/openvswitch/ovs.git",
            "ovs_branch": "branch-2.9",
            "ovs_user": "root",
            "net_dev": "eth0",
            "controller_cidr": "192.168.10.10/16",
            "provider": {
                "type": "OvsSandboxProvider",
                "credentials": [
                    { "host": "10.x.x.x", "user": "root" }
                ]
            }
        },
        "nodes": [
            {
                "type": "OvnSandboxFarmEngine",
                "deployment_name": "ovn-farm-node-31",
                "ovs_repo": "https://github.com/openvswitch/ovs.git",
                "ovs_branch": "branch-2.9",
                "ovs_user": "root",
                "provider": {
                    "type": "OvsSandboxProvider",
                    "credentials": [
                        { "host": "10.x.x.x", "user": "root" }
                    ]
                }
            }
        ]
    }
    [Diagram: Rally-ovs, TOR switch, OVN central node and OVN Farm1 ... OVN Farm20, connected over ssh]
  • 13. How to set up scale test env?
    • Running the create_sandbox task is equivalent to converting the BM into a compute node with OVS installed.
    • rally-ovs task start create_sandbox.farm1.json
    {
        "version": 2,
        "title": "Create sandbox",
        "description": "Creates 50 sandboxes on each farm",
        "tags": ["ovn", "sandbox"],
        "subtasks": [
            {
                "title": "Create sandbox on farm 1",
                "group": "ovn",
                "description": "",
                "tags": ["ovn", "sandbox"],
                "run_in_parallel": false,
                "workloads": [
                    {
                        "name": "OvnSandbox.create_sandbox",
                        "args": {
                            "sandbox_create_args": {
                                "farm": "ovn-farm-node-1",
                                "amount": 50,
                                "batch": 10,
                                "start_cidr": "192.230.64.0/16",
                                "net_dev": "eth0",
                                "tag": "TOR1"
                            }
                        },
                        "runner": {
                            "type": "constant",
                            "concurrency": 4,
                            "times": 1,
                            "max_cpu_count": 4
                        },
                        "context": {
                            "ovn_multihost": {
                                "controller": "ovn-new-controller-node"
                            }
                        }
                    }
                ]
            }
        ]
    }
    [Diagram: Rally-ovs, TOR switch, OVN central node and OVN Farm1 hosting HV1, HV2 ... HV50, connected over ssh]
  • 14. How to set up scale test env?
    • Finally, create lrouters, lswitches and lports, and bind the lports to the chassis
    • rally-ovs task start create_routers_bind_ports.json
    {
        "OvnNetwork.create_routers_bind_ports": [
            {
                "runner": {
                    "type": "serial",
                    "times": 1
                },
                "args": {
                    "port_create_args": { "batch": 100 },
                    "router_create_args": { "amount": 8, "batch": 1 },
                    "network_create_args": { "start_cidr": "172.145.1.0/24", "batch": 1 },
                    "networks_per_router": 5,
                    "ports_per_network": 250,
                    "port_bind_args": { "wait_up": true, "wait_sync": "none" }
                },
                "context": {
                    "sandbox": {},
                    "ovn_multihost": {
                        "controller": "ovn-new-controller-node"
                    }
                }
            }
        ]
    }
    [Diagram: Rally-ovs, TOR switch, OVN central node and OVN Farm1 hosting HV1, HV2 ... HV50 with lport1, lport20 ... lport500 bound, connected over ssh]
  • 15.
  • 16. OVN scale test with HA ● OVN scale test by default sets up one active standalone OVN DB. ● Hence, we need to set up an HA cluster separately ○ TODO: add support for deploying an HA cluster to OVN-scale-test to avoid manual setup ● For testing HA, the chassis must point to the HA nodes. This is done by setting the param below in create_sandbox.json to the respective OVN DB HA VIP IP ○ "controller_cidr": "192.168.10.10/16",
  • 17. Scenarios – Active-standby using pacemaker
    Scenario | Impact on control plane | Impact on data plane
    Standby node reboot | No | No
    Active node reboot | Yes (~5+ minutes, as SB DB is running super hot resyncing the data) | Only newly created VMs/lports, till SB DB cools down
    All active and standby nodes reboot | Yes (a few minutes, depending on how soon the new node is up and data sync is finished) | No*
    • *Entire NB DB data got flushed/lost, causing both control and data plane impact
    • *Discussion @ https://mail.openvswitch.org/pipermail/ovs-discuss/2018-August/047161.html
    • *Fix rolled out with help of upstream and no issues reported so far.
    • *Commit ecf44dd3b26904edf480ada1c72a22fadb6b1825
  • 18. OVN DBs HA – Active-backup with pacemaker ● Current status ○ Basic functionality tested ○ Scale testing is ongoing, with findings reported and some major issues fixed with help of upstream. ○ Detailed scale test scenarios reported and also updated on the mail chain to the community https://mail.openvswitch.org/pipermail/ovs-discuss/2018-September/047405.html ○ Upstream folks have been asked for their consent and for improvements
  • 19. Scenarios – Clustered DBs
    Scenario | Impact on control plane | Impact on data plane
    Any active node reboot | No | No
    All active nodes reboot | Yes (a few minutes, depending on how soon the new node is up, along with leader election and data sync completion) | Not fully verified
  • 20. Raft with scale test summary
    ● Current status
      ○ Basic functionality tested.
      ○ Scale testing ongoing; problems found when using rally-ovs (ovn scale test) with around 2k+ lports:
        db="tcp:192.168.220.101:6641,tcp:192.168.220.102:6641,tcp:192.168.220.103:6641" -- wait-until Logical_Switch_Port lport_061655_SKbDHz up=true -- wait-until Logical_Switch_Port lport_061655_zx9LXe up=true -- wait-until Logical_Switch_Port
        Last stderr data: 'ovn-nbctl: tcp:192.168.220.101:6641,tcp:192.168.220.102:6641,tcp:192.168.220.103:6641: database connection failed (End of file)\n'.", "traceback": "Traceback (most recent call last):\n File "/ebay/home/aginwala/rally-repo/rally/rally/task/runner.py", line 66, in _run_scenario_once\n
      ○ Following up with the community to get it fixed soon, with discussions @ https://mail.openvswitch.org/pipermail/ovs-dev/2018-May/347260.html
      ○ Upstream also has a raft torture test among the test cases in the ovs repo for testing locally.
  • 21. Some tunings for both clustered and non-clustered setups
    • Netfilter TCP params on all central nodes:
      – The default tcp_max_syn_backlog and net.core.somaxconn values are too small, so we need to increase them to avoid TCP SYN flood messages in syslog:
        • net.ipv4.tcp_max_syn_backlog = 4096
        • net.core.somaxconn = 4096
    • Pacemaker configurations
      – When the SB DB starts on the new active node, it will be very busy syncing data to all HVs.
      – During this time, pacemaker monitoring can time out. The timeout value for "op monitor" therefore needs to be large enough to avoid an endless restart/failover loop.
      – Hence, configure the pacemaker monitor for resource ovndb-servers: op monitor interval=60s timeout=50s
    • Inactivity probe settings on all chassis
      – Set the inactivity probe to 3 min, so that the central SB DB won't get overloaded handling probes, while chassis can still notice the change if a failover happens
    • Upstart settings on all central nodes when using pacemaker:
      – Disable the ovn-central and openvswitch-switch upstart jobs. Otherwise, when a node reboots, pacemaker thinks there is already an active pid and all the nodes act as standalone nodes. The LB also gets confused and sends traffic to the standby node.
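The two kernel parameters above can be checked on a central node before a test run. The following is a minimal sketch, assuming a Linux host where sysctl keys are exposed under /proc/sys; the function names and the `RECOMMENDED` table are illustrative and only encode the values from this slide.

```python
from pathlib import Path

# Target values taken from the slide; the dictionary name is illustrative.
RECOMMENDED = {
    "net.ipv4.tcp_max_syn_backlog": 4096,
    "net.core.somaxconn": 4096,
}


def sysctl_path(key):
    """Map a sysctl key such as net.core.somaxconn to its /proc/sys file."""
    return Path("/proc/sys") / key.replace(".", "/")


def read_current(keys):
    """Read the current integer value of each key that exists on this host."""
    values = {}
    for key in keys:
        path = sysctl_path(key)
        if path.exists():
            values[key] = int(path.read_text().split()[0])
    return values


def below_recommended(current, recommended=RECOMMENDED):
    """Return {key: (current, recommended)} for every value that is too small."""
    return {
        key: (current[key], want)
        for key, want in recommended.items()
        if key in current and current[key] < want
    }


if __name__ == "__main__":
    for key, (have, want) in below_recommended(read_current(RECOMMENDED)).items():
        print(f"{key} = {have}, expected at least {want}")
```

Keeping the comparison in a pure function makes it possible to feed in values collected from several central nodes rather than only the local host.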
  • 22. Promising outcome and more to go • OVS-vswitchd CPU utilization was running super high on chassis. • Performance improved by making ofproto faster and results are amazing; test completed in 3+ hours vs 8+ hours: • Discussion @ https://mail.openvswitch.org/pipermail/ovs-discuss/2018-February/046140.html • Commit c381bca52f629f3d35f00471dcd10cba1a9a3d99
  • 23. CPU/Mem stats for active-standby
    • Active central node
      Component | CPU | Mem
      OVN NB DB | 0.12 | 97392000
      OVN SB DB | 0.92 | 777028000
      OVN Northd | 6.78 | 825836000
    • Chassis
      Component | CPU | Mem
      OVSDB server | 0.02 | 11672000
      OVS-vSwitchd | 3.75 | 152812000
      OVN-controller | 0.94 | 839188000
    Note:
    • Mem: RES memory, converted to bytes whether it was reported in MB, GB or TB.
    • CPU: total CPU time the task has used since it started. E.g. if the total CPU time for a current ovn-controller process is 6:26.90, we convert it into an integer number of hundredths of a second using the following formula: 6 * 6000 + 26 * 100 + 90 = 38690
    • The value is then converted to a delta (rate per second)
  • 24. Stuck? • Reach out to the OVS community, as it's super interactive and responsive. • For any generic OVS queries/tech discussions, use ovs-discuss@openvswitch.org so that a wide variety of engineers can respond.
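The CPU time conversion in the note can be written as a small function. This is a sketch, assuming the input is in the minutes:seconds.hundredths form used in the example (6:26.90); the function names are hypothetical, and the per-second delta is one plausible reading of "Converted in Delta", not a confirmed description of the authors' tooling.

```python
def cpu_time_to_hundredths(cpu_time):
    """Convert 'M:SS.hh' into an integer count of hundredths of a second.

    Applies the slide's formula: minutes * 6000 + seconds * 100 + hundredths.
    """
    minutes, rest = cpu_time.split(":")
    seconds, hundredths = rest.split(".")
    return int(minutes) * 6000 + int(seconds) * 100 + int(hundredths)


def cpu_delta_per_second(previous, current, interval_seconds):
    """Rate of change between two samples taken interval_seconds apart.

    Assumption: this is how a 'delta (rate per second)' could be derived.
    """
    return (current - previous) / interval_seconds


if __name__ == "__main__":
    sample = cpu_time_to_hundredths("6:26.90")
    print(sample)
    # Two hypothetical samples taken 10 seconds apart.
    later = cpu_time_to_hundredths("6:27.90")
    print(cpu_delta_per_second(sample, later, 10))
```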

Editor's Notes

  1. Architecture Diagram credits: Han Zhou <hzhou8@ebay.com>