Ceph Day Melbourne - Walk Through a Software Defined Everything PoC
1. Walk Through a Software Defined Everything PoC
Tomohiko Kimura
Midokura
Andrew Hatfield
Red Hat
2. To enable data center services through an abstraction of compute,
network, and storage functionality into a pool of resources simply
consumed as a service.
Enable users to simply and expeditiously provision or decommission
an application
Automatically balance applications across data center resources to
optimize efficiency
Elastically scale out data center resources to meet application demands
Provide complete isolation to prevent unintended resource
sharing/intrusion
Be available all the time, tolerant of failure, and maintain data consistency
Be elegantly metered, managed, monitored, and configured
Leverage universal on and off premises computing standards
Objectives
3. Use Case
OpenStack Neutron Production-Grade Plugin
Independent Control and Data Plane
Database/Topology Management
Dynamic Routing Protocol on GWs
Advanced VRF Features
Flow Tracing Troubleshooting
4. A distributed SDN architecture built to scale, with enhanced security
Leading SDS OpenStack solutions
Integration of Software Defined
Technologies with Enterprise OpenStack
Hardware Management and Monitoring
Proof of concept
5. MidoNet Platform
[Platform stack diagram: Any Application; OpenStack, vSphere, Custom Platforms; Midokura Enterprise MidoNet providing Logical L2, Logical L3, Logical Firewall, and Logical Layer 4 Load Balancer; running on KVM, ESXi, LXC, Docker; over Any Network Hardware]
Logical Switching – Layer 2 over Layer 3, decoupled from
the physical network
Logical Routing – Routing between virtual networks
without exiting the software container
Logical Firewall – Distributed Firewall, Kernel Integrated,
High Performance
Logical Layer 4 Load Balancer – Application Load
Balancing in software
MidoNet API – RESTful API for integration into any Cloud
Management Platform
Distributed Networking Services
14. MidoNet Configuration
Most default configurations used
Dedicated links for BGP Gateways
BGP timers made more aggressive
LRO off on Gateways
Port-groups for Gateways
Increased client connection limit for ZooKeeper
15. Replication 3x
64 OSDs spread equally between 8 nodes
SSD to HDD ratio 1:4
4096 Placement Groups, based on the formula:
Total PGs = (OSDs × 100) / replicas, rounded up to the next power of 2
Aggregated 10Gb NICs with VLAN isolation for Public
and Cluster networks
Ceph Configuration
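The PG formula above can be checked in a few lines. A minimal sketch (function names are ours; the numbers are the deck's: 64 OSDs, 3 replicas):

```python
def next_power_of_two(n: int) -> int:
    """Smallest power of 2 greater than or equal to n."""
    p = 1
    while p < n:
        p *= 2
    return p

def placement_groups(osds: int, replicas: int) -> int:
    """Total PGs = (OSDs * 100) / replicas, increased to the next power of 2."""
    return next_power_of_two(osds * 100 // replicas)

print(placement_groups(64, 3))  # 64*100/3 ≈ 2133 → 4096
```

With 64 OSDs and 3x replication the raw value is about 2133, which rounds up to the 4096 PGs quoted on the slide.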
16. MidoNet Lessons Learned
Plan out the underlay (MTU, VLANs, IP) early, as later changes
affect OpenStack services like RabbitMQ
The ZooKeeper client-connection limit matters; otherwise not all
Midolman agents can join the cluster
Gateway failover was impressive!
Horizon doesn’t link enough Network objects
Flow Tracing makes debugging an SDN so much easier!
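The MTU lesson comes down to VXLAN encapsulation overhead. A back-of-envelope sketch (assuming an IPv4 underlay with no VLAN tag; the header sizes are standard VXLAN constants, not from the deck):

```python
# VXLAN wraps the inner Ethernet frame (14 bytes) in a
# VXLAN (8) + UDP (8) + IPv4 (20) header stack.
INNER_ETH, VXLAN, UDP, IPV4 = 14, 8, 8, 20
OVERHEAD = INNER_ETH + VXLAN + UDP + IPV4  # 50 bytes per packet

def required_underlay_mtu(overlay_mtu: int) -> int:
    """Minimum underlay MTU to carry overlay frames without fragmentation."""
    return overlay_mtu + OVERHEAD

print(required_underlay_mtu(1500))  # 1550
```

If VMs expect a standard 1500-byte MTU, the underlay must carry at least 1550 bytes, which is why retrofitting MTU changes later ripples into every service on the overlay.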
17. Use an odd number of database nodes to avoid “split brain”
Restarting RabbitMQ is not a simple task
Power outages happen
VXLAN offload only works on a single UDP port
Manual OpenStack deployment is still a
complex process
OpenStack - Lessons Learned
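The split-brain bullet reflects standard majority-quorum arithmetic. A small illustration (our own, not tied to any specific database): when an even-sized cluster partitions into two equal halves, neither half can form a majority, while an odd-sized cluster always leaves one side with quorum:

```python
def quorum(n: int) -> int:
    """Majority quorum for an n-node cluster."""
    return n // 2 + 1

def survives_worst_partition(n: int) -> bool:
    """After a partition into two near-equal halves, can the larger
    half still form a majority?"""
    larger_half = (n + 1) // 2
    return larger_half >= quorum(n)

print([(n, survives_worst_partition(n)) for n in (2, 3, 4, 5)])
# [(2, False), (3, True), (4, False), (5, True)]
```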
18. Throughput - 5x Read vs Write
Sequential read IOPS at 4K block size – more than 80K (no SSD)
SSD for journaling – doubles performance
No SSD for journaling – put the journal partition on the outer edge of the disk
20Gbps network bottleneck
OVS balance-tcp > Linux bond
Isolate Cluster and Public networks
SSD to HDD ratio and SSD size
Ceph – Lessons Learned
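The read/write throughput gap is consistent with replication and journaling write amplification. A back-of-envelope estimate (our arithmetic, not a benchmark from the deck):

```python
# Assumptions (ours, labeled): 3x replication means each client write
# lands on 3 OSDs; with the journal on the same HDD as the data, each
# OSD writes the bytes twice (journal, then data partition).
replicas = 3
writes_per_osd_no_ssd = 2        # journal write + data write on one spindle
writes_per_osd_ssd_journal = 1   # journal offloaded to SSD

hdd_writes_no_ssd = replicas * writes_per_osd_no_ssd    # 6 HDD writes
hdd_writes_ssd = replicas * writes_per_osd_ssd_journal  # 3 HDD writes

# A read is served from a single replica, so reads can be several
# times faster than writes; moving journals to SSD halves the HDD
# write cost, matching the "double performance" bullet.
print(hdd_writes_no_ssd, hdd_writes_ssd)  # 6 3
```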
20. Rapid deployments and scale up of new applications
Reduced management cost and reduced complexity
Management tools that can support the management of thousands of
physical servers
Ability to scale to thousands of VMs per cloud administrator
Reduced cost per VM
Advanced and agile networking that uses Network Virtualization
Overlays
Tenant isolation over shared infrastructure
Simplified underlying Network infrastructure that uses open standards
L3 routing protocols
Improved IT productivity with reduced time to deploy resources
Business Benefits
Here we depict an overview of our architecture. The key idea is that its components are completely distributed and all active.
Our MidoNet Agent resides on each host in the network in a distributed fashion. The agent programs the kernel to handle flows from its respective VMs.
Gateways:
Several options: L3, L2, and VxLAN (HW VTEP on TOR)
Mention that they are fully distributed: no need for active/standby.
Dynamically add/remove gateways to scale up or down
You could run thousands if needed, but a single gateway easily saturates 10G, and 40G with the Mellanox option
Since the Midolman Agent is identical on the gateway as the host, the same behavior and functionality can be applied to incoming packets, like security groups, distributed load-balancing, etc.
Use any IP network as the underlay – recommend an L3 CLOS for a solid underlay.