Software-Defined Everything is an infrastructure approach that virtualizes compute, network, and storage resources and delivers them as a service. Rather than being handled by the hardware components of the infrastructure, the management and control of the compute, network, and storage infrastructure are automated by intelligent software that runs on the Lenovo x86 platform.
Lenovo and Midokura
OpenStack PoC
Software-Defined Everything
Describes a validated, scale-out Proof of Concept (PoC) implementation of OpenStack
Provides business and technical reasons for Software-Defined Environments
Explains advantages of Lenovo hardware and MidoNet's OpenStack Neutron plugin
Describes the configurations for building an agile cloud with a distributed architecture
Krzysztof (Chris) Janiszewski
Michael Lea
Cynthia Thomas
Susan Wu
Business objectives
The idea of an ephemeral virtual machine (VM) is gaining traction in the enterprise for
application provisioning and decommissioning. Unlike stand-alone offerings that are provided
by public cloud providers, this type of compute service can be all-inclusive: compute, network,
and storage services are abstracted into pooled services and users are presented with
a la carte choices. Moving to this model provides organizations with an elegantly metered,
monitored, and managed style of computing while offering complete isolation and automated
application-level load balancing.
Lenovo and Midokura helped an organization implement this model. The old workflow required three teams working simultaneously, with processes being ping-ponged across the three teams. The post-OpenStack workflow provides cleaner hand-offs and removes the redundant tasks and rework. The streamlined workflow that was provided by implementing OpenStack was the key to delivering the operational efficiency and agility this organization was looking for.
Figure 1 compares the old workflow that the organization used with the new workflow that uses OpenStack.
Figure 1 Comparing the current workflow with the post on-premise OpenStack cloud workflow
The Lenovo-Midokura design offers the following technical and business benefits:
Rapid deployment and scale-up of new applications
Reduced management cost and reduced complexity
Enterprise class machines that are ideal for cloud-based environments
Management tools that can support the management of thousands of physical servers
Ability to scale to thousands of VMs per cloud administrator
Reduced cost per VM
Advanced and agile networking that uses Network Virtualization Overlays
Tenant isolation over shared infrastructure
Reduced networking hardware costs with the use of Lenovo high-performance Ethernet switches
Simplified underlying network infrastructure that uses open-standard L3 routing protocols
Improved IT productivity with reduced time to deploy resources
Lenovo server environment
A Proof-of-Concept Software-Defined Environment was created to measure and validate the
capabilities of a highly available and highly scalable OpenStack deployment with
Software-Defined Networking and Software-Defined Storage. The hardware management
was accomplished through open source xCAT and Confluent Software.
The software and hardware configuration that was used in this paper is described next.
Cloud installation
The Cloud installation included the following components:
Operating system: Red Hat Enterprise Linux 7.1
OpenStack: Red Hat Enterprise Linux OpenStack Platform 6.0 (Juno)
SDN: Midokura Enterprise MidoNet 1.8.5
SDS: Ceph (Giant) – 0.87.1
Hardware Management: xCAT 2.9.1 with Confluent 1.0
Hardware
The following hardware was used:
Four Lenovo ThinkServer® RD550 controller nodes:
– CPU: 2x Intel Xeon E5-2620 v3
– Memory: 4x 16 GB 2Rx4 PC4-17000R (64 GB)
– Media:
• 4x 4 TB HDD, 7200 RPM (RAID-10 Virtualization)
• 2x 32 GB SD Cards (OS)
– RAID: ThinkServer RAID 720IX Adapter with 2 GB supercapacitor upgrade
– Network:
• 2x Emulex CNA OCe14102-UX 10 Gb Dual port (four ports total) (data)
• Mezzanine Quad RJ45 1 Gb Port (management)
Eight Lenovo ThinkServer RD650 Ceph OSD nodes:
– CPU: Intel Xeon E5-2620 v3
– Memory: 4x 16 GB 2Rx4 PC4-17000R (64 GB)
– Media: 2x 200 GB 12 Gb SAS SSD 3.5-inch (Journal)
– 8x 6 TB HDD, 7200 RPM, 3.5-inch, 6 Gb SAS, hot swap (OSD)
– 2x 32 GB SD Cards (OS)
– RAID: ThinkServer RAID 720IX Adapter with 2 GB supercapacitor upgrade
– Network:
• Emulex CNA OCe14102-UX 10 Gb Dual port (data)
• Mezzanine Quad RJ45 1 Gb Port (management)
16 Lenovo ThinkServer RD550 compute nodes:
– CPU: 2x Intel Xeon E5-2650 v3
– Memory: 24x 16 GB 2Rx4 PC4-17000R (384 GB)
– Media: 2x 32 GB SD Cards (OS)
– Network:
• 2x Emulex CNA OCe14102-UX 10 Gb Dual port (data)
• Mezzanine Quad RJ45 1 Gb Port (management)
Network switch solution
The following network solution was used:
10 GbE: 4x Lenovo RackSwitch™ G8264
1 GbE: 1x Lenovo RackSwitch G8052
One of the goals for this environment was to separate management services for better manageability and easy migration to alternative hosts. By using this configuration, the environment was highly available, and a potential disaster recovery process can be handled in a much more efficient fashion.
Also, capacity utilization metering is easier to accomplish. To achieve sufficient isolation, management services were contained in VMs that ran under a KVM hypervisor and were managed by xCAT. These management VMs were customized with minimum resource overhead. The selected software platform was RHEL 7.1 with the latest OpenStack Juno and Ceph Giant enhancements. All the redundant components ran in active/active mode.
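As an illustration of how such a Service VM can be defined, the following xCAT commands sketch the definition and creation of one management VM under KVM; the node name, host, and resource values are placeholders rather than the exact settings that were used in this PoC:

mkdef -t node haproxy1 groups=servicevm mgt=kvm vmhost=controller1 vmcpus=2 vmmemory=4096 vmstorage=dir:///var/lib/libvirt/images vmnics=br0
mkvm haproxy1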
Virtual machines
The following VMs were used:
Four OpenStack Controllers
Four OpenStack Databases (MariaDB with Galera) (3x active / 1x passive)
Four Network State Databases (MidoNet) (3x active / 1x passive)
Four HA Proxies
Three Ceph monitor nodes
xCAT hardware and service VM manager
Figure 2 shows the Proof of Concept environment.
Figure 2 OpenStack PoC environment
Lenovo physical networking design (leaf/spine)
OpenStack deployments depend on a solid physical network infrastructure that can provide
consistent low-latency switching and delivery of data and storage traffic. To meet this need, a
leaf/spine (Clos Network) design was used. By using a leaf/spine design, the infrastructure
can provide massive scale to support over 15,872 servers.
Lenovo 10 GbE and 40 GbE switches were selected for the design because they provide a
reliable, scalable, cost-effective, easy-to-configure, and flexible solution. When a leaf/spine
design is used, there is no need for expensive proprietary switching infrastructure because
the switches need to provide only layer 2 and layer 3 network services.
There is also no need for a large chassis switch because the Clos network can scale out to
thousands of servers that use fixed-form, one or two rack unit switches.
In this design, all servers connect to the leaf nodes, and the spine nodes provide interconnects between all of the leaf nodes. Such a design is fully redundant and can survive the loss of multiple spine nodes. To facilitate connectivity across the fabric, a Layer 3 routing protocol was used, which offered the benefits of traffic load balancing, redundancy, and increased bandwidth within the OpenStack environment. For the routing protocol, Open Shortest Path First (OSPF) was selected because it is an open standard and is supported by most switching equipment.
The use of Virtual Link Aggregation Groups (vLAG), which is a Lenovo Switch feature that
allows for multi-chassis link aggregation, facilitates active-active uplinks of access switches
for server connections. Servers are connected to the vLAG switch pair with the use of Link
Aggregation Control Protocol (LACP). The use of vLAG allows for increased bandwidth to
each server and more network resiliency.
Because MidoNet provides a network overlay that uses VXLAN, there is no need for large
Layer 2 networks. Removing large Layer 2 networks removes the need for large core switches
and the inherent issues of large broadcast domains on physical networks. Also, compute
nodes need only IP connectivity between each other.
MidoNet handles tenant isolation by using VXLAN headers. To further enhance network performance, the design uses Emulex network adapters with hardware VXLAN-offload capabilities.
The Lenovo-Midokura solution is cost-effective, provides a high-speed interconnect, and can be modified depending on the customer's bandwidth requirements. The fabric that connects the leaf and spine can use 10 GbE or 40 GbE, which results in cost savings. It is also possible to use only two spine nodes, but the four-spine design increases reliability and provides more bandwidth.
Figure 3 shows the Leaf/Spine vLAG network topology.
Figure 3 OpenStack network topology that uses Lenovo Leaf/Spine vLAG
OpenStack environment
Red Hat Enterprise Linux OpenStack Platform 6 was selected for this project because of the enterprise-class support that the vendor provides to meet customer demands. However, instead of using the Red Hat OpenStack installation mechanisms (Foreman), the solution was implemented by using a manual process for better customization, automated by using the xCAT tool. In doing so, the solution benefits from all the premium features of Red Hat OpenStack solutions but has more control over each component that is installed and handled by the system.
To prove the scalability factor of OpenStack, four redundant and active VMs were created to
handle the following OpenStack Management Services:
Keystone
Glance
Cinder
Nova
Horizon
Neutron with the MidoNet plugin
Heat
Ceilometer
For the database engine, four instances of MariaDB with Galera were clustered; RabbitMQ was selected as the message broker to meet scalability and redundancy needs.
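As a sketch of that database layer, the Galera-related settings on one of the MariaDB nodes look similar to the following; the node and cluster names are placeholders, and the option set should be verified against the MariaDB Galera release in use:

[galera]
wsrep_on = ON
wsrep_provider = /usr/lib64/galera/libgalera_smm.so
wsrep_cluster_name = openstack_db
wsrep_cluster_address = gcomm://mariadb1,mariadb2,mariadb3,mariadb4
wsrep_node_name = mariadb1
binlog_format = ROW
default_storage_engine = InnoDB
innodb_autoinc_lock_mode = 2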
Each Service VM was placed on separate hardware with the ability to migrate to another KVM host if there was a hardware failure. The load balancing between management services was accomplished with the help of four redundant HAProxy VMs with a Keepalived virtual IP that creates a single point of entry for the user.
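For illustration, one HAProxy front end that balances the Keystone public API across the four controller Service VMs might be defined as follows; the virtual IP and back-end addresses are placeholders, not the values used in the PoC:

listen keystone_public
  bind 192.168.0.100:5000
  balance roundrobin
  option tcpka
  server openstkcont1 192.168.0.11:5000 check inter 2000 rise 2 fall 5
  server openstkcont2 192.168.0.12:5000 check inter 2000 rise 2 fall 5
  server openstkcont3 192.168.0.13:5000 check inter 2000 rise 2 fall 5
  server openstkcont4 192.168.0.14:5000 check inter 2000 rise 2 fall 5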
For MidoNet Network State Database (NSDB) redundancy, and to maintain consistency with the rest of the environment, four instances of the Apache ZooKeeper and Cassandra databases were created. For reference, it is recommended to use an odd number of ZooKeeper/Cassandra nodes in the environment for quorum.
To avoid a split-brain issue, the database systems were run with an odd number of active nodes, with the remaining node in a passive state.
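For reference, the ZooKeeper ensemble membership is declared in /etc/zookeeper/zoo.cfg on each NSDB node along the following lines (host names are placeholders); a four-member ensemble still needs three servers for quorum, which matches the three-active/one-passive layout described above:

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
server.1=nsd1:2888:3888
server.2=nsd2:2888:3888
server.3=nsd3:2888:3888
server.4=nsd4:2888:3888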
The memcached daemon was used to address the known nova-consoleauth service scale limitation and to handle tokens when multiple users attempt to access VNC services through Horizon.
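As a sketch of this configuration, the following nova.conf setting points the console authorization token cache at the memcached instances on the controller Service VMs (the addresses are placeholders):

[DEFAULT]
memcached_servers = 192.168.0.11:11211,192.168.0.12:11211,192.168.0.13:11211,192.168.0.14:11211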
Extensive tests of multiple failing components were performed; entire nodes and even the entire cluster were brought down to verify high availability, which confirmed that disaster recovery can be accomplished in a relatively quick fashion.
This OpenStack solution was built to be fully redundant, with no single point of failure, and is ready to manage large amounts of compute resources. The manual installation approach with xCAT automation allows for rapid and nondisruptive scaling of the environment.
The separation of the management services in the VMs provides the ability to better monitor
and capacity-plan the infrastructure and easily move resources on demand to dedicated
hardware.
This OpenStack environment meets all the demands of production-grade, highly scalable,
swiftly deployable private clouds.
Automating OpenStack deployment and hardware management
To better manage the cloud infrastructure from a hardware and software perspective, the open source project xCAT¹ was used with the addition of Confluent.
xCAT offers complete management for HPC clusters, render farms, Grids, web farms, online
gaming infrastructure, clouds, data centers, and complex infrastructure setups. It is agile,
extensible, and based on years of system administration best practices and experience. It is a
perfect fit for custom OpenStack deployments, including this reference architecture. It also
allows for bare-metal deployment and handles post-operating system installation, automation,
and hardware and VM monitoring mechanisms.
xCAT manages infrastructure by setting up and interacting with the IPMI/BMC components at the hardware level. It also uses Serial-over-LAN for each machine to access consoles without the need for a functional network layer.
¹ For more information, see this website: http://sourceforge.net/p/xcat/wiki/Main_Page
For the Service VM layer of the solution, xCAT connects to the virsh interface and SOL, so that managing the VM infrastructure is as easy as managing hardware. Moreover, xCAT can read sensor information and gather inventories directly from the hardware, which allows hardware failures to be identified quickly and easily.
The ability to push firmware updates in an automated fashion by using a built-in update tool helps maintain hardware features and fixes. These features make the management of hardware and software much simpler by creating a one-stop shop approach for any management tasks.
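For example, the following xCAT commands illustrate typical day-to-day tasks in this kind of environment; the node and group names shown are illustrative:

rpower compute stat                      # query power state of the compute group
rinv cpt1 all                            # read the hardware inventory through the BMC
rvitals cpt1 all                         # read sensor data (temperatures, fans, voltages)
rcons cpt1                               # open a Serial-over-LAN console
nodeset cpt1 osimage=rhels7.1-x86_64-install-compute   # stage an operating system deployment
psh ceph "ceph -s"                       # run a command in parallel across the Ceph group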
Ultimately, xCAT was set up to manage the following tasks:
Customize and deploy operating system images to all required types of nodes (Ceph, Service VM Controller, Compute, OpenStack, MariaDB, Cassandra/ZooKeeper, and HAProxy)
Customize and deploy postinstallation scripts that define the software infrastructure
Identify hardware issues with hardware monitoring
Identify software issues with a parallel shell mechanism
Update firmware for all hardware components
Provide Serial-Over-LAN connectivity for bare-metal operating system and VM operating
system
Automate expansion of the cloud or node-replacement
Provide DHCP/DNS/NAT/FTP/HTTP services to the infrastructure nodes
Provide local package repository (rpm) for all required software, including RHEL 7.1,
Ceph, MidoNet, and Epel
Provide a simple Web User Interface (Confluent) for quick overview of hardware health
and SOL console access
Software-Defined Storage: Ceph
Cloud and enterprise organizations’ data needs grow exponentially and the classic enterprise
storage solutions do not suffice to meet the demand in a cost-effective manner. Moreover, the
refresh cycle of the legacy storage hardware lags behind x86 commodity hardware. The
viable answer to this problem is the emerging Software-Defined Storage (SDS) approach.
One of the leading SDS solutions, Ceph, provides scale-out software that runs on commodity hardware, takes advantage of the latest performance hardware, and can handle exabytes of storage. Ceph is highly reliable, self-healing, easy to manage, and open source.
The PoC environment uses eight dedicated Ceph nodes with over 300 TB of raw storage. Each storage node is populated with 8x 6 TB HDDs for OSDs and 2x 200 GB supporting SSDs for journaling. Ceph can be configured with spindle drives only; however, because the journaling devices perform random reads and writes, it is recommended to use SSDs to decrease access time and read latency while accelerating throughput. Performance tests on configurations with SSD journaling enabled and disabled showed an increase in IOPS of more than 50% with the SSDs enabled for journaling.
To save disk cycles from operating system activities, RHEL 7.1 was loaded onto dual on-board SD cards (a Lenovo ThinkServer option) on all nodes, including the Ceph OSD nodes. A dual-card, USB 3.0-based reader with class 10 SD cards provided enough local storage and speed to load the operating system and all necessary components without sacrificing performance. Software RAID level 1 (mirroring) was used for local redundancy.
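As an illustration of that mirroring, a software RAID 1 device over the two SD cards can be created with mdadm along the following lines; the device names are placeholders, and in practice this layout is applied by the automated deployment rather than by hand:

mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdx /dev/sdy
mkfs.xfs /dev/md0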
Ceph storage availability to compute hosts depends on the Ethernet network; therefore, maximum throughput and minimum latency must be ensured on the internal, underlying network infrastructure. For best results, dual 10 GbE Emulex links were aggregated by using OVS LACP with the balance-tcp hashing algorithm. Lenovo's vLAG functionality on the top-of-rack (TOR) switches allows for full 20 Gb connectivity between the Ceph nodes for storage rebalancing.
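For illustration, the OVS bond on a Ceph node can be created along these lines; the bridge, bond, and interface names are placeholders for this sketch:

ovs-vsctl add-br br-bond0
ovs-vsctl add-bond br-bond0 bond0 p1p1 p1p2 lacp=active
ovs-vsctl set port bond0 bond_mode=balance-tcp
ovs-vsctl set port bond0 other_config:lacp-time=fast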
All the compute nodes and OpenStack Controller nodes used Linux bond mode 4 (LACP). The aggregated links were VLAN-trunked for client and cluster network access. Quick performance tests showed Ceph's ability to use the aggregated links to the full extent, especially with read operations to multiple hosts.
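A corresponding RHEL 7 bonding definition for a compute or controller node might look like the following sketch; the interface names, VLAN ID, and address are placeholders:

# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
TYPE=Bond
BONDING_MASTER=yes
BONDING_OPTS="mode=802.3ad miimon=100 lacp_rate=fast xmit_hash_policy=layer3+4"
BOOTPROTO=none
ONBOOT=yes

# /etc/sysconfig/network-scripts/ifcfg-bond0.100 (client network VLAN)
DEVICE=bond0.100
VLAN=yes
BOOTPROTO=none
IPADDR=192.168.0.31
NETMASK=255.255.255.0
ONBOOT=yes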
Ceph's global configuration is shown in Figure 5.
Figure 5 Ceph reference deployment
[global]
fsid = cce8c4ea-2efd-408f-845e-87707d26b99a
mon_initial_members = cephmon1, cephmon2, cephmon3
mon_host = 192.168.0.20,192.168.0.21,192.168.0.22
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true
osd_pool_default_size = 3
osd_pool_default_pg_num = 4096
osd_pool_default_pgp_num = 4096
public_network = 192.168.0.0/24
cluster_network = 192.168.1.0/24
[client]
rbd cache = true
Ceph storage was used by multiple OpenStack services, including Glance for storing images, Cinder for block storage and volume creation, and Nova for creating VMs directly on Ceph volumes.
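For illustration, the Ceph pools and client keys that back these services are typically created with commands similar to the following; the pool names and capability strings follow the upstream Ceph-with-OpenStack guidance and might differ from the exact values that were used in this PoC:

ceph osd pool create images 4096
ceph osd pool create volumes 4096
ceph osd pool create vms 4096
ceph auth get-or-create client.glance mon 'allow r' osd 'allow rwx pool=images'
ceph auth get-or-create client.cinder mon 'allow r' osd 'allow rwx pool=volumes, allow rwx pool=vms, allow rx pool=images'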
Software-Defined Networking: MidoNet
MidoNet is an open source software solution that enables agile cloud networking through Network Virtualization Overlays (NVO). As a pure software solution, MidoNet supports DevOps and continuous integration practices by providing network agility through its distributed architecture. When paired with OpenStack as a Neutron plugin, MidoNet allows tenants to create logical topologies with virtual routers, networks, security groups, NAT, and load balancing, all of which are created dynamically and implemented with tenant isolation over shared infrastructure.
MidoNet provides the following networking functions:
Fully distributed architecture with no single points of failure
Virtual L2 distributed isolation and switching with none of the limitations of conventional
VLANs
Virtual L3 distributed routing
Distributed Load Balancing and Firewall services
Stateful and stateless NAT
Access Control Lists (ACLs)
RESTful API
Full Tenant isolation
Monitoring of networking services
VXLAN and GRE support: Tunnel zones and Gateways
Zero-delay NAT connection tracking
MidoNet features a Neutron plugin for OpenStack. MidoNet agents run at the edge of the network on compute and gateway hosts. These datapath hosts (where the MidoNet agents are installed) require only IP connectivity between them and must permit VXLAN or GRE tunnels to pass VM data traffic, which carries maximum transmission unit (MTU) considerations. Configuration management is provided via a RESTful API server, which can typically be co-located with the neutron-server on the OpenStack controllers. The API is stateless and can be accessed via the MidoNet CLI client or the MidoNet Manager GUI.
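For example, joining the datapath hosts to a VXLAN tunnel zone is done from the MidoNet CLI along the following lines; the zone name, host IDs, and tunnel addresses shown here are placeholders:

midonet> tunnel-zone create name tz-vxlan type vxlan
tzone0
midonet> tunnel-zone tzone0 add member host host0 address 172.16.0.11
midonet> tunnel-zone tzone0 add member host host1 address 172.16.0.12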
Logical topologies and virtual network devices that are created via the API are stored in the
Network State Database (NSDB). The NSDB consists of ZooKeeper and Cassandra for
logical topology storage. These services can be co-located and deployed in quorum for
resiliency.
For more information about the MidoNet Network Models, see the “Overview” blogs that are
available at this website:
http://blog.midonet.org
Figure 6 shows the MidoNet Reference Architecture.
Figure 6 MidoNet Reference Architecture
MidoNet achieves L2-L4 network services in a single virtual hop at the edge, as traffic enters the OpenStack cloud through the gateway nodes or from VMs on compute hosts. There is no reliance on a particular service appliance or service node for a particular network function, which removes bottlenecks in the network and allows the network to scale. This architecture is a great advantage for production-ready clouds over alternative solutions.
Table 1 shows a comparison between the MidoNet and Open vSwitch Neutron plugin.
Table 1 MidoNet and OVS Neutron plugin comparison
Open Source: MidoNet: Yes. OVS: Yes.
Hypervisors Supported: MidoNet: KVM, ESXi, Xen, Hyper-V (planned). OVS: KVM, Xen.
Containers: MidoNet: Docker. OVS: Docker.
Orchestration Tools: MidoNet: OpenStack, oVirt, RHEV, Docker, Custom, vSphere, Mesos (planned). OVS: OpenStack, oVirt, openQRM, OpenNebula.
L2 BUM traffic: MidoNet: Yes. OVS: By default, broadcasts are sent to every host, even hosts that do not use the corresponding network; sending to a partial mesh over unicast tunnels requires enabling the extra l2population mechanism driver.
Distributed Layer 3 Gateway: MidoNet: Scales to hundreds of nodes with no limitations when enabled. OVS: The default deployment has an intrinsic architectural issue with a single point of failure (the Neutron network node) for routing and higher-layer network services and does not scale well; early-stage DVR requires installing an extra agent (L3 agent) on compute hosts and still relies on the network node for non-distributed SNAT; currently, DVR cannot be combined with L3 HA/VRRP.
SNAT: MidoNet: Yes. OVS: Not distributed; requires iptables (poor scale).
VLAN Gateway: MidoNet: Yes. OVS: Yes.
VXLAN Gateway: MidoNet: Yes. OVS: L3 HA requires keepalived, which uses VRRP internally (active-standby implications); DVR requires external connectivity on each host (security implications).
HW VTEP L2 Gateway: MidoNet: Yes. OVS: Yes.
Distributed Layer 4 Load Balancer: MidoNet: Yes. OVS: Relies on another driver (HAProxy).
Supports spanning multiple environments: MidoNet: Yes. OVS: No.
GUI-based configuration: MidoNet: Yes. OVS: No.
GUI-based monitoring: MidoNet: Yes. OVS: No.
GUI-based flow tracing: MidoNet: Yes. OVS: No.
Pricing: MidoNet: OSS is free; MEM is $1899 USD per host (any number of sockets), including 24x7 standard support. OVS: Free, with no support option.
In the proof of concept lab, highly capable servers were used, so some MidoNet components were co-located. On the four OpenStack Controller nodes, the MidoNet agents were installed on the bare-metal operating systems to provide Gateway node functionality by terminating VXLAN tunnels from the OpenStack environment for external access via the Border Gateway Protocol (BGP) and Equal-Cost Multi-Path routing (ECMP).
Next, a Service VM on each of the four Controller nodes was created for the Network State Database (consisting of ZooKeeper and Cassandra). These projects require deployment in a quorum (3, 5, 7, …) to sustain node failures. In this PoC, the failure tolerance is equivalent to that achieved by three ZooKeeper/Cassandra nodes in a cluster.
Each of the OpenStack Controller Service VMs was created to serve the main OpenStack
controller functions. Within these Service VMs, the MidoNet API (stateless) server and
MidoNet Manager files (web files to serve up the client-side application) were installed.
MidoNet Manager is part of the Midokura Enterprise MidoNet (MEM) subscription (bundled
with support) and provides a GUI for configuring and maintaining virtual networks in an
OpenStack + MidoNet environment.
Other network-related packages that were installed on the OpenStack Controller Service VMs
include the neutron-server and metadata-agent. Because of the metadata-agent proxy’s
dependency on DHCP namespaces, the dhcp-agent was also installed on the OpenStack
Controller Service VM despite MidoNet's distributed DHCP service. These specific services
were load-balanced by using HAProxy.
Finally, the MidoNet agent is installed on the Compute nodes to provide VMs with virtual networking services. Because the MidoNet agent uses local compute power to make all L2-L4 networking decisions, MidoNet provides the ability to scale. As the number of Compute nodes grows, so does the networking compute power.
Configurations
The Gateway nodes provide external connectivity for the OpenStack + MidoNet cloud. BGP
was implemented between the Gateway nodes and the Lenovo Top of Rack switches for its
dynamic routing capabilities.
To exhibit fast failover, the BGP timers were shortened. These settings can easily be adjusted
based on the needs of the users. In this lab, the parameters that are shown in Figure 7 were
modified in the /etc/midolman/midolman.conf file.
Figure 7 BGP parameters in midolman.conf on MidoNet Gateway Nodes
# bgpd
bgp_connect_retry=10
bgp_holdtime=15
bgp_keepalive=5
These parameters provide a maximum of 15 seconds for failover if the BGP peering session goes down on a Gateway node.
The gateways must have Large Receive Offload (LRO) turned off to ensure MidoNet delivers
packets that are not larger than the MTU of the destination VM. For example, the command
that is shown in Figure 8 turns off LRO for an uplink interface of a gateway.
Figure 8 Disabling LRO on MidoNet Gateway Nodes
# ethtool -K p2p1 lro off
Also, to share state, port groups were created for gateway uplinks. Stateful port-groups allow
the state of a connection to be shared such that gateways can track connections with
asymmetric traffic flows. Figure 9 shows the commands that are used to configure stateful
port-groups.
Figure 9 Configuring stateful port-groups
midonet-cli> port-group create name SPG stateful true
pgroup0
midonet> port-group pgroup0 add member port router0:port0
port-group pgroup0 port router0:port0
midonet> port-group pgroup0 add member port router0:port1
port-group pgroup0 port router0:port1
The default number of client connections for ZooKeeper was changed on the NSDB nodes.
This change is made in the /etc/zookeeper/zoo.cfg file by using the line that is shown in
Figure 10.
Figure 10 Increasing the number of client connections for ZooKeeper instances
maxClientCnxns=500
This configuration change allows the number of MidoNet agents that are connecting to
ZooKeeper to go beyond the default limit.
Logical routers, rules, and chains were also created to provide multi-VRF functionality for upstream isolation of traffic.
Operational tools
MidoNet Manager is a network management GUI that provides an interface for operating
networks in an OpenStack + MidoNet cloud. It allows the configuration of BGP for gateway
functionality and monitoring of all virtual devices through traffic flow graphs.
When VXLAN overlays are used with OpenStack, operating and monitoring tools become increasingly relevant as you move from proof of concept into production. Traditional monitoring and troubleshooting methods (such as RSPAN) capture packets on physical switches but give no context for a traffic flow.
MidoNet Manager presents flow tracing tools in a GUI to give OpenStack + MidoNet cloud
operators the ability to identify specific tenant traffic and trace their flow through a logical
topology. The flow tracing gives insight into each virtual network device that is traversed,
every security group policy that is applied, and the final fate of the packet. MidoNet Manager provides insights for NetOps and DevOps teams for the operation and monitoring of OpenStack + MidoNet environments that are built for enterprise private clouds.
An example of the initial stage of flow tracing in MidoNet Manager is highlighted in the red box
in Figure 11.
Figure 11 MidoNet Manager flow tracing
Professional services
The Lenovo Enterprise Solution Services team helps clients worldwide with deployment of
Lenovo System x® and ThinkServer solutions and technologies. The Enterprise Solution
Services team can design and deliver the OpenStack Cloud solution that is described in this
document and new designs in Software Defined Everything, big data and analytics, HPC,
Virtualization, or Converged Infrastructure. Lenovo Enterprise Solution Services also
provides training to a staff at site to get up to speed with performing health check services for
existing environments.
We feature the following offerings:
Cloud: Our cloud experts help design complex IaaS, PaaS, or SaaS cloud solutions with our Cloud Design Workshop. We specialize in OpenStack and VMware-based private and hybrid cloud design and implementation services.
Software-Defined Storage: We provide expertise with the design and implementation of servers for software-defined storage environments. Our consultants can assist with implementing Ceph, Quobyte, and General Parallel File System (GPFS), with GPFS storage server installation and configuration of key operating system and software components, or with other software-defined storage technologies.
Virtualization: Get assistance with VMware vSphere or Linux KVM through our design, implementation, and health check services.
Converged Infrastructure: Learn about Flex System™ virtualization, blade server to Flex System migration assessments, VMware-based private cloud, and Flex System Manager quickstart.
High-Performance Computing (HPC): Our team helps you get the most out of your
System x or ThinkServer with HPC intelligent cluster implementation services, health
check services, and state-of-the-art cloud services for HPC.
For more information, contact Lenovo Enterprise Solution Services at: x86svcs@lenovo.com
The Midokura team provides professional services and training to enable customers with
OpenStack and MidoNet. Midokura’s expertise is in distributed systems. The Midokura team
has real-world experience building distributed systems for large e-commerce sites, such as
Amazon and Google.
Midokura Professional Services helps customers move from architectural design to implementation in production, including MidoNet training, in only a couple of weeks. Midokura Professional Services is not only academic; the solutions are practical and come from hands-on deployments, operational experience, and direct contributions to the OpenStack Neutron project.
For more information, contact Midokura at: info@midokura.com
About the authors
Krzysztof (Chris) Janiszewski is a member of the Enterprise Solution Services team at Lenovo. His main background is in designing, developing, and administering multiplatform, clustered, software-defined, and cloud environments. Chris previously led System Test efforts for the IBM OpenStack cloud-based solution for x86, IBM System z, and IBM Power platforms.
Michael Lea, CCIE #11662, CISSP, MBA, is an Enterprise System Engineer for Lenovo’s
Enterprise Business Group. He has over 18 years of experience in the field designing customers' networks and data centers. Over those years, Michael has worked with service providers, managed service providers (MSPs), and large enterprises, delivering cost-effective solutions that include networking, data center, and security assurance. By always looking at
technical and business requirements, Michael makes certain that the proper technologies are
used to help clients meet their business objectives. Previous roles held by Michael include
Consulting Systems Engineer with Cisco Systems and IBM.
Cynthia Thomas is a Systems Engineer at Midokura. Her background in networking spans Data Center, Telecommunications, and Campus/Enterprise solutions. Cynthia has earned a number of professional certifications, including the Alcatel-Lucent Network Routing Specialist II (NRS II) written certification exams, Brocade Certified Ethernet Fabric Professional (BCEFP), Brocade Certified IP Network Professional (BCNP), and VMware Technical Sales Professional (VTSP) 5.
Susan Wu is the Director of Technical Marketing at Midokura. Susan previously held product positions at Oracle/Sun, Citrix, AMD, and Docker. She is a frequent speaker at industry conferences, such as Interzone, Cloudcon/Data360, and Data Storage Innovation.
Thanks to the following people for their contribution to this project:
Srihari Angaluri, Lenovo
Michael Ford, Midokura
Adam Johnson, Midokura
David Watts, Lenovo Press
This document REDP-5233-00 was created or updated on June 25, 2015.
Send us your comments in one of the following ways:
Use the online Contact us review Redbooks form found at:
ibm.com/redbooks
Send your comments in an email to:
redbooks@us.ibm.com
Trademarks
Lenovo, the Lenovo logo, and For Those Who Do are trademarks or registered trademarks of Lenovo in the
United States, other countries, or both. These and other Lenovo trademarked terms are marked on their first
occurrence in this information with the appropriate symbol (® or ™), indicating US registered or common law
trademarks owned by Lenovo at the time this information was published. Such trademarks may also be
registered or common law trademarks in other countries. A current list of Lenovo trademarks is available on
the Web at http://www.lenovo.com/legal/copytrade.html.
The following terms are trademarks of Lenovo in the United States, other countries, or both:
Flex System™
Lenovo®
RackSwitch™
Lenovo(logo)®
System x®
ThinkServer®
The following terms are trademarks of other companies:
Intel, Intel Xeon, Intel logo, Intel Inside logo, and Intel Centrino logo are trademarks or registered trademarks
of Intel Corporation or its subsidiaries in the United States and other countries.
Linux is a trademark of Linus Torvalds in the United States, other countries, or both.
Other company, product, or service names may be trademarks or service marks of others.