EYWA: Elastic load-balancing & high-availabilitY Wired virtual network Architecture
Wookjae Jeong
wjjung11@gmail.com
Jungin Jung
call518@gmail.com
ABSTRACT
Infrastructure as a Service (IaaS) for cloud environments provides
compute processing, storage, networks, and other fundamental
computing resources. To support multi-tenant cloud environments,
IaaS exploits the various advantages of virtualization, but conventional
virtual (overlay) network architectures for IaaS have been a direct
cause of scalability limitations in multi-tenant cloud environments.
In other words, IaaS virtual networks suffer from problems of high
availability, load balancing, and the like. To solve these problems,
we present EYWA, a virtual
network architecture that scales to support huge data centers with
high availability, load balancing and large layer-2 semantics. The
design of EYWA overcomes these limitations by accommodating
(1) a large number of tenants (about 2^24 = 16,777,216), giving each
a virtual LAN, i.e., a logically isolated network with its own IP
range, from the cloud service provider's view, and by providing
(2) a public network service per tenant without throughput bottlenecks
or single points of failure (SPOF) on Source and Destination
Network Address Translation (SNAT/DNAT), and (3) a single
large IP subnet per tenant using large layer-2 semantics, from the
consumer's view. EYWA combines existing techniques into a
decentralized, scale-out control and data plane. The only component
of EYWA is an agent in every hypervisor host that can control
packets; together, the agents act as a distributed controller. As a result,
EYWA can be deployed in any multi-tenant cloud environment
today. We have implemented a proof of concept (POC) and evaluated
the advantages of the EYWA design through measurement, analysis,
and experiments in our lab.
Categories and Subject Descriptors
C.2.1 [Network Architecture and Design]: Distributed networks,
Network topology
General Terms
Design, Performance, Reliability
Keywords
Cloud, IaaS, Virtual network, Overlay network, Data center network, OpenStack, Load balancing, Load sharing
1. INTRODUCTION
After cloud computing emerged, various service providers
[20, 22, 24, 25, 26, 27] and solutions [16, 17, 18, 19, 21, 23] for
IaaS appeared, and they have steadily improved the technical
limitations of IaaS. The physical (underlay) network architectures
for data centers in cloud environments have been researched
[1, 2, 3, 8, 9] and standardized [13, 14] to solve the problems of
the conventional three-tier model and to find optimal network
models. In contrast, the virtual network architectures for cloud
environments still have various problems and progress has been
slow; unfortunately, they have not yet produced outcomes
comparable to those in the field of physical networks.
Thanks to increased computing power and decreasing device
prices, cloud computing environments can now provide more
Virtual Machines (VMs) more quickly and at lower cost. However,
fundamental solutions for connecting a large number of VMs to
the network without throughput bottlenecks and SPOFs, while
ensuring security and traffic isolation, have not yet appeared.
2. BACKGROUND
In this section, we first explain the models and dominant issues
of today's virtual network architectures. Typically, when a user
creates VMs in a multi-tenant cloud environment, the corresponding
tenant is assigned a virtual LAN for its private network, a Virtual
Router (VR) and a public IP address for its public network, and each
VM is assigned a private IP address. Additionally, the VR can
also be assigned public IP addresses for individual VMs. In detail,
a VM's private IP address is assigned directly to the VM's network
interface, but public IP addresses for the tenant or its VMs are
assigned indirectly to the VR (implemented as a VM instance or a
Linux Network Namespace), and VMs communicate with external
networks through the VR. That is, a VM's outbound traffic flows
through the SNAT of the VR, which is set as the default gateway
inside the VM, and inbound traffic flows through the DNAT of the
VR. Figure 1 [16, 18] and Figure 2 [17] show how public traffic
flows through the VRs in multi-tenant cloud environments using
the shared network service model and the hypervisor network
service model, respectively. VMs inside a virtual LAN communicate
with each other directly using layer-2 protocols or indirectly
through a centralized shared network service host.
This simply moves the conventional physical network architecture
into the virtual environment. As a result, the virtual networks
inherit the same scalability limitations, such as high availability
and load balancing, as the physical networks.
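To make the VR's SNAT/DNAT role described above concrete, the following minimal Python sketch (our illustration, not any platform's implementation; the addresses and port range are made up) models the address rewrites a VR applies to a VM's outbound and inbound traffic:

```python
# Minimal model of a VR's NAT behavior (illustrative only). A real VR,
# e.g. a Linux Network Namespace using iptables/netfilter, additionally
# tracks per-connection state, timeouts, and port exhaustion.

VR_PUBLIC_IP = "203.0.113.10"                          # tenant's public IP on the VR
DNAT_MAP = {("203.0.113.10", 80): ("10.0.0.5", 8080)}  # public -> private VM

snat_sessions: dict[tuple, tuple] = {}  # reverse map for return traffic
next_port = 30000                       # assumed NAT port range start

def snat(src_ip: str, src_port: int) -> tuple[str, int]:
    """Outbound: rewrite the VM's private source to the VR's public IP."""
    global next_port
    public = (VR_PUBLIC_IP, next_port)
    snat_sessions[public] = (src_ip, src_port)
    next_port += 1
    return public

def dnat(dst_ip: str, dst_port: int) -> tuple[str, int]:
    """Inbound: rewrite a public destination to the mapped private VM."""
    return DNAT_MAP.get((dst_ip, dst_port), (dst_ip, dst_port))

# Example: VM 10.0.0.5 opens an outbound connection...
print(snat("10.0.0.5", 51000))   # -> ('203.0.113.10', 30000)
# ...and an external client reaches the VM's web service:
print(dnat("203.0.113.10", 80))  # -> ('10.0.0.5', 8080)
```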
Public Network has the following problems because VMs
communicate with external networks through SNAT/DNAT of a
single VR.
• High Availability: Due to a shared network service host (e.g.,
the OpenStack/Nova Network/Network Node), a SPOF exists, as
illustrated in Figure 1. In the worst case, all the VMs in the
cluster (not just one tenant) cannot communicate with external
networks if the shared network service host fails. In comparison,
the hypervisor network service model limits the failure domain
to a tenant, as illustrated in Figure 2: in the worst case, only the
VMs in the corresponding tenant (not the whole cluster) cannot
communicate with external networks if the hypervisor host
running the VR fails. To solve these problems, high availability
structures using protocols such as VRRP [12] have been proposed,
but an Active-Standby or Active-Active structure with one or two
more VRs or network service hosts cannot be a perfect solution
for high availability.
• Load Sharing and Balancing: A single VR or shared network
service host becomes a throughput bottleneck for SNAT/DNAT
and layer-4 load balancing. Eventually, due to the performance
and bandwidth limitations of the single VR or host, the number
of VMs must be limited even if a large number of VMs could
otherwise be provided in a single virtual LAN. One facet of this
problem is layer-4 load balancing to support scale-out public
services such as web services. To improve load balancing, some
IaaS services [27] provide additional physical load balancers
(scale-up) instead of virtual instances, and others provide an
additional load balancing service such as Amazon Web Services
(AWS) Elastic Load Balancing (ELB) [30], but none of them can
provide unlimited scalability.
• Traffic Engineering: In the hypervisor network service model,
most VMs' public traffic (the blue lines in Figure 2) consumes
additional network bandwidth and suffers increased latency
traversing a single remote VR; only the traffic of the few VMs in
the same hypervisor host as the VR (the green lines) is exempt.
In the shared network service model, all public traffic consumes
additional bandwidth and suffers increased latency traversing a
remote network service host, as illustrated in Figure 1.
Private Network is provided because a cloud tenant demands
that its VMs be in a different layer-2 subnet or layer-3 network
from other tenants' in multi-tenant cloud environments. A virtual
LAN per tenant has mainly been provided using VLAN (802.1Q) [11].
• VLAN (802.1Q) limit: A cloud data center can quickly exceed
the VLAN ID limit of 4,094 with enough top-of-rack switches
connected to multiple physical servers hosting VMs, each
belonging to at least one virtual LAN. This hinders tenant
expansion, imposes layer-2 restrictions on VM communication,
and limits VM mobility.
• A single large IP subnet (large layer-2 network): In order to
take full advantage of layer-2 communication, a large number of
VMs should be deployable in a single virtual LAN, but layer-2
issues such as Address Resolution Protocol (ARP) broadcasts,
MAC flooding, and Spanning Tree Protocol (STP) must be
resolved first.
3. EYWA
We propose EYWA, the architecture illustrated in Figure 3,
which accommodates a large number of tenants and provides both
a public network per tenant without throughput bottlenecks or
SPOFs and a single large IP subnet (private network) per tenant,
eliminating all of the issues described in Section 2.
Before describing the design, note that a VR in EYWA
environments is a single VM instance integrating SNAT/DNAT
and a layer-4 load balancer. The design is based on the hypervisor
network service model and assumes DNS-based load balancing,
such as AWS Route 53 [31] as used by AWS ELB [30], in addition
to the VRs' layer-4 load balancers. Each VM may or may not have
a local VR according to policy or VR failure, and EYWA therefore
defines two modes, as illustrated in Figure 3. One is Normal Mode,
in which a VR exists with a VM on the same internal Virtual
Switch (VSi, a virtual software switch), that is, the default gateway
VR for a VM resides in the same hypervisor host as the VM. The
other is Orphan Mode, in which no VR exists with a VM on the
same VSi, that is, the default gateway VR for a VM resides in a
different hypervisor host.
Figure 1: Traffic Flows in a shared network service host
Figure 2: Traffic Flows in hypervisor network service hosts
3.1 Public Network
Each VM in a tenant can have, as its default gateway, one of a
large number of physically distinct VR instances that share the
same private IP address (e.g., 10.0.0.1), as illustrated in Figure 3.
• High Availability: An Active-Active structure using multiple
VRs per tenant eliminates the SPOF and provides high availability.
The VRs share the same private IP address and can be scaled out
to an unlimited number of instances.
• Load Sharing and Balancing: VMs' default gateways are
distributed across the unlimited VRs according to policies such as
latency and performance, so the throughput bottleneck for
outbound traffic disappears. Inbound traffic is likewise distributed
by the software load balancers inside the unlimited VRs and by
external DNS-based load balancing (see the sketch after this list).
• Traffic Engineering: VMs that have a local VR save network
bandwidth and latency, because whenever a local VR exists the
VMs use it as their default gateway.
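As a sketch of the DNS-based half of this distribution (the resolver logic and addresses below are hypothetical; AWS Route 53 [31] provides a managed equivalent), a tenant's public DNS name can resolve to the public IPs of all of its VRs in rotating order, while each VR's own layer-4 load balancer spreads the connections it receives across the tenant's VMs:

```python
from itertools import cycle

# Public IPs of one tenant's VRs (hypothetical addresses); each VR
# also runs its own layer-4 load balancer for the traffic it receives.
vr_public_ips = ["203.0.113.11", "203.0.113.12", "203.0.113.13"]
_rotation = cycle(range(len(vr_public_ips)))

def resolve(fqdn: str) -> list[str]:
    """Round-robin DNS: answer each query with the VR list rotated,
    so inbound connections spread across all VRs of the tenant."""
    start = next(_rotation)
    return vr_public_ips[start:] + vr_public_ips[:start]

print(resolve("www.tenant-a.example"))  # ['203.0.113.11', ...]
print(resolve("www.tenant-a.example"))  # ['203.0.113.12', ...]
```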
3.2 Private Network
Figure 3: Agent and Mode in EYWA environments
Virtual Extensible LAN (VxLAN) [5]: EYWA provides a large
number of virtual LANs (about 2^24 = 16,777,216) using VxLAN,
another overlay networking solution that eliminates the VLAN ID
limit (a header sketch follows the list below).
• Resources: IP addresses are also a resource. The VRs do not
consume the private IP address pool, since they all share a single
address.
• STP: STP only supports 200 to 300 VMs in a virtual LAN, and
STP's requirement of a single active path between switches also
limits multi-pathing and network resiliency. EYWA uses a simple
network fabric without multi-path, relying instead on virtualization
features such as fault tolerance, so STP is disabled on all software
switches and the issue no longer needs to be considered.
• MAC flooding: When MAC flooding occurs, a switch floods
all ports with incoming traffic because it cannot find the port for
a particular MAC address in its MAC table; the switch, in essence,
acts like a hub. With VLAN, VMs' MAC addresses consume the
limited memory set aside in physical switches to store MAC
addresses, whereas with VxLAN they do not, because the VM
addresses are encapsulated behind the host address.
• ARP broadcast: The agents act as a distributed proxy ARP, so
ARP broadcasts are drastically reduced.
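To make the 24-bit virtual-network space concrete, the following minimal Python sketch (our illustration; field layout per the VxLAN specification [5]) builds the 8-byte VxLAN header whose 24-bit VNI yields about 2^24 = 16,777,216 virtual LANs. The inner Ethernet frame, carrying the VM's MAC address, travels inside this UDP encapsulation between hosts, which is why VM MACs never occupy physical switches' MAC tables:

```python
import struct

VXLAN_PORT = 4789            # IANA-assigned UDP port for VxLAN
VXLAN_FLAG_VNI_VALID = 0x08  # I-flag: the VNI field is valid

def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VxLAN header for a given 24-bit VNI.
    Layout: 1 byte flags, 3 reserved bytes, 3 bytes VNI, 1 reserved byte."""
    assert 0 <= vni < 2**24, "VNI is a 24-bit field: ~16.7M virtual LANs"
    return struct.pack("!B3s3sB",
                       VXLAN_FLAG_VNI_VALID,
                       b"\x00\x00\x00",          # reserved
                       vni.to_bytes(3, "big"),   # tenant's virtual LAN ID
                       0)                        # reserved

# The 24-bit VNI is the source of the ~16,777,216 tenant LAN figure.
print(len(vxlan_header(5001)), 2**24)  # -> 8 16777216
```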
3.3 VM Migrations
There is also a need to migrate VMs to other hosts to optimize
usage of the underlying physical server hardware and reduce
energy costs, and EYWA's most significant advantage concerns
live migration. For example, let's assume that a VM live-migrates
to another hypervisor host because of the VR's or hypervisor
host's overload. A default gateway IP address set inside a VM is
difficult to change during operation, so in most environments the
migrated VM keeps using its original VR as the default gateway.
In EYWA environments, there is likewise no change to the
migrated VM's default gateway IP address, but the VM can use a
physically different, underloaded VR as its default gateway, as
illustrated in Figure 4, which reduces network bandwidth and
latency. All these advantages come from the agent's packet control,
which allows unlimited VRs to exist simultaneously with the same
private IP address.
3.4 EYWA Agent
To achieve these advantages, an agent in every hypervisor host
monitors each tenant's Virtual Router Port (vport) and VxLAN
Tunnel End Point (VTEP) in that host, as illustrated in Figure 3.
The agent does not communicate with any servers or other agents;
it operates entirely independently.
Figure 4: SNAT Traffic Flows after VM live-migration
3.4.1 VR Monitoring
The agent monitors the vport to check the local VR's state and
bandwidth, and performs health checks on the local VR through
the vport connected to the VSi. That is, the agent can check the
local VR's state by monitoring ARP sessions and Gratuitous ARP
(GARP) and by performing periodic health checks. The local VR's
state can also be determined passively by observing the VR's ARP
replies to VMs' ARP requests. Through this, the agent selects
Normal Mode if the local VR is running normally, and Orphan
Mode if not. Finally, the agent acquires QoS information by
monitoring the local VR's bandwidth usage through the vport.
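The paper does not specify the agent at the code level; the following minimal Python sketch (identifiers such as arp_ping and POLL_INTERVAL are assumptions of ours) illustrates the mode decision described above, given a helper that ARP-probes the shared gateway address through the vport and reports whether a timely reply arrived:

```python
import time
from enum import Enum

class Mode(Enum):
    NORMAL = "normal"   # a healthy VR shares this host's VSi
    ORPHAN = "orphan"   # no healthy local VR on this host

POLL_INTERVAL = 2.0     # seconds between active health checks (assumed)

def determine_mode(arp_ping, gateway_ip: str = "10.0.0.1") -> Mode:
    """Active check: ARP-probe the shared gateway address through
    the local vport; a timely reply means the local VR is alive."""
    return Mode.NORMAL if arp_ping(gateway_ip) else Mode.ORPHAN

def monitor_vr(arp_ping, on_change) -> None:
    """Re-evaluate the mode periodically and report transitions,
    e.g. Orphan -> Normal when a local VR (re)starts and GARPs."""
    mode = determine_mode(arp_ping)
    while True:
        new_mode = determine_mode(arp_ping)
        if new_mode is not mode:
            on_change(mode, new_mode)  # e.g. switch Table 1 rule set
            mode = new_mode
        time.sleep(POLL_INTERVAL)
```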
3.4.2 ARP Caching
The agent maintains an ARP cache of IP-to-MAC address
mappings, which it uses for its Proxy ARP function until an entry
times out. For this purpose, the agent stores the addresses of the
local VR, local VMs, and remote VMs in the ARP cache by
monitoring all ARP sessions and GARP packets through the vport
and the VTEP, and it sends ARP requests to the local VR and local
VMs (but not to remote VMs) at regular intervals before their
entries time out. The ARP cache must be updated consistently.
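A minimal sketch of such a cache follows (ours; the TTL value and method names are assumptions), with TTL-based expiry and a refresh predicate for the local entries the agent re-ARPs at regular intervals:

```python
import time

CACHE_TTL = 60.0  # seconds an entry stays valid (assumed value)

class ArpCache:
    """IP -> (MAC, expiry) table the agent answers Proxy ARP from.

    Entries are learned by snooping ARP/GARP on the vport and VTEP;
    local entries (VR, local VMs) are refreshed by the agent's own
    periodic ARP requests, while remote entries simply expire.
    """
    def __init__(self, ttl: float = CACHE_TTL):
        self.ttl = ttl
        self.table: dict[str, tuple[str, float]] = {}

    def learn(self, ip: str, mac: str) -> None:
        self.table[ip] = (mac, time.monotonic() + self.ttl)

    def lookup(self, ip: str) -> str | None:
        entry = self.table.get(ip)
        if entry is None:
            return None
        mac, expiry = entry
        if time.monotonic() > expiry:   # stale: stop proxying for it
            del self.table[ip]
            return None
        return mac

    def needs_refresh(self, ip: str, margin: float = 5.0) -> bool:
        """True if a local entry should be re-ARPed before expiry."""
        entry = self.table.get(ip)
        return entry is not None and time.monotonic() > entry[1] - margin
```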
3.4.3 ARP Filtering & Proxy ARP
As explained above, a large number of VRs per tenant share the
same private IP address. The agent therefore filters ARP packets
at the VTEP, to let VMs discover a single gateway VR and to
prevent IP address conflicts between the multiple VRs, and acts as
a Proxy ARP at the VTEP, to reduce ARP broadcasts; it does not
interfere with ARP broadcasts between local instances such as the
VR and VMs on the same VSi.

Table 1: ARP Packet Control Rules on VTEPs

| Type                | Sender IP     | Target IP     | Normal Mode, Outbound                      | Normal Mode, Inbound            | Orphan Mode, Outbound                      | Orphan Mode, Inbound            |
| ARP Request         | VR (10.0.0.1) | VM            | (1-1) Pass / (1-2) Filtering / (1-3) Proxy | (2) Filtering                   | (3) N/A                                    | (4-1) Filtering / (4-2) Proxy   |
| ARP Request         | VM            | VR (10.0.0.1) | (5) Filtering                              | (6-1) Filtering / (6-2) Proxy   | (7) Pass                                   | (8) Filtering                   |
| ARP Reply (unicast) | VR (10.0.0.1) | VM            | (9) N/A                                    | (10) N/A                        | (11) N/A                                   | (12) Pass & Filtering           |
| ARP Reply (unicast) | VM            | VR (10.0.0.1) | (13) N/A                                   | (14) Pass                       | (15) N/A                                   | (16) N/A                        |
| GARP                | VR (10.0.0.1) | VR (10.0.0.1) | (17) Filtering                             | (18) N/A                        | (19) N/A                                   | (20) N/A                        |
| ARP Request         | VM            | VM            | (21-1) Pass / (21-2) Filtering / (21-3) Proxy | (22-1) Filtering / (22-2) Proxy | (23-1) Pass / (23-2) Filtering / (23-3) Proxy | (24-1) Filtering / (24-2) Proxy |
In Normal Mode, the local VR is assigned as the VMs' default
gateway; in Orphan Mode, the fastest underloaded remote VR is
assigned as the VM's default gateway. For these reasons, all the
agents control ARP packets at the VTEP according to the ARP
Packet Control Rules in Table 1. The detailed descriptions follow
(a code sketch of the rule lookup appears after the list):
1. Normal Mode & Outbound ARP request (Sender VR -> Target
VM): an outgoing ARP request from the local host, sent by the
local VR to discover a local or remote orphan VM. If the Target
IP address is a local VM, (1-2) Filtering, to prevent the ARP
broadcast, because the local VM sends the ARP reply itself. If the
Target IP address is a remote VM, (1-3) Proxy, to prevent the ARP
broadcast, or (1-1) Pass, depending on the presence or absence of
the MAC entry in the agent's ARP cache.
2. Normal Mode & Inbound ARP request (Sender VR -> Target
VM): an incoming ARP request (forwarded by a remote host's
(1-1) Pass), sent by a remote VR to discover a remote orphan VM.
(2) Filtering, because the ARP broadcast must be prevented and
local VMs should not be visible to remote VRs in Normal Mode.
3. Orphan Mode & Outbound ARP request (Sender VR -> Target
VM): an ARP request from a local VR to discover a local or remote
orphan VM, but (3) N/A because there is no local VR in Orphan
Mode.
4. Orphan Mode & Inbound ARP request (Sender VR -> Target
VM): an ARP request (forwarded by a remote host's (1-1) Pass)
from a remote VR to discover an orphan VM. If the Target IP
address is a local VM, (4-2) Proxy, to keep the ARP broadcast
from spreading inside; if not, (4-1) Filtering.
5. Normal Mode & Outbound ARP request (Sender VM -> Target
VR): an ARP request from a local VM to discover a local VR.
(5) Filtering, because remote VRs should not be visible to local
VMs in Normal Mode and the ARP broadcast must be prevented.
6. Normal Mode & Inbound ARP request (Sender VM -> Target
VR): an ARP request (forwarded by a remote host's (7) Pass) from
a remote orphan VM to discover a VR. If the local VR is
overloaded, (6-1) Filtering, for QoS; if not, (6-2) Proxy.
7. Orphan Mode & Outbound ARP request (Sender VM -> Target
VR): an ARP request from a local orphan VM to discover a remote
VR. (7) Pass, so that a remote VR can be discovered.
8. Orphan Mode & Inbound ARP request (Sender VM -> Target
VR): an ARP request (forwarded by a remote host's (7) Pass) from
a remote orphan VM to discover a VR, but (8) Filtering, because
there is no local VR in Orphan Mode and the ARP broadcast must
be prevented.
9. Normal Mode & Outbound ARP reply (Sender VR -> Target
VM): an ARP reply (to a (7) Pass) from the local VR answering a
remote orphan VM, but (9) N/A because the request was already
handled by (6-1) Filtering or (6-2) Proxy.
10. Normal Mode & Inbound ARP reply (Sender VR -> Target
VM): an ARP reply from a remote VR answering a local VM, but
(10) N/A because the request was already handled by (5) Filtering.
11. Orphan Mode & Outbound ARP reply (Sender VR -> Target
VM): an ARP reply from a local VR answering a remote orphan
VM, but (11) N/A because there is no local VR in Orphan Mode.
12. Orphan Mode & Inbound ARP reply (Sender VR -> Target
VM): ARP replies (to a (7) Pass) from multiple remote VRs
answering a local orphan VM. Normally the ARP mechanism does
not allow more than one reply to a single request; the problem is
that a large number of underloaded VRs may all send replies, i.e.,
an ARP flux problem occurs. To solve this, the agent simply
accepts only the fastest ARP reply and filters the rest
((12) Pass & Filtering).
13. Normal Mode & Outbound ARP reply (Sender VM -> Target
VR): an ARP reply from a local VM answering a remote VR, but
(13) N/A because the request was already handled by (2) Filtering.
14. Normal Mode & Inbound ARP reply (Sender VM -> Target
VR): an ARP reply (to a (1-1) Pass) from a remote orphan VM
answering the local VR; therefore, (14) Pass.
15. Orphan Mode & Outbound ARP reply (Sender VM -> Target
VR): an ARP reply (to a (1-1) Pass) from a local orphan VM
answering a remote VR, but (15) N/A because the request was
already handled by (4-2) Proxy.
16. Orphan Mode & Inbound ARP reply (Sender VM -> Target
VR): an ARP reply from a remote orphan VM answering a local
VR, but (16) N/A because there is no local VR in Orphan Mode.
17. Normal Mode & Outbound GARP (Sender VR -> Target VR):
a GARP request from the local VR to update all VMs' and
switches' caches and detect IP conflicts; when a local VR starts
(Orphan Mode -> Normal Mode), it sends this GARP request.
(17) Filtering, so that only local VMs' ARP caches are updated
and IP address conflicts with all remote VRs are prevented.
18. Normal Mode & Inbound GARP (Sender VR -> Target VR):
a GARP request from a remote VR, sent when that VR starts, to
update all VMs' ARP caches and detect IP conflicts, but (18) N/A
because it was already handled by the remote agent's (17) Filtering.
19. Orphan Mode & Outbound GARP (Sender VR -> Target VR):
a GARP request that a local VR would send, but (19) N/A because
there is no local VR in Orphan Mode.
20. Orphan Mode & Inbound GARP (Sender VR -> Target VR):
a GARP request that a remote VR sends, but (20) N/A because it
was already handled by the remote agent's (17) Filtering.
21-24. Inter-VM communication: these packets are ARP requests
from one VM to discover another VM. For an ARP request from
a local VM to discover a remote VM, (21-3) Proxy or (21-1) Pass
((23-3) Proxy or (23-1) Pass), depending on the presence or
absence of the MAC entry in the agent's ARP cache. For an ARP
request from a local VM to discover another local VM,
(21-2) Filtering or (23-2) Filtering, to prevent the ARP broadcast.
For an ARP request from a remote VM to discover another remote
VM, (22-1) Filtering or (24-1) Filtering, to prevent the ARP
broadcast. For an ARP request from a remote VM to discover a
local VM, (22-2) Proxy or (24-2) Proxy.
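To make the rule lookup concrete, the following minimal Python sketch (all identifiers are ours, not part of EYWA) implements the ARP-request rows of Table 1 (rules 1-8 and 21-24) as a pure decision function; locality, cache state, and VR load are supplied by the caller, and the GARP and reply rows would follow the same pattern:

```python
from enum import Enum

class Mode(Enum):
    NORMAL = "normal"
    ORPHAN = "orphan"

class Action(Enum):
    PASS = "pass"      # forward the ARP packet unchanged
    FILTER = "filter"  # drop it at the VTEP
    PROXY = "proxy"    # answer it from the agent's ARP cache

def arp_request_action(mode: Mode, outbound: bool,
                       sender_is_vr: bool, target_is_vr: bool,
                       target_is_local: bool, cached: bool,
                       local_vr_overloaded: bool = False) -> Action:
    """Rule lookup for ARP requests crossing the VTEP (rules 1-8 and
    21-24 of Table 1); GARP and reply rows follow the same pattern."""
    if sender_is_vr and not target_is_vr:        # VR -> VM: rules 1-4
        if mode is Mode.NORMAL:
            if outbound:                         # rule 1
                if target_is_local:
                    return Action.FILTER         # (1-2): local VM replies itself
                return Action.PROXY if cached else Action.PASS  # (1-3)/(1-1)
            return Action.FILTER                 # (2): hide local VMs
        if outbound:
            raise ValueError("(3) N/A: no local VR in Orphan Mode")
        return Action.PROXY if target_is_local else Action.FILTER  # (4-2)/(4-1)
    if not sender_is_vr and target_is_vr:        # VM -> VR: rules 5-8
        if mode is Mode.NORMAL:
            if outbound:
                return Action.FILTER             # (5): only the local VR answers
            return (Action.FILTER if local_vr_overloaded   # (6-1): QoS gate
                    else Action.PROXY)                     # (6-2)
        return Action.PASS if outbound else Action.FILTER  # (7)/(8)
    # VM -> VM: rules 21-24 (identical in both modes)
    if outbound:
        if target_is_local:
            return Action.FILTER                 # (21-2)/(23-2)
        return Action.PROXY if cached else Action.PASS     # -3 / -1 variants
    return Action.PROXY if target_is_local else Action.FILTER  # (22)/(24)
```

For example, in Orphan Mode an inbound request from a remote VR targeting a local VM returns Action.PROXY, matching rule (4-2).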
4. EVALUATION
In this section we evaluate EYWA using a prototype running on
a testbed of 10 servers and one commodity switch (two 10 Gbps
ports and twenty-four 1 Gbps ports) in a single rack, where the
10 servers are connected at 1 Gbps. The layer-4 load balancer
inside the VR is based on the open source HAProxy. Our goals are
to show, first, that EYWA can be built from components available
today and, second, that our implementation solves the public
network problems described in Section 2. Private network issues
such as east-west traffic within a single tenant are excluded from
the evaluation because their outcome is self-evident.
4.1 North-South Traffic
In this section, we show that all the VMs can utilize the full
physical bandwidth, without a throughput bottleneck, when they
communicate with external servers.
4.1.1 Outbound communications
First, every hypervisor host runs a single VR and a single VM
belonging to the same tenant, so all hosts are in Normal Mode. All
the VMs send packets to external servers at full bandwidth. The
total outbound bandwidth of all the VMs equals the sum of each
VM's physical bandwidth, as illustrated in Figure 5(a), and the
same pattern holds for the total inbound bandwidth.
4.1.2 Outbound communications in the Auto-Scaling
scenario of VRs and VMs
We evaluate under the assumption that the cloud platform's
auto-scaling can launch or terminate a VR and a VM according to
scaling policies such as network bandwidth; for this evaluation we
launch and terminate the instances manually. The test environment
is the same as in Section 4.1.1, but each hypervisor host runs at
most a single VM and a single VR.
Figure 6 shows how the total outbound bandwidth of all the
VMs increases and decreases as the auto-scaling launches or
terminates a VR and a VM separately; the total outbound bandwidth
rises and falls dramatically, and the same pattern holds for the total
inbound bandwidth.
4.2 East-West (Inter-Tenant) Traffic
In this section, we show that all VMs can utilize the full physical
bandwidth, without a throughput bottleneck, when they communicate
with another tenant's VMs.
4.2.1 1-to-1 communications
The test environment is the same as in Section 4.1.1, but the
10 hypervisor hosts are divided equally between tenants A and B.
Again, every hypervisor host runs a VR and a VM belonging to
the same tenant. Each VM of tenant A sends packets to an idle VM
of tenant B at full bandwidth. The average outbound bandwidth
per VM of tenant A equals each VM's physical bandwidth, as
illustrated in Figure 5(b).
4.2.2 1-to-N communications
The test environment is the same as in Section 4.2.1, but each
VM of tenant A sends packets to all the VMs of tenant B, and each
VM of tenant B likewise sends packets to all the VMs of tenant A,
at full bandwidth. The total outbound bandwidth of each host pair
equals the sum of the VM pair's physical bandwidth, as illustrated
in Figure 5(c).
4.3 North-South and East-West Traffic
In this section, we show that all VMs can utilize the full physical
bandwidth, without a throughput bottleneck, when they communicate
simultaneously with external servers and another tenant's VMs.
4.3.1 Outbound and 1-to-N communications
The 10 hypervisor hosts are again divided equally between
tenants A and B. Every hypervisor host runs a single VR and two
VMs (an e-w VM and an n-s VM) belonging to the same tenant.
Each e-w VM of tenant A sends packets to all the e-w VMs of
tenant B at full bandwidth, and each n-s VM of tenant A also sends
packets to external servers at full bandwidth; tenant B does the
same. The total outbound bandwidth of each host pair equals the
sum of the VM pairs' physical bandwidth, as illustrated in Figure 7.
In a single-router environment, by contrast, the total outbound
bandwidth of a host pair would be equal to the sum of only two
VMs' physical bandwidth under every condition.
5. RELATED WORK
Virtual network designs for multi-tenant cloud environments:
OpenStack/Neutron/Distributed Virtual Router (DVR) [33]
provides a virtual LAN per tenant and distributed DNAT, but
SNAT remains centralized in the Network Node. It also has no
large layer-2 semantics.
OpenStack/MidoNet [32] is an open network virtualization
system. An agent in every host is responsible for setting up new
network flows and controlling the kernel fastpath to provide
distributed networking services (switching, routing, NAT, etc.).
All traffic from the external network is handled (routing, security
groups, firewalls, and load balancing) by Gateways on dedicated
servers, and a Network State Database on dedicated servers stores
high-level configuration information such as topology, routes, and
NAT settings. MidoNet requires additional servers or components
beyond an agent in every host, and it has no large layer-2 semantics.
Figure 5: Network bandwidth on north-south or east-west communication
Figure 6: Total north-south network bandwidth increased and decreased by auto-scaled VMs and VRs
Figure 7: Network bandwidth competition on north-south and east-west communication
CloudStack [17] provides an HAProxy-based VR and a virtual
LAN per tenant, as illustrated in Figure 2, but the VR is a
throughput bottleneck and a SPOF. It also has no large layer-2
semantics.
AWS/Virtual Private Cloud (VPC) [29] provides a logically
isolated section of the AWS Cloud. The Internet Gateway is a
redundant and highly available VPC component that allows
communication between instances in a VPC and the Internet. It
serves three purposes: providing a target in VPC route tables for
Internet-routable traffic, performing NAT for instances that have
been assigned public IP addresses, and proxying all ARP requests
and replies. The Internet Gateway can be a bottleneck and cannot
by itself proxy all ARP packets in a large layer-2 network.
Physical network designs for Data Centers: Monsoon [1]
implements a large layer-2 network in data centers. It is designed
on top of layer 2 and reinvents fault-tolerant routing mechanisms
already established at layer 3, and its centralized directory servers
store all address information.
SEATTLE [3] also implements a large layer-2 network in data
centers. It stores the location at which each server is connected to
the network in a one-hop DHT distributed across the switches.
VL2 [2] provides hot-spot-free routing and scalable layer-2 se-
mantics using forwarding primitives available today and minor,
application-compatible modifications to host operating systems,
and the centralized directory servers store all address information.
Fat-tree [9] relies on a customized routing primitive that does
not yet exist in commodity switches.
Transparent Interconnection of Lots of Links (TRILL) [13] is a
layer-2 forwarding protocol that operates within one IEEE 802.1-
compliant Ethernet broadcast domain. It replaces STP by using
Intermediate System to Intermediate System (IS-IS) routing to
distribute link-state information and calculate shortest paths
through the network.
802.1aq Shortest Path Bridging (SPB) [14] allows for true
shortest path forwarding in a mesh Ethernet network context
utilizing multiple equal cost paths. This permits it to support
much larger layer 2 topologies, with faster convergence, and
vastly improved use of the mesh topology.
Layer-4 Load Balancing: Ananta [4] is a distributed layer-4
load balancer and NAT for a multi-tenant cloud environment. Its
components are an agent in every hypervisor host that can take
over the packet modification function from the load balancer, a
virtual switch in every hypervisor host that provides the NAT
function, Multiplexers on dedicated servers that handle all
incoming traffic, and a Manager on dedicated servers that
implements the control plane of Ananta. It does not use DNS-based
load balancing.
OpenStack/Neutron/Load Balancing as a Service (LBaaS) [40]
allows for proprietary and open source load balancing technolo-
gies to drive the actual load balancing of requests. Thus, an
OpenStack operator can choose which back-end technology to use.
AWS/Elastic Load Balancing (ELB) [30] automatically distrib-
utes incoming application traffic across multiple Amazon EC2
instances using DNS-based load balancing in the AWS cloud.
HAProxy [35] is an open source solution offering load balanc-
ing and proxying for TCP and HTTP-based applications.
Tunneling protocols: Network Virtualization using Generic
Routing Encapsulation (NVGRE) [6] uses GRE as the encapsulation
method, with the lower 24 bits of the GRE header representing the
Tenant Network Identifier (TNI). Like VxLAN's, this 24-bit space
allows for 16 million virtual networks.
Stateless Transport Tunneling (STT) [7] is another tunneling
protocol. A further advantage of STT is its use of a 64-bit network
ID rather than the 24-bit IDs used by NVGRE and VxLAN.
6. CONCLUSION
We presented the design of EYWA, a new virtual network
architecture that accommodates a large number of tenants using
isolated virtual LANs, provides a public network per tenant
without bottlenecks or SPOFs on SNAT/DNAT, and provides a
single large IP subnet per tenant using large layer-2 semantics in
multi-tenant cloud environments, benefiting both cloud service
providers and consumers.
EYWA is a simple design that can be realized with available
networking technologies, without changes to host kernels or to
physical and software switches. It also requires no additional
servers or components except a distributed, independent agent in
every hypervisor host; that is, there is nothing additional and
centralized to manage. In future work, EYWA will be integrated
with open source IaaS solutions and will support truly huge cloud
data centers with unlimited scalability.
The limits of virtual environments should be imposed by the
physical environment rather than by the virtual environments
themselves; EYWA scales with the size of the physical network
and compute farm.
7. REFERENCES
[1] A. Greenberg, P. Lahiri, D. A. Maltz, P. Patel, and S.
Sengupta. Towards a next generation data center architecture:
Scalability and commoditization. In PRESTO Workshop at
SIGCOMM, 2008.
[2] A. Greenberg, James R. Hamilton, Navendu Jain, Srikanth
Kandula, Changhoon Kim, Parantap Lahiri, David A. Maltz,
Parveen Patel and Sudipta Sengupta. VL2: A Scalable and
Flexible Data Center Network. In SIGCOMM, 2009.
[3] C. Kim, M. Caesar, and J. Rexford. Floodless in SEATTLE:
a scalable ethernet architecture for large enterprises. In
SIGCOMM, 2008.
[4] Parveen Patel, Deepak Bansal, Lihua Yuan, Ashwin Murthy,
Albert Greenberg, David A. Maltz, Randy Kern, Hemant
Kumar, Marios Zikos, Hongyu Wu, Changhoon Kim,
Naveen Karri. Ananta: Cloud Scale Load Balancing. In
SIGCOMM, 2013.
[5] M. Mahalingam, D. Dutt, K. Duda, P. Agarwal, L. Kreeger,
T. Sridhar, M. Bursell, and C. Wright. VXLAN: A Framework
for Overlaying Virtualized Layer 2 Networks over Layer 3
Networks. IETF Internet Draft.
[6] M. Sridharan, K. Duda, I. Ganga, A. Greenberg, G. Lin, M.
Pearson, P. Thaler, C. Tumuluri, N. Venkataramiah, and Y.
Wang. NVGRE: Network Virtualization using Generic Routing
Encapsulation. IETF Internet Draft.
[7] B. Davie, Ed., and J. Gross. A Stateless Transport Tunneling
Protocol for Network Virtualization (STT). IETF Internet Draft.
[8] Radhika Niranjan Mysore, Andreas Pamboris, Nathan
Farrington, Nelson Huang, Pardis Miri, Sivasankar
Radhakrishnan, Vikram Subramanya, and Amin Vahdat.
PortLand: A Scalable Fault-Tolerant Layer 2 Data Center
Network Fabric. In SIGCOMM, 2009.
[9] Mohammad Al-Fares, Alexander Loukissas, and Amin
Vahdat. A Scalable, Commodity Data Center Network
Architecture. In SIGCOMM, 2008.
[10] N. Mckeown, T. Anderson, H. Balakrishnan, G. M. Parulkar,
L. L. Peterson, J. Rexford, S. Shenker, and J. S. Turner.
OpenFlow: Enabling Innovation in Campus Networks. In
SIGCOMM, 2008
[11] IEEE 802.1Q VLANs, Media Access Control Bridges and
Virtual Bridged Local Area Networks
[12] S. Nadas, Ed. Ericsson, Virtual Router Redundancy Protocol
(VRRP) Version 3 for IPv4 and IPv6. IETF RFC 5798.
[13] R. Perlman et al. TRILL: Transparent Interconnection of
Lots of Links. IETF RFC
[14] IEEE 802.1aq Shortest Path Bridging
[15] D. Allan, N. Bragg, P. Unbehagen. IS-IS Extensions Sup-
porting IEEE 802.1aq Shortest Path Bridging, IETF RFC
[16] Openstack, http://www.openstack.org
[17] Apache Cloudstack, http://cloudstack.apache.org
[18] Eucalyptus, http://www.eucalyptus.com
[19] OpenNebula, http://opennebula.org
[20] Amazon Web Services, http://aws.amazon.com
[21] Microsoft, https://www.microsoft.com
[22] Microsoft Azure, http://azure.microsoft.com
[23] Vmware, http://www.vmware.com
[24] Rackspace Open Cloud, http://www.rackspace.com/cloud
[25] Google Compute Engine, http://cloud.google.com/compute
[26] IBM Cloud, http://www.ibm.com/cloud-computing
[27] Ucloud biz, https://ucloudbiz.olleh.com
[28] OpenFlow, http://archive.openflow.org
[29] AWS Virtual Private Cloud (VPC),
http://aws.amazon.com/vpc
[30] AWS Elastic Load Balancing (ELB),
http://aws.amazon.com/elasticloadbalancing
[31] AWS Route 53, http://aws.amazon.com/route53
[32] MidoNet, http://www.midokura.com/midonet
[33] Openstack/Neutron/Distributed Virtual Router (DVR),
https://wiki.openstack.org/wiki/Neutron/DVR
[34] NetScaler VPX Virtual Appliance, http://www.citrix.com
[35] HAProxy Load Balancer, http://www.haproxy.org
[36] Linux Virtual Server, http://www.linuxvirtualserver.org
[37] Vyatta Virtual Router, http://www.brocade.com
[38] OVS Virtual Switch, http://openvswitch.org
[39] Linux Bridge, http://www.linuxfoundation.org
[40] OpenStack/Neutron/LBaaS,
https://wiki.openstack.org/wiki/Neutron/LBaaS
[41] EYWA simple POC, https://goo.gl/A1dMJ0
[42] EYWA Presentation, https://goo.gl/wMjCgI