The Impact of Software-based Virtual Network in the Public Cloud
Chunghan Lee, Katsuhito Asano, Tomohiro Ishihara
Fujitsu Laboratories Ltd.
IEEE PVE-SDN 2018, June 29, 2018
Copyright 2018 FUJITSU LABORATORIES LIMITED
Introduction
 OpenStack (https://www.openstack.org)
 An open-source ecosystem for cloud management
 It has been widely used by many cloud operators
 Overview of a public cloud based on OpenStack
 Compute Node : deploys VMs
 Network Node : provides L3 routing via virtual routers
 Network fabric : interconnects the VMs
Previous works on measuring cloud networks
 Many studies have focused on cloud network performance
and relied on information available at the VM side
 Few prior works address the latency of the virtual network
 One study measured the latency of virtual networks, such as Xen,
Linux, and VServer, in a local environment [Sigcomm CCR 41]
 Their results are too limited to be generalized to the latency
of a cloud network architecture based on OpenStack
Research goal
Understanding the latency of the software-based virtual network in the public cloud
Approach
 Focus on packet traces on the host side
 Measure VM-to-VM TCP throughput on the public cloud
 Capture the packets at multiple capture points on the host
 Characterize the impact of the software-based virtual network in the
public cloud through quantitative analyses
Target cloud based on OpenStack
 This public cloud has been in production use since 2015
 Compute Node
•48 CPU cores, 256 GB RAM, and a 10 Gbps NIC
•Hosts approximately 40-50 VMs
 Network Node
•Provides L3 routing via virtual routers for multiple tenants
 Network fabric for L2 forwarding
•VLAN and MLAG are used to separate tenant traffic, and all physical
links are 10 Gbps
Software-based Virtual Network
Linux bridge
•iptables on the Linux bridge implements the security group
•Each cloud user can freely add entries to iptables
Open vSwitch (OVS)
•Responsible for external/internal communications
< Virtual network on the compute node: packets will be captured at multiple points >
Questions for measuring precise latency
Node selection
 We selected suitable nodes by looking for low utilization
 Measured CPU and memory usage, traffic volume, and the number
of VMs at each node
 Used these resource metrics to identify low-utilization candidates
(a selection sketch follows the resource metrics table)
 Overview of the selected nodes
 There were no abrupt changes in the metrics
•We take this as an indication of low utilization
 VMs with one CPU core and 4 GB RAM were deployed in different
racks for the measurement
•To avoid extra overhead from iptables (Linux bridge),
we did not add any rules to the security group
< Resource metrics >
Number of VMs     Idle CPUs [%]  Memory [%]  Traffic volume [Gbps]
40 VMs (48 CPUs)  95             70          smaller than 1 Gbps*
* traffic volume measured at night
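A minimal sketch of the node-selection filter described above, assuming a hypothetical per-node metrics record; the thresholds are illustrative values consistent with the table, not the exact criteria used in this work.

```python
def low_utilization_nodes(nodes):
    """Filter candidate nodes by the resource metrics above.

    nodes: list of dicts with assumed keys 'idle_cpu_pct',
    'mem_used_pct', and 'traffic_gbps' (hypothetical schema).
    Thresholds are illustrative, matching the table: ~95% idle
    CPUs, ~70% memory in use, and traffic below 1 Gbps.
    """
    return [n for n in nodes
            if n['idle_cpu_pct'] >= 90
            and n['mem_used_pct'] <= 75
            and n['traffic_gbps'] < 1.0]
```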
Measurement method : VM-to-VM
 Overview of the experiment (a scripted sketch follows the figure)
 Measure TCP throughput using iperf3 for 30 seconds
 Vary the write length (128, 256, 512, 1024, 2048, or 65535 bytes)
 Capture packets at the NIC (eth) and vports (tap and qvb) concurrently
•Analyze well-known metrics and data/ACK path latency
< O marks each capture point in the figure >
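As a sketch of this setup, the following Python script runs iperf3 at each write length while tcpdump captures on the three capture points. The server address and device names are placeholders for the deployment-specific tap/qvb/eth devices, and in the real setup iperf3 runs inside the VM while tcpdump runs on the host; they are shown in one script only for brevity.

```python
import subprocess
import time

SERVER = "10.0.0.2"               # placeholder: receiver VM address
PORTS = ["tap0", "qvb0", "eth0"]  # placeholders: actual device names vary
WRITE_LENGTHS = [128, 256, 512, 1024, 2048, 65535]

for length in WRITE_LENGTHS:
    # One capture per point; -s truncates to headers to limit overhead.
    captures = [subprocess.Popen(
        ["tcpdump", "-i", dev, "-s", "128", "-w", f"{dev}_{length}.pcap"])
        for dev in PORTS]
    time.sleep(1)  # let the captures settle
    # -l sets the read/write buffer length, -t the duration in seconds.
    subprocess.run(["iperf3", "-c", SERVER, "-t", "30", "-l", str(length)])
    for cap in captures:
        cap.terminate()
        cap.wait()
```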
Packet capture overhead
 Overview of the capture overhead in our measurement
 Total CPU utilization for the capture : below 5%
•The compute node had 48 CPUs, and three otherwise idle CPUs were used
 Utilization of each capturing CPU core : roughly 50-60%
 Throughput with/without captures (write length : 1024 bytes)
 Throughput with packet capture is similar to that without it
 We therefore believe the capture overhead is small
Write length [bytes]     128  256   512   1024  2048  65535 (64 KB)
Mean throughput [Mbps]   721  1375  2524  4389  6382  6707
TCP Throughput
 Throughput statistics
 We ran our experiments at night
•The traffic volume was smaller than 1 Gbps at night
 Throughput with 2048 bytes is similar to that with 65535 bytes
•The TCP throughput with 64 KB writes was close to the maximum
throughput on our cloud
 While the write lengths produced different throughput, the well-known
metrics were fundamentally similar
We select the 1024-byte result as representative
RTT and latency in our measurement
 RTT
 The time between sending a data packet and receiving an ACK packet
 Three RTT measurement points (an RTT-extraction sketch follows this list)
•One NIC : eth
•Two vports : tap and qvb
 Latency
 The time consumed by the software-based virtual network on the server
 Four sections of latency
•Data sending and Data receiving
•ACK sending and ACK receiving
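A minimal sketch of extracting RTT samples at one capture point, assuming the trace has already been reduced to (timestamp, direction, seq, ack, payload_len) tuples (an assumed pre-parsed format, not a pcap API); retransmissions and delayed-ACK effects are ignored for brevity.

```python
def rtt_samples(records):
    """RTT at one capture point: time from a data packet passing the
    point to the ACK covering it passing the same point.

    records: iterable of (ts, direction, seq, ack, payload_len), with
    direction 'data' for sender-to-receiver packets and 'ack' for the
    reverse path.
    """
    pending = {}   # expected ack number -> send timestamp of the segment
    samples = []
    for ts, direction, seq, ack, plen in records:
        if direction == 'data' and plen > 0:
            pending.setdefault(seq + plen, ts)   # keep first transmission only
        elif direction == 'ack':
            # a cumulative ACK covers every outstanding segment up to 'ack'
            for expected in [k for k in list(pending) if k <= ack]:
                samples.append(ts - pending.pop(expected))
    return samples
```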
RTT between capture points
 RTT (1024 bytes)
 Compare RTT statistics with those of the MS DCN (Sigcomm 2015)
 RTT grows from eth to qvb to tap, since each inner capture point
includes more of the virtual network path; the differences reveal the
overhead induced by the software-based virtual network
RTT      Mean [μs]  50th %ile [μs]  99th %ile [μs]
tap      386        377             871
qvb      331        322             714
eth      305        300             666
MS DCN   n/a        268             1340
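For reference, a sketch of how the table's statistics can be derived from the RTT samples, using Python's standard statistics module (an assumption about tooling; the original does not state how the statistics were computed).

```python
import statistics

def summarize_us(samples):
    """Mean, median (50th), and 99th percentile of latency samples in microseconds."""
    cuts = statistics.quantiles(samples, n=100)  # 99 cut points between percentiles
    return statistics.mean(samples), cuts[49], cuts[98]

# e.g. mean, p50, p99 = summarize_us(rtt_samples_at_tap)
```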
Latency on virtual network
 Latency (1024 bytes)
 The total latency is 136 μs,
approximately 35.2% of the RTT (136/386)
 The receiving side (eth→tap) is
heavier than the sending side (tap→eth)
Latency                     Mean [μs]  50th %ile [μs]  99th %ile [μs]
RTT of tap on our cloud     386        377             871
Total latency on our cloud  136        124             461
Total latency on local      26         26              46
Data sending (tap→eth)      24         22              54
Data receiving (eth→tap)    44         44              99
ACK sending (tap→eth)       18         17              42
ACK receiving (eth→tap)     50         41              266
OVS latency versus Linux bridge latency
 Mean latency at each software component (a per-hop sketch follows)
 Regardless of the ACK or data path, latency on the receiving side
is higher than on the sending side
 OVS latency is lower than Linux bridge (iptables) latency
•The flow cache (megaflow) in OVS is a likely cause of its lower
latency compared with the Linux bridge
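A sketch of deriving per-component latency from the concurrent captures: the same packet is matched at two adjacent capture points (tap/qvb bound the Linux bridge; qvb/eth bound OVS) and the timestamps are differenced. The identity key, (IP ID, TCP seq, payload length), is an assumption that works when IDs are not reused within the capture window.

```python
def per_hop_latency(upstream, downstream):
    """Latency between two adjacent capture points.

    upstream, downstream: dicts mapping an assumed packet identity key,
    e.g. (ip_id, tcp_seq, payload_len), to the capture timestamp at that
    point. Only packets seen at both points contribute a sample.
    """
    common = upstream.keys() & downstream.keys()
    return [downstream[k] - upstream[k] for k in common]

# Data-sending direction, per the capture layout on the compute node:
# linux_bridge_lat = per_hop_latency(tap_ts, qvb_ts)   # tap -> qvb
# ovs_lat          = per_hop_latency(qvb_ts, eth_ts)   # qvb -> eth
```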
Burstiness of latency – (1)
 What is the burstiness of latency in this research?
 Abrupt fluctuations of latency that can occur due to software
processing
 RTT and latency behavior
 Similar behavior is observed between the RTT and the total latency
< Time series of total latency [s] and RTT [s] over elapsed time [s] >
Burstiness of latency – (2)
 Quantitative characteristics of the burstiness of latency
 Apply the burst detection algorithm by Kleinberg (SIGKDD'02);
a two-state sketch follows this slide
 Detected 856 burst events
 Marked the top 10% of burst events on the RTT time series
< RTT with the top 10% of burst events marked (a red bar corresponds to a burst) >
                   Mean  Top 10%  Greatest
Burst period [ms]  1.9   5.5      25
The burstiness of latency can be
a major cause of the increased RTT
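A minimal two-state sketch of Kleinberg's burst detection over the gaps between latency-spike events, under the usual parameterization (rate scaling s and state-change cost weight gamma); the parameter values here are illustrative defaults, not those used in this work.

```python
import math

def kleinberg_two_state(gaps, s=2.0, gamma=1.0):
    """Two-state Kleinberg burst detection (SIGKDD'02), batch version.

    gaps: positive inter-arrival times between events (e.g. latency
    spikes). Returns one 0/1 state per gap; 1 marks a burst period.
    """
    n = len(gaps)
    total = sum(gaps)
    lam = (n / total, s * n / total)   # baseline and burst event rates
    trans = gamma * math.log(n)        # cost of moving up into the burst state

    def emit(state, gap):              # -log exponential density of the gap
        return -math.log(lam[state]) + lam[state] * gap

    # Viterbi: minimum-cost state sequence; sequences start in state 0.
    cost = [emit(0, gaps[0]), trans + emit(1, gaps[0])]
    back = []
    for gap in gaps[1:]:
        step_prev, step_cost = [], []
        for j in (0, 1):
            from0 = cost[0] + (trans if j == 1 else 0.0)  # 0 -> j
            from1 = cost[1]                                # 1 -> j is free
            step_prev.append(0 if from0 <= from1 else 1)
            step_cost.append(min(from0, from1) + emit(j, gap))
        back.append(step_prev)
        cost = step_cost
    # Backtrack from the cheaper final state.
    state = 0 if cost[0] <= cost[1] else 1
    states = [state]
    for step_prev in reversed(back):
        state = step_prev[state]
        states.append(state)
    return states[::-1]
```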
Summary & consideration
Summary of our measurement
 The total latency is dominated by the receiving side, regardless
of data or ACK path
 The total latency is approximately 35.2% of the RTT
 The top 10% of burst events are the main contributor to the increased RTT
Consideration
 The number of CPUs for packet match/action on the host
 The number of entries in iptables
 The performance of the vhost thread/vswitch
Causes for consideration – (1)
Causes for consideration – (2)
This limitation is unlikely to be the major cause
Conclusion
 Analyzed the impact of the software-based virtual network on
latency in a public cloud based on OpenStack
 No assumptions about the cloud were required
•Packet traces on the host side are used directly
 The total latency is dominated by the receiving side, regardless
of data or ACK path
 The characteristics of latency and its burstiness were
quantitatively clarified
 The latency is approximately 35.2% of the RTT, and the top 10% of
burst events are the main contributor to the increased RTT
Future work
Investigate the impact of the virtual network in
different environments and architectures
Clarify the major cause of the increased latency
in the software-based virtual network
TCP Throughput with 1024 bytes
 Throughput at eth (NIC)
 Mean throughput is 4389 Mbps
 Most of the traffic belongs to the measurement
 There is a small fraction of background traffic
•Its impact on the measurement would be small
Network applications on the target cloud
 How is the public cloud used?
 Targets are enterprise customers and their services
•e-mail, testbeds, databases, storage, Web services, and development
environments
 Major network applications
 99% of the traffic was TCP
 The X11 protocol accounted for 84% of the traffic
CDF of latency
 CDF curves
 The latency when packets are received is larger than that on the
sending side
< CDFs of sending-side latency, receiving-side latency, and RTT >
Similar characteristics are observed across the curves