Influence of the Router Buffer on Online Games Traffic Multiplexing
Simplemux: a generic multiplexing protocol
1. Simplemux Traffic Optimization in the context of GAIA
Jose Saldana, University of Zaragoza (jsaldana@unizar.es)
IRTF GAIA Meeting
Prague, July 22, 2015
16/07/2015 1
2. Traffic Optimization in the context of GAIA
The Global Access to the Internet for All (GAIA) is an IRTF initiative that aims*
(1) to create increased visibility and interest among the wider community on
the challenges and opportunities in enabling global Internet access, in terms
of technology as well as the social and economic drivers for its adoption;
(2) to create a shared vision among practitioners, researchers, corporations,
non governmental and governmental organisations on the challenges and
opportunities;
(3) to articulate and foster collaboration among them to address the diverse
Internet access and architectural challenges (including security, privacy,
censorship and energy efficiency);
(4) to document and share deployment experiences and research results to
the wider community through scholarly publications, white papers,
presentations, workshops, Informational and Experimental RFCs;
(5) to document the costs of existing Internet Access, the breakdown of
those costs (energy, manpower, licenses, bandwidth, infrastructure, transit,
peering), and outline a path to achieve a 10x reduction in Internet Access
costs especially in geographies and populations with low penetration.
(6) to develop a longer term perspective on the impact of GAIA research
group findings on the standardisation efforts at the IETF. This could include
recommendations to protocol designers and architects.
* IRTF GAIA charter, http://datatracker.ietf.org/rg/gaia/charter/
3. Simplemux: a Generic Multiplexing Protocol
http://datatracker.ietf.org/doc/draft-saldana-tsvwg-simplemux/
• Tunnel of multiplexed packets
• Different tunneling and multiplexed protocols allowed
Submitted to Transport Area WG (tsvwg@ietf.org)
Implementation of Simplemux+ROHC over IPv4 available at:
https://github.com/TCM-TF/simplemux
[Figure: packet layout. A tunneling header (Protocol=Simplemux) is followed by multiplexed packets, each preceded by a Simplemux separator (1-3 bytes, Protocol=any) and carrying its own muxed protocol header.]
4. Traffic Optimization in the context of GAIA
What is the main idea?
Join small packets into bigger ones
Reduce the number of packets
Amortize the tunnel overhead over a larger number of payloads
Optionally compress packets (e.g. with header compression*)
Optimization between different devices and tenants
What can traffic optimization provide?
Bandwidth savings (and airtime in wireless networks)
pps reduction
Energy savings
Flexibility: optimization can be activated only when required, which avoids dimensioning the
network for the worst case
At what cost?
CPU
Delay: packets already waiting in the buffer can be multiplexed at no extra delay, but additional
buffering delay may be added to increase the number of packets multiplexed together
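The main idea above (joining packets that are already queued into bigger ones) can be sketched as a greedy grouping step. This is only an illustration under stated assumptions: the function name `group_buffered_packets`, the 1500-byte size limit, the 20-byte tunnel header and the 2-byte separator per packet are values chosen for the example, not taken from the Simplemux implementation.

```python
def group_buffered_packets(sizes, max_size=1500, tunnel_header=20, separator=2):
    """Greedily pack queued packet sizes (bytes) into multiplexed bundles.

    All default values are illustrative assumptions.
    Returns a list of bundles; each bundle is a list of packet sizes.
    A packet too large to share a bundle simply travels alone.
    """
    bundles = []
    current = []
    current_bytes = tunnel_header
    for size in sizes:
        needed = separator + size
        # Close the current bundle when the next packet would exceed the limit.
        if current and current_bytes + needed > max_size:
            bundles.append(current)
            current = []
            current_bytes = tunnel_header
        current.append(size)
        current_bytes += needed
    if current:
        bundles.append(current)
    return bundles


# Made-up queue content: many small packets and one large one.
queued = [60] * 30 + [1440] + [60] * 5
bundles = group_buffered_packets(queued)
print(len(queued), "native packets ->", len(bundles), "multiplexed packets")
```

Because only packets already in the queue are grouped, this policy adds no buffering delay; waiting for more packets would raise the grouping factor at the cost of delay.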
* RFC 5795, RObust Header Compression, https://tools.ietf.org/html/rfc5795
5. Traffic profile in a public Internet node
Small packets in the public Internet (trace from CAIDA.org *)
[Figure: Packet size histogram, Chicago 2015. X axis: packet size [bytes], 0 to 1500; Y axis: number of packets, 0 to 70000.]
- 44% of packets are 1440 bytes or more
- 33% of packets are 60 bytes or less
- Average: 782 bytes
* Source: https://data.caida.org/datasets/passive-2015/equinix-chicago/20150219-130000.UTC/equinix-chicago.dirA.20150219-125911.UTC.anon.pcap.gz. Only the first 200,000 packets were used.
6. Overhead in wired networks (Ethernet)
[Figure: Maximum Ethernet throughput (link speed 1 Gbps). X axis: frame size [bytes], 64 to 1464; Y axis: maximum throughput [Mbps], 0 to 1,000. Series: TCP payload efficiency; Eth payload efficiency.]
Source: Small Packet Traffic Performance Optimization for 8255x and 8254x Ethernet Controllers,
http://www.intel.com/content/dam/doc/application-note/8255x-8254x-ethernet-controllers-small-packet-traffic-performance-appl-note.pdf
Average: 782 bytes
7. Overhead in 802.11 networks
MAC mechanisms cause a low efficiency, which is even lower for small packets
[Figure: Efficiency (PHY rate = 54 Mbps). X axis: packet size [bytes], 0 to 2500; Y axis: efficiency (data rate / PHY rate), 0 to 1. Series: 1 MPDU, UDP; 1 MPDU, TCP. Annotation: Average: 782 bytes.]
Source: Model developed by Ginzburg, B.; Kesselman, A., "Performance analysis of A-MPDU and A-MSDU aggregation in
IEEE 802.11n," 2007 IEEE Sarnoff Symposium, pp. 1-5, April 30-May 2, 2007
8. Frame grouping in 802.11 networks
New versions of Wi-Fi (from 802.11n) include mechanisms for frame
grouping: A-MPDU and A-MSDU.
The maximum number of frames in an A-MPDU is 64. The maximum A-MSDU size is 7935 bytes, so it can contain at most 5 frames of 1500 bytes.
[Figure: Efficiency (PHY rate = 130 Mbps). X axis: number of grouped frames (1500 bytes), 0 to 60; Y axis: efficiency (data rate / PHY rate), 0 to 1. Series: A-MPDU, UDP; A-MPDU, TCP; A-MSDU, UDP; A-MSDU, TCP.]
[Figure: A-MPDU aggregation. PSDU = PHY HDR + A-MPDU. The A-MPDU carries subframes 1..N; each subframe has an MPDU delimiter (4 bytes: reserved, MPDU len, CRC, signature), an MPDU (variable size) and padding (0-3 bytes).]
Source: Ginzburg, B.; Kesselman, A., "Performance analysis of A-MPDU and A-MSDU aggregation in IEEE 802.11n," 2007 IEEE Sarnoff Symposium,
pp. 1-5, April 30-May 2, 2007
9. Can packets be grouped at higher layers?
- TMux [RFC1692] multiplexes a number of TCP segments between the
same pair of machines.
- PPPMux [RFC3153] is able to multiplex complete IP packets, using
separators. It requires the use of PPP and L2TP.
- Simplemux
[Figure: packet layouts of the three options. TMux: one IP header followed by TMux headers, each preceding a TCP segment (TCP header and payload). PPPMux: tunnel IP header, L2TP and PPP, followed by PPPMux separators, each preceding a complete IP/TCP packet. Simplemux: tunnel header, followed by a first Simplemux header and non-first Simplemux headers, each preceding a complete IP/TCP packet.]
10. Some results with a Simplemux implementation
5 VoIP calls sharing an Ethernet link (RTP with different codecs)
[Figure: Bandwidth saving, 5 RTP calls. X axis: multiplexing period [ms], 20 to 180; Y axis: bandwidth saving, 10% to 50%. Series: GSM; iLBC 20 ms; iLBC 30 ms; PCM-A, PCM-U.]
Simplemux implementation: https://github.com/TCM-TF/simplemux
11. Some results with a Simplemux implementation
Packet loss reduction in a saturated 802.11 link:
Link: 802.11ac. 5.56GHz. 9Mbps
Offered traffic (IP level): 15,000 pps * 60 bytes = 7.2 Mbps
[Figure: Packet loss in a saturated 802.11 link (60-byte packets). X axis: number of multiplexed packets (native, 1, 2, 3, 4, 5, 10, 15, 20, 25, 30); Y axis: packet loss percentage, 0% to 80%. Series: no ROHC; ROHC.]
Simplemux implementation: https://github.com/TCM-TF/simplemux
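The offered-traffic arithmetic on this slide (15,000 pps of 60-byte packets) can be reproduced with a short sketch, which also estimates how the packet rate and bit rate change when N packets are grouped. The 20-byte tunnel header and 2-byte separator are assumptions made for illustration, and header compression is deliberately left out.

```python
def offered_load_mbps(pps, packet_bytes):
    """IP-level bit rate in Mbps for a given packet rate and packet size."""
    return pps * packet_bytes * 8 / 1e6


def multiplexed_load(pps, packet_bytes, n, tunnel_header=20, separator=2):
    """Packet rate and Mbps after grouping n packets per bundle.

    tunnel_header and separator are assumed sizes; no header compression.
    """
    bundle_pps = pps / n
    bundle_bytes = tunnel_header + n * (separator + packet_bytes)
    return bundle_pps, offered_load_mbps(bundle_pps, bundle_bytes)


print("native:", 15000, "pps,", offered_load_mbps(15000, 60), "Mbps")
for n in (1, 5, 10, 30):
    bundle_pps, mbps = multiplexed_load(15000, 60, n)
    print(f"N={n}: {bundle_pps:.0f} pps, {mbps:.2f} Mbps")
```

With these assumed sizes the packet rate falls by a factor of N while the bit rate does not fall, so any gain without header compression would come from sending fewer frames rather than fewer bytes.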
12. Scenarios: Wireless Community Network
- Optimization can be activated when required
- Can substitute for 802.11 aggregation in legacy devices (prior to 802.11n)
- Optimization covers a number of hops (in 802.11 it covers only one)
[Figure: wireless community network scenario. Elements shown: Internet, Internet gateway, village with public Wi-Fi, remote village with an operator 3G femtocell, xDSL router (without multiplexer) and operator PoP, connected through several Wi-Fi hops.]
13. Scenarios: Low-bandwidth residential access
Collaboration between the home router and the network operator may save
bandwidth in a limited access network
[Figure: low-bandwidth residential access scenario. Elements shown: home network with a number of users (e.g. Internet café, access shared by neighbors), individual DSL, xDSL router with embedded multiplexer, xDSL router without multiplexer, DSLAM, BRAS, ISP transport network, Internet router and Internet.]
14. Thanks a lot
Jose Saldana, University of Zaragoza (jsaldana@unizar.es)
IRTF GAIA Meeting
Prague, July 22, 2015
Bar-BOF Today (July 22) 19:40 Hotel reception
Detailed explanation, real tests, discussion, etc.
Acknowledgements: The researchers from University of Zaragoza participating in this
work were funded by the EU H2020 Wi-5 project (Grant Agreement no: 644262).
16. Overhead in wired networks (Ethernet)
IPv4/TCP packet 1500 bytes
η=1460/1500=97% (IP level) 1460/1542=94% (Eth level)
IPv4/UDP/RTP packet of VoIP with two samples of 10 bytes
η=20/60=33% (IP level) 20/102=19% (Eth level)
IPv4/UDP client-to-server packets of Counter Strike (online game)
η=61/89=68% (IP level) 61/131=46% (Eth level)
IPv4 header: 20 bytes
UDP header: 8 bytes
Inter-frame gap: 12 bytes
Eth header: 26 bytes
Eth FCS: 4 bytes
Payload
RTP header: 12 bytes
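The efficiency figures on this slide follow directly from the listed header sizes, as the sketch below shows. The function name `efficiency` is chosen for the example; the sizes are the ones listed above. The code prints one decimal place, and the slide's percentages correspond to these values truncated to an integer.

```python
ETH_HEADER = 26        # bytes, as listed on the slide
ETH_FCS = 4
INTER_FRAME_GAP = 12
ETH_OVERHEAD = ETH_HEADER + ETH_FCS + INTER_FRAME_GAP   # 42 bytes per frame

IPV4, UDP, RTP = 20, 8, 12


def efficiency(payload, ip_packet):
    """Return (IP-level, Ethernet-level) efficiency as fractions."""
    ip_level = payload / ip_packet
    eth_level = payload / (ip_packet + ETH_OVERHEAD)
    return ip_level, eth_level


cases = {
    "IPv4/TCP packet of 1500 bytes": (1460, 1500),
    "VoIP, two samples of 10 bytes": (20, IPV4 + UDP + RTP + 20),
    "Counter Strike client-to-server": (61, IPV4 + UDP + 61),
}
for name, (payload, ip_packet) in cases.items():
    ip_eff, eth_eff = efficiency(payload, ip_packet)
    print(f"{name}: {ip_eff:.1%} (IP level), {eth_eff:.1%} (Eth level)")
```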
17. Overhead in 802.11 networks
Medium access must be performed before data is sent, which causes
additional delay and airtime inefficiency
[Figure: RTS/CTS exchange timeline. Sender: DIFS, RTS, then Data after a SIFS. Receiver: CTS after a SIFS, then ACK after a SIFS. Other nodes: NAV (RTS), NAV (CTS), NAV (Data).]
DIFS: DCF Inter Frame Space
SIFS: Short Interframe Space
RTS: Request To Send
CTS: Clear To Send
NAV: Network Allocation Vector
DCF: Distributed Coordination Function
18. Frame grouping in 802.11 networks
New versions of Wi-Fi (from 802.11n) include mechanisms for frame
grouping: A-MPDU and A-MSDU.
A-MPDU: a number of MPDU delimiters each followed by an MPDU.
A-MSDU: multiple payload frames share not just the same PHY, but also the
same MAC header.
In 802.11ac all the frames have an A-MPDU format, even with 1 sub-frame
[Figure: A-MSDU aggregation. PSDU = PHY HDR + MAC HDR + A-MSDU + FCS. The A-MSDU carries subframes 1..N; each subframe has a subframe header (DA: 6 bytes, SA: 6 bytes, len: 2 bytes), an MSDU (0-2304 bytes) and padding (0-3 bytes).]
[Figure: A-MPDU aggregation. PSDU = PHY HDR + A-MPDU. The A-MPDU carries subframes 1..N; each subframe has an MPDU delimiter (4 bytes: reserved, MPDU len, CRC, signature), an MPDU (variable size) and padding (0-3 bytes).]
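The A-MSDU subframe sizes above are enough to check how many frames fit in one A-MSDU. The sketch below uses the 7935-byte maximum A-MSDU size quoted earlier in the slides and assumes that the 0-3 bytes of padding align every subframe to a multiple of 4 bytes; treating the last subframe as padded too is a simplification.

```python
A_MSDU_MAX_BYTES = 7935         # maximum A-MSDU size quoted in the slides
SUBFRAME_HEADER = 6 + 6 + 2     # DA + SA + len, in bytes


def subframe_size(msdu_bytes):
    """Subframe header plus MSDU, padded (0-3 bytes) to a multiple of 4."""
    unpadded = SUBFRAME_HEADER + msdu_bytes
    return unpadded + (-unpadded) % 4


def max_msdus(msdu_bytes, limit=A_MSDU_MAX_BYTES):
    """Largest number of equally sized MSDUs that fit in one A-MSDU."""
    return limit // subframe_size(msdu_bytes)


for size in (60, 782, 1500):
    print(f"{size}-byte MSDUs: subframe {subframe_size(size)} bytes, "
          f"at most {max_msdus(size)} per A-MSDU")
```

For 1500-byte frames this gives 5, in agreement with the limit quoted for A-MSDU.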
19. Simplemux: a generic multiplexing protocol*
- Examples:
- A number of IPv4 packets travel together between two Points of Presence
- A number of IPv6 packets traversing an IPv4 network
- A number of RTP VoIP packets travelling between two remote offices
- A number of IP packets travelling through a secure VPN
- A number of LISP packets travelling between two domains (stub networks)
- etc.
* http://datatracker.ietf.org/doc/draft-saldana-tsvwg-simplemux/
[Figure: a tunneling header (Protocol=Simplemux) followed by Simplemux headers/separators, each preceding one multiplexed packet: an IPv4 packet (Protocol=IP), an IPv6 packet (Protocol=IP) and a ROHC packet (Protocol=ROHC).]
20. Simplemux, a generic multiplexing protocol
- Very simple separators (1-3 bytes): two flags (SPB and LXT), the packet length (LEN) and
the Protocol Number
- First separator:
  - packet length < 64 bytes: SPB, LXT=0, LEN (6 bits), Protocol (8 bits)
  - packet length ≥ 64 bytes: SPB, LXT=1, LEN (14 bits), Protocol (8 bits)
- Non-first separators (may or may not include the Protocol Number field):
  - packet length < 128 bytes: LXT=0, LEN (7 bits), optionally Protocol (8 bits)
  - packet length ≥ 128 bytes: LXT=1, LEN (15 bits), optionally Protocol (8 bits)
[Figure: a tunneling header followed by the first Simplemux header and its packet, then the non-first Simplemux headers, each followed by its packet.]
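The separator layout on this slide can be illustrated with a small encoder and decoder. This is a sketch under stated assumptions, not the draft's normative format: the flags are assumed to occupy the most significant bits of the first byte, LEN is assumed to be in network byte order, and SPB is read as a "single protocol" flag meaning that only the first separator carries the Protocol field. The function names are hypothetical.

```python
def encode_separator(length, first, protocol=None, single_protocol=False):
    """Encode one separator using the field widths shown on the slide.

    Assumed bit order (not confirmed by the slides): flags in the most
    significant bits, then LEN in network byte order.
    """
    short_bits = 6 if first else 7            # LEN width when LXT=0
    if not 0 <= length < (1 << (short_bits + 8)):
        raise ValueError("length does not fit in the LEN field")
    lxt = 0 if length < (1 << short_bits) else 1
    nbytes = 1 + lxt
    value = length
    if first:
        value |= int(single_protocol) << (8 * nbytes - 1)   # SPB
        value |= lxt << (8 * nbytes - 2)                    # LXT
    else:
        value |= lxt << (8 * nbytes - 1)                    # LXT
    out = value.to_bytes(nbytes, "big")
    if protocol is not None:
        out += bytes([protocol])              # Protocol (8 bits)
    return out


def multiplex(packets, protocol, single_protocol=True):
    """Concatenate separator+packet pairs; all packets share `protocol`."""
    bundle = b""
    for i, pkt in enumerate(packets):
        first = i == 0
        with_proto = first or not single_protocol
        bundle += encode_separator(len(pkt), first,
                                   protocol if with_proto else None,
                                   single_protocol)
        bundle += pkt
    return bundle


def demultiplex(bundle):
    """Inverse of multiplex(); returns a list of (protocol, packet) pairs."""
    out, pos, first, single, proto = [], 0, True, False, None
    while pos < len(bundle):
        b0 = bundle[pos]
        pos += 1
        if first:
            single, lxt, length = bool(b0 & 0x80), b0 & 0x40, b0 & 0x3F
        else:
            lxt, length = b0 & 0x80, b0 & 0x7F
        if lxt:                               # second length byte present
            length = (length << 8) | bundle[pos]
            pos += 1
        if first or not single:
            proto = bundle[pos]
            pos += 1
        out.append((proto, bundle[pos:pos + length]))
        pos += length
        first = False
    return out


packets = [bytes(20), bytes(200), bytes(1500)]   # dummy payloads
bundle = multiplex(packets, protocol=4)          # 4 = IPv4, used as an example
print(len(bundle) - sum(len(p) for p in packets), "bytes of separators")
```

The tunneling header is not included here; in a real bundle it would precede the first separator.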
21. Implementation details*
- An implementation has been built combining the three layers of TCM (Tunneling, Compressing and Multiplexing):
- Header Compression: ROHC (https://rohc-lib.org/)
- Multiplexing: Simplemux (https://github.com/TCM-TF/simplemux)
- Tunneling: IPv4 (raw sockets) or UDP
- An upper bound for the delay can be set
- Processing delay:
- Commodity PC (i3): 0.25 ms (N=10 packets)
- Low-cost wireless AP (OpenWRT): 3.5 ms (N=10 packets)
- 2700 lines of code (ROHC not included)
* Tunneling Compressing and Multiplexing (TCM) Traffic Flows. Reference Model,
http://datatracker.ietf.org/doc/draft-saldana-tsvwg-tcmtf/
[Figure: native traffic (five IPv4/UDP/RTP VoIP packets with two samples of 10 bytes) compared with optimized traffic in network mode and in transport mode (one IPv4 Simplemux packet including the five RTP packets), showing the saving in each case. Legend: native traffic headers (IPv4 header: 20 bytes; UDP header: 8 bytes); optimized traffic headers (tunnel IP header: 20 bytes; tunnel UDP header: 8 bytes).]
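The saving illustrated for five VoIP packets can be approximated with a few lines of arithmetic. The header sizes come from the slides; everything else is an assumption made for this sketch: a 4-byte ROHC-compressed IP/UDP/RTP header (a hypothetical value), a 2-byte first separator and 1-byte non-first separators, and network mode taken to mean a tunnel IP header only while transport mode adds a tunnel UDP header. The printed percentages therefore need not match the figure.

```python
IPV4, UDP, RTP = 20, 8, 12     # header sizes listed in the slides (bytes)
PAYLOAD = 20                   # two samples of 10 bytes


def optimized_size(n, tunnel_header, compressed_header=4):
    """Size in bytes of one bundle carrying n compressed RTP packets.

    compressed_header is a hypothetical ROHC-compressed header size.
    The first separator takes 2 bytes (flags+length, Protocol) and the
    others 1 byte, assuming all packets share one protocol.
    """
    separators = 2 + (n - 1) * 1
    return tunnel_header + separators + n * (compressed_header + PAYLOAD)


def saving(n, tunnel_header):
    """Fraction of IP-level bytes saved with respect to native traffic."""
    native = n * (IPV4 + UDP + RTP + PAYLOAD)
    return 1 - optimized_size(n, tunnel_header) / native


print(f"network mode:   {saving(5, IPV4):.1%}")
print(f"transport mode: {saving(5, IPV4 + UDP):.1%}")
```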
22. Additional scenario: Machine to machine
[Figure: machine-to-machine scenario. Elements shown: IP sensors, satellite terminals, satellite link, gateway, Internet and Data Centers 1, 2 and 3.]
23. Delay limits
- Multiplexing must be carefully done, taking into account the
available delay budget
- It adds no delay if a number of packets are already waiting in the buffer
(this may occur in congested links)
- The additional buffering delay caused by packet grouping must be limited:
- VoIP (150 ms)
- Online games (100 to 1000 ms)
- Remote desktop (200 ms)
- IoT samples (Constrained network scenarios, core)
- A draft surveying these limits is available*
* Delay Limits and Multiplexing Policies to be employed with Tunneling Compressing and Multiplexing
Traffic Flows, http://datatracker.ietf.org/doc/draft-suznjevic-tsvwg-mtd-tcmtf/
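The trade-off between grouping and the delay budget can be sketched with a period-based policy: packets arriving during one period are sent together when it ends. The function names, the arrival pattern and the 100 ms delay of the rest of the path are hypothetical; only the 150 ms VoIP limit comes from the list above.

```python
def multiplex_by_period(arrivals_ms, period_ms):
    """Group arrival times into bundles sent at the end of each period.

    Returns a sorted list of (send_time_ms, [arrival times]) tuples.
    Periods in which no packet arrives produce no bundle.
    """
    bundles = {}
    for t in arrivals_ms:
        # A packet arriving in [k*period, (k+1)*period) leaves at (k+1)*period.
        send_time = (int(t // period_ms) + 1) * period_ms
        bundles.setdefault(send_time, []).append(t)
    return sorted(bundles.items())


def max_added_delay(arrivals_ms, period_ms):
    """Worst buffering delay added to any packet by the policy."""
    return max(send - t
               for send, times in multiplex_by_period(arrivals_ms, period_ms)
               for t in times)


# Five flows, one packet every 20 ms each, offset by 4 ms (made-up pattern).
arrivals = [flow * 4 + k * 20 for k in range(10) for flow in range(5)]
budget_ms = 150            # VoIP limit from the list above
network_delay_ms = 100     # hypothetical delay of the rest of the path

for period in (20, 40, 80):
    added = max_added_delay(arrivals, period)
    bundles = multiplex_by_period(arrivals, period)
    fits = network_delay_ms + added <= budget_ms
    print(f"period={period} ms: {len(bundles)} bundles, "
          f"worst added delay {added} ms, within budget: {fits}")
```

A longer period groups more packets per bundle, but the worst added delay grows with it, so the period has to be chosen against the budget of the most delay-sensitive service.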
24. Summary
- If a number of packets are waiting in the buffer, they can be sent as a single
packet across a number of hops (e.g. 802.11 Community Networks).
- Substituting A-MPDU in legacy 802.11 systems (prior to 11n)
- MAC airtime improvement: single packet vs a number of them.
- PPS reduction.
- Energy savings.
- Bandwidth savings, if combined with header compression. A single tunnel
header can be used, whose overhead is amortized over all the
packets.
[Figure: native packets compared with a single Simplemux packet travelling between ingress and egress over a common network segment.]