Data Transfer Experiments over 100G Paths
using Data Transfer Nodes
Richard Hughes-Jones
JISC 100 Gigabit Ethernet networking workshop
London, 04 July 2018
• Work done as part of the AENEAS project
• 100 Gigabit Tests in the GÉANT Lab
• 100 Gigabit Tests with the DTNs on the GÉANT Network
• GÉANT DTN London to Paris
• Single TCP flow, Multiple TCP flow
• Stability
• Disk to Disk
• AENEAS DTNs Jodrell Bank to GÉANT DTN London
• TCP flow
• Libvma kernel bypass
• Reaching the TCP Limit on Long-Haul
• 100 Gigabit between GÉANT and AARNet
Agenda
Advanced European Network of E-infrastructures
for Astronomy with the SKA AENEAS - 731016
100 Gigabit Tests in the GÉANT Lab
• Core 6 on the socket with the PCIe to the NIC
• ConnectX-5 NICs B2B
• Rx ring buffer 4096
• Send UDP packets and measure the rate.
• A drop of 12.6 Gbit/s at 7813 bytes exists for NICs
other than Mellanox – an artefact of the Fedora 23 kernel.
• A smaller drop of 5.1 Gbit/s with the firewall ON
• Effect of the firewall: ~10 Gbit/s reduction.
udpmon_send: The effect of the Firewall
[Plot: Send user-data rate (Gbit/s) vs size of user data in packet (bytes), firewall ON vs OFF — pkt_size_send_GEANT-DTN1_22Jan18]
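As an illustration of the measurement above, the sketch below blasts fixed-size UDP datagrams and reports the achieved user-data rate, in the spirit of udpmon_send. The destination address, port and packet counts are placeholders, not values from the tests.

```python
#!/usr/bin/env python3
# Minimal sketch of a udpmon_send-style test: send fixed-size UDP packets in a
# burst and report the achieved user-data send rate in Gbit/s.
# Host, port and packet counts are illustrative, not from the slides.
import socket
import time

def send_rate(dest=("192.0.2.10", 5001), pkt_size=7813, n_pkts=100_000):
    payload = bytes(pkt_size)
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    t0 = time.perf_counter()
    for _ in range(n_pkts):
        sock.sendto(payload, dest)
    dt = time.perf_counter() - t0
    return n_pkts * pkt_size * 8 / dt / 1e9   # user-data Gbit/s

if __name__ == "__main__":
    for size in (1472, 4000, 6000, 7813, 8972):
        print(f"{size:5d} bytes: {send_rate(pkt_size=size):6.2f} Gbit/s")
```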
• B2B, no Firewall
• Core 6 is on the socket with the PCIe to the NIC
• ConnectX-5 NIC
• Rx ring buffer 4096
• RTT 0.4 µs
• A difference between cores is expected
• The applications also differ
• With iperf3 transmitting at 80 Gbit/s the
CPU core is 98% in kernel mode.
TCP Achievable Throughput: Which cores and application
Single TCP flow iperf2
Single TCP flow iperf3
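A minimal sketch of a single memory-to-memory TCP flow pinned to one core, analogous to running iperf2/iperf3 with its CPU affinity set to the core that shares the socket with the NIC's PCIe lanes. The core number and receiver address are assumptions; any simple TCP sink (e.g. an accept-and-read loop) is assumed on the far end.

```python
#!/usr/bin/env python3
# Sketch: single-flow memory-to-memory TCP sender pinned to one CPU core.
import os
import socket
import time

CORE = 6                        # assumed core on the same socket as the NIC
os.sched_setaffinity(0, {CORE})

DEST = ("192.0.2.10", 5201)     # placeholder receiver (a plain TCP sink)
BLOCK = bytes(1 << 20)          # 1 MiB per send() call
DURATION = 10                   # seconds

sock = socket.create_connection(DEST)
sent, t0 = 0, time.perf_counter()
while time.perf_counter() - t0 < DURATION:
    sent += sock.send(BLOCK)
dt = time.perf_counter() - t0
print(f"core {CORE}: {sent * 8 / dt / 1e9:.1f} Gbit/s over {dt:.1f} s")
```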
• B2B, no Firewall
• Core 6 is on the socket with the PCIe to the NIC
• ConnectX-5 NIC
• Packets dropped by the NIC
if Rx ring buffer < 4096
TCP Achievable Throughput: Size of ConnectX-5 ring buffer
Single TCP flows iperf3
% TCP re-transmitted segments with Rx ring 1024
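One way to inspect and enlarge the Rx ring is with ethtool; the sketch below applies the 4096-descriptor setting used here. The interface name is a placeholder and root privileges are assumed.

```python
#!/usr/bin/env python3
# Sketch: check and enlarge the NIC Rx ring with ethtool, since rings smaller
# than 4096 descriptors were seen to drop packets.
import subprocess

IFACE = "ens1f0"   # placeholder interface name

# Show current ring sizes, then request a 4096-entry Rx ring.
print(subprocess.run(["ethtool", "-g", IFACE],
                     capture_output=True, text=True).stdout)
subprocess.run(["ethtool", "-G", IFACE, "rx", "4096"], check=True)
```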
• Max packet size 4096 Bytes,
• Every message is acknowledged
• Messages ≥ 20k bytes, throughput > 90 Gbit/s
• CPU core 90% in user mode.
• Application design needs to take care of the ring buffers
• Wait for send-post completion only
every 64 messages.
• Every 64th message the application takes ~42 µs,
otherwise 0.1 to 0.2 µs.
RDMA RC
Time between sending messages with RDMA
Achievable throughput vs message spacing
• Modified libvma to fragment correctly
• Standard udpmon applications
• UDP performance excellent
throughput >95 Gbit/s core6 – core6
• TCP performance poor:
• TCP iperf kernel 57.7 Gbit/s
• TCP iperf libvma 13.6 Gbit/s
UDP udpmon and libvma kernel bypass library
Smooth increase in BW vs packet size to > 90 Gbit/s
Throughput as a function of packet spacing
Inter-packet arrival times FWHM 2 µs
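libvma is normally pre-loaded under an unmodified socket application; a hedged sketch of launching such a test, assuming libvma is installed on the host and using a placeholder iperf3 target and duration:

```python
#!/usr/bin/env python3
# Sketch: run an unmodified socket application under the libvma kernel-bypass
# library by pre-loading it. The library name assumes a standard install; the
# client command and target are placeholders.
import os
import subprocess

env = dict(os.environ, LD_PRELOAD="libvma.so")    # assumes libvma is installed
cmd = ["iperf3", "-c", "192.0.2.10", "-t", "30"]  # placeholder target/duration
subprocess.run(cmd, env=env, check=True)
```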
100 Gigabit Tests with the DTNs on the GÉANT Network
Network Topology Connecting the DTNs
Can be Trunks
(multi VLANs)
• London to Paris over GÉANT
• No Firewall
• Core 6 is on the socket with the PCIe to the NIC
• ConnectX-5 NIC
• Rx ring buffer 8192
• Throughput 43 Gbit/s for 7813 Byte packet
• Jitter 4 µs FWHM
• Some side lobes at ± 16 µs
due to cross traffic
• Good network stability.
UDP Performance over GÉANT: Throughput and Packet Jitter
Achievable UDP Throughput
Inter-packet arrival times
[Plot: Received wire rate (Gbit/s) vs spacing between frames (µs) for 4000, 6000, 7813 and 8972 byte packets — DTNLon-Par_100G_NOFW_03Jul18]
[Histogram: inter-packet arrival times, N(t) vs latency (µs), 1472 byte packets, w=80 — DTNLon-Par_100G_NOFW_03Jul18]
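For reference, the jitter histograms above are built from receive-side inter-packet arrival times; below is a minimal receiver sketch that bins arrivals in 1 µs steps. The port and packet count are illustrative.

```python
#!/usr/bin/env python3
# Sketch: receive UDP packets and histogram the inter-packet arrival times in
# microseconds, which is how the jitter (FWHM, side-lobe) plots are built.
import socket
import time
from collections import Counter

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("", 5001))                 # placeholder port
hist, last = Counter(), None
for _ in range(100_000):
    sock.recv(65536)
    now = time.perf_counter()
    if last is not None:
        hist[round((now - last) * 1e6)] += 1   # 1 µs bins
    last = now
for us in sorted(hist):
    print(f"{us:4d} µs: {hist[us]}")
```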
• Route: London-London2-Paris
• TCP offload on, TCP cubic stack
• Firewalls ON
• RTT 7.5 ms.
• Delay Bandwidth Product 93.8 MB for a 100 Gbit/s flow.
• One TCP flow rises smoothly to the 36 Gbit/s plateau
at a window of ~35 MBytes (includes slow start)
• Rate after slow start 37.1 Gbit/s
• Plateau from 5 s onwards
• No TCP re-transmitted segments
• Achievable throughput limited by the CPU, not the DBP
• Active core 100% in kernel mode for TCP buffers ≥ 40 MB
• Lab tests achieved ~60 Gbit/s
• Firewalls OFF improves throughput by ~4 Gbit/s
100 Gigabit TCP Performance GÉANT London to Paris
[Plot: BW (Gbit/s) vs time in flow (s) for TCP buffers 3.0M, 20M, 40M, 80M, 100M — DTNLon-Par_100G_TCPbuf_03Jul18]
[Plot: BW (Gbit/s) vs TCP buffer size (MByte) — DTNLon-Par_100G_TCPbuf_03Jul18]
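A quick worked check of the figures quoted above: the delay-bandwidth product for 100 Gbit/s at 7.5 ms RTT, and the ceiling a ~35 MByte window would impose.

```python
#!/usr/bin/env python3
# Worked check of the delay-bandwidth product and the window-limited rate.
RTT = 7.5e-3                          # seconds, London-Paris
DBP = 100e9 * RTT / 8                 # bytes in flight to sustain 100 Gbit/s
print(f"DBP  = {DBP / 1e6:.1f} MB")   # -> 93.8 MB, as on the slide

WINDOW = 35 * 2**20                   # ~35 MByte window, where the plateau begins
# Window-limited ceiling ~39 Gbit/s; the observed 36-37 Gbit/s plateau sits
# below it, consistent with the flow being CPU-limited rather than DBP-limited.
print(f"rate = {WINDOW * 8 / RTT / 1e9:.1f} Gbit/s ceiling for that window")
```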
• Firewalls ON
• Each flow on a different core
• 2 flows reach 70 Gbit/s;
3 flows a stable 75 Gbit/s
• ≥ 3 flows: 0.02 to 0.04 %
TCP re-transmissions
• CPU usage is important:
some cores ~80% in kernel mode
TCP Performance Multiple Flows London – Paris with iperf
[Plots: BW (Gbit/s) vs TCP buffer size (MByte) for 1, 2, 4 and 6 parallel flows — DTNLon-Par_100G_a6-11_P1/P2/P4/P6_TCPbuf_13Mar18]
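A sketch of how such a multi-flow test can be driven: one iperf3 client per flow, each pinned to its own core and aimed at its own server port. The host, ports, cores and duration are placeholders, and a separate iperf3 server per port is assumed on the far end.

```python
#!/usr/bin/env python3
# Sketch: start several iperf3 clients, one per flow, each pinned to its own
# core ("each flow on a different core") and pointed at its own server port.
import subprocess

HOST, DURATION = "192.0.2.10", 60               # placeholder target and time
flows = [(6, 5201), (7, 5202), (8, 5203)]       # (cpu core, server port)

procs = [subprocess.Popen(
             ["iperf3", "-c", HOST, "-p", str(port), "-A", str(core),
              "-t", str(DURATION)])
         for core, port in flows]
for p in procs:
    p.wait()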
• RTT 7.5 ms
• TCP buffer size 40 MBytes
• TCP throughput over 30 hours
• 32.5 Gbit/s
• No TCP segment re-transmissions
• Very stable
TCP Performance London – Paris 32 Gbit/s Single Flow Over GÉANT
[Plot: BW (Gbit/s) vs time during transfer (s) over 30 hours — DTNLon-Par_TCP-tseries_15Mar18]
AENEAS DTNs & Network Topology Jodrell Bank to GÉANT
• UDP throughput 99 Gbit/s for >7813 byte packets
• 10-20% packet loss for 4k and 6k byte packets
at spacing < 0.5 µs
• Packet jitter very good:
FWHM 2 µs, no side lobes
100 Gigabit UDP udpmon with libvma GÉANT dtnLon to JBO
Achievable UDP throughput vs packet size
Packet loss as a function of packet spacing
Inter-packet arrival times FWHM 2 µs
[Plot: Received wire rate (Gbit/s) vs spacing between frames (µs) for 4000, 6000, 7813 and 8972 byte packets — DTNLon-Remus_VMA_a40_02Jul18]
[Plot: % packet loss vs spacing between frames (µs) for the same packet sizes — DTNLon-Remus_VMA_a40_02Jul18]
[Histogram: inter-packet arrival times, N(t) vs latency (µs), 1472 byte packets, w=80 — DTNLon-Remus_VMA_a40_02Jul18]
• TCP offload on, TCP cubic stack
• Firewalls ON
• RTT 7.5 ms.
• Delay Bandwidth Product 93.8 MB for a 100 Gbit/s flow.
• One TCP flow rises smoothly to the 33 Gbit/s plateau
at a window of ~30 MBytes (includes slow start)
• Rate after slow start 32.8 Gbit/s
• Plateau from 5 s onwards
• No TCP re-transmitted segments
• Achievable throughput limited by the CPU, not the DBP
• Active core 100% in kernel mode for TCP buffers ≥ 30 MB
• Firewalls OFF improves throughput by ~4 Gbit/s
100 Gigabit TCP Performance JBO to GÉANT dtnLon
[Plot: BW (Gbit/s) vs TCP buffer size (MByte) — remus-DTNLon_A9_TCPbuf_02May18]
[Plot: BW (Gbit/s) vs time in flow (s) for TCP buffers 1M, 8M, 30M, 60M, 100M — remus-DTNLon_A9_TCPbuf_02May18]
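For completeness, a sketch of how a sender requests the large TCP buffers swept in these plots. The kernel clamps the request to net.core.wmem_max, so that sysctl must be raised on the DTN first; the address and buffer value are illustrative.

```python
#!/usr/bin/env python3
# Sketch: request a large TCP send buffer (the "TCP buffer size" swept above)
# and check what the kernel actually granted. On Linux the value returned by
# getsockopt is typically double the requested amount (bookkeeping overhead).
import socket

BUF = 40 * 1024 * 1024                      # e.g. 40 MBytes
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, BUF)
s.connect(("192.0.2.10", 5201))             # placeholder receiver
print("granted SO_SNDBUF:",
      s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
```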
GridFTP tests between London and Paris
Disk-to-disk over a 100 Gb/s link
[Plot: Transfer rate (Gbit/s) vs file size (GBytes) — DTN1Lon to DTN2Par, GridFTP, 1 stream]
[Plot: Transfer time (s) vs file size (GBytes), linear fit y = 4E-10x + 0.3759 — DTN1Lon to DTN2Par, GridFTP, 1 stream]
• Same files as in the FDT tests
• Transfer time increases linearly with
file size from 5 to 50 GBytes
• Small overhead time
• Transfer rate 19 Gbit/s
• Consistent with the disk-to-memory
rate for 1 NVMe disk
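A quick check of the linear fit shown in the transfer-time plot: its slope corresponds to the sustained disk-to-disk rate and its intercept to the per-transfer overhead.

```python
#!/usr/bin/env python3
# Worked check of the fit above: transfer time ≈ 4e-10 s/byte * size + 0.376 s.
slope, intercept = 4e-10, 0.3759          # seconds per byte, seconds
rate = 8 / slope / 1e9                    # Gbit/s implied by the slope
print(f"slope => ~{rate:.0f} Gbit/s sustained, {intercept:.2f} s overhead")
# ~20 Gbit/s from the fit, consistent with the ~19 Gbit/s measured rate.
size = 50e9                               # 50 GByte file
print(f"50 GB file: {slope * size + intercept:.1f} s predicted")
```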
Reaching the TCP Limit on Long-Haul
The Network Path between GÉANT (Lon, Par) – AARNet (Canberra, MRO)
Thanks to Karl Meyer
• Route GÉANT, ANA300, Internet2, & AARNet:
Paris-New York-Seattle-Los Angeles-Sydney-Canberra
• TCP offload on, TCP cubic stack
• RTT 303 ms.
• Delay Bandwidth Product 3.78 GB for 100 Gigabit
• One TCP flow rises smoothly to 26.1 Gbit/s
at a window of 1023 MBytes, including slow start
• No TCP re-transmitted segments
• Rate after slow start 28.3 Gbit/s
• Plateau after ~15 s
• This reaches the limit of the TCP protocol:
the maximum TCP window is 1 GByte
• Rate for RTT 303 ms and TCP window 1023 MB:
28.32 Gbit/s
• CPU core only 75-80% in kernel mode
100 Gigabit between GÉANT Paris and AARNet Canberra
[Plot: BW (Gbit/s) vs TCP buffer size (MByte) — DTNPar-AARNetCan_TCPbuf_30May18]
[Plot: BW (Gbit/s) vs time in flow (s) for TCP buffers 50M, 500M, 750M, 900M, 1023M — DTNPar-AARNetCan_TCPbuf_30May18]
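A worked check of the window-limited ceiling quoted above: with the TCP window capped near 1 GByte, throughput cannot exceed window × 8 / RTT.

```python
#!/usr/bin/env python3
# Worked check of the window-limited rate on the Paris-Canberra path.
WINDOW = 1023 * 2**20             # bytes, the maximum usable TCP window
RTT = 0.303                       # seconds
print(f"{WINDOW * 8 / RTT / 1e9:.1f} Gbit/s")   # -> 28.3 Gbit/s, as measured
```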
• To go beyond the 16-bit window field there is the Window Scale factor,
negotiated at the SYN exchange: RFC 7323 (obsoletes RFC 1323)
• Max value 14 → max window 2^16 × 2^14 bytes → 1024 MB
• Window size < sequence-number space
• Deals with sequence-number wrapping
• Allows telling whether a segment is old or new
The TCP Protocol Limit
2^32 → 4096 MB
TCP Header
2^16 → 64 kB
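A short worked check of the numbers on this slide: the 16-bit window field scaled by the maximum window-scale factor of 14 gives the ~1024 MB cap, against a 2^32 byte sequence-number space.

```python
#!/usr/bin/env python3
# Worked check of the TCP protocol limit (RFC 7323 window scaling).
max_window = (2**16 - 1) << 14            # window field shifted by scale 14
seq_space  = 2**32                        # sequence-number space in bytes
print(f"max scaled window : {max_window / 2**20:.0f} MiB")   # ~1024 MiB
print(f"sequence space    : {seq_space / 2**20:.0f} MiB")    # 4096 MiB
```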
[Plot: BW (Gbit/s) vs TCP buffer size (MByte) — GÉANT-MRO TCP throughput]
• Route GÉANT, ANA300, Internet2, & AARNet:
Paris-New York-Seattle-Los Angeles-Sydney-Perth-MRO
• TCP offload on, TCP cubic stack
• Fedora 26 kernel 4.11.0-0.rc3.git0.2.fc26.x86_64
• RTT 279 ms.
• Delay Bandwidth Product 3.78 GB for 100 Gigabit
• One TCP flow rises smoothly to 28.7 Gbit/s
at a window of 1023 MBytes, including slow start
• No TCP re-transmitted segments
• Rate after slow start 30.6 Gbit/s
• This reaches the limit of the TCP protocol:
the maximum TCP window is 1 GByte
• Rate for RTT 279 ms and TCP window 1023 MB:
30.7 Gbit/s
100 Gigabit between GÉANT Paris and AARNet MRO
[Plot: BW (Gbit/s) vs time in flow (s) for TCP buffers 50M, 500M, 750M, 900M, 1023M — DTNPar-AARNetMRO_TCPbuf_01Jul18]
• Route GÉANT, ANA300, Internet2, & AARNet:
Paris-New York-Seattle-Los Angeles-Sydney-Canberra
• RTT 303 ms.
• TCP window 1023 MB.
• Two 4-minute TCP flows
• Second flow started 30 s after the first
• Each flow stable at 28.3 Gbit/s
• Total transfer rate 56.6 Gbit/s
• 1.55 TBytes of data sent in 4.5 minutes.
• No TCP segments re-transmitted.
100 Gigabit: Multiple flows between GÉANT and AARNet
[Plot: BW (Gbit/s) vs time during transfer (s), two flows — exp1-AARNet_TCP_teries_04Feb17]
Questions
Any Questions?
© GEANT Limited on behalf of the GN4 Phase 2 project (GN4-2).
The research leading to these results has received funding from the European Union’s Horizon 2020 research and innovation programme under Grant Agreement No. 731122 (GN4-2).
Richard.Hughes-Jones@geant.org
Advanced European Network of E-infrastructures
for Astronomy with the SKA AENEAS - 731016
[Photo: DTN server showing the NIC and NVMe drives]
• A lot of help from:
• Boston Labs (London UK)
• Mellanox (UK & Israel)
• MotherBoard: Supermicro X10DRT-i+
• Intel C612 chipset
• CPU: Two 6-core 3.40 GHz
Xeon E5-2643 v3 processors
• RAM: 128 GB DDR4 2133MHz ECC
• NIC: Mellanox ConnectX-4 100 GE
• NIC: Mellanox ConnectX-5 100 GE
• 16 lane PCI-e
• As many interrupts as cores
• Driver MLNX_OFED-3.2-2.0.0.0
• Storage: 6 x Intel DC P3700 400GB NVMe SSDs
• Set 8 lane PCI-e
• Fedora 23
• 4.4.6-300.fc23.x86_64 kernel
DTN Hardware Description