Copyright (C) 2016 DeNA Co.,Ltd. All Rights Reserved.
Programming TCP
for responsiveness
DeNA Co., Ltd.
Kazuho Oku
Published in: Technology

  1. Programming TCP for responsiveness
     DeNA Co., Ltd.
     Kazuho Oku
  2. Typical sequence of HTTP/2
     [Sequence diagram between client and server: the client sends GET /; the server responds with HTTP/2 200 OK and <!DOCTYPE HTML> … <SCRIPT SRC="jquery.js"> …; 1 RTT later the client sends GET /jquery.js.]
     - the server needs to switch from sending HTML to sending JS at the very moment GET /jquery.js arrives
     - this means that the amount of data sent during the span marked * in the diagram (1 RTT) must be smaller than IW (the initial window)
  3. HOB (head-of-line blocking) caused by buffers
     - TCP send buffer:
       - to reduce ping-pong between kernel and application
     - BIO buffer:
       - for data that couldn't be stored in the TCP send buffer
     [Diagram: HTTP/2 frames, TLS records, the BIO buf. and the TCP send buffer, with markers for CWND, unacked and the poll threshold; regions are labelled "sent immediately" and "not immediately sent".]
  4. Reduce poll threshold
     - setsockopt(TCP_NOTSENT_LOWAT)
       - in Linux, the minimum is CWND + 1 octet
         - becomes unstable when set to CWND + 0
     [Diagram: same buffer layout as the previous slide, with the poll threshold moved down.]
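The slide only names the socket option, so here is a minimal sketch of how the call might look. The helper name `set_notsent_lowat` is hypothetical, and passing 1 to request the smallest threshold is an assumption based on the later note that 0 means "use default" for setsockopt. The fallback value 25 is the Linux constant, used only when the libc headers do not define it.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Older Linux libc headers may lack the constant; 25 is the Linux value. */
#if defined(__linux__) && !defined(TCP_NOTSENT_LOWAT)
#define TCP_NOTSENT_LOWAT 25
#endif

/* Hypothetical helper: ask the kernel to report the socket as writable only
 * while the amount of not-yet-sent data in the send buffer is below `bytes`.
 * Returns 0 on success, -1 on failure (errno is set by setsockopt). */
int set_notsent_lowat(int fd, int bytes)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_NOTSENT_LOWAT, &bytes, sizeof(bytes));
}
```

A server would call this once per accepted connection, for example `set_notsent_lowat(conn_fd, 1)`, before registering the socket with epoll.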
  5. How can we fill up the CWND?
     - idea: do smaller writes until epoll tells you it's full
     - the issues with the idea:
       - CPU intensive
       - data overflowing the CWND might get sent as a small packet (wasting packets during slow start!)
       - overhead of TLS header & HTTP frame becomes bigger
  6. Solution: read TCP states

         // calc size of data to send by calling getsockopt(TCP_INFO)
         if (poll_for_write(fd) == SOCKET_IS_READY) {
             capacity = CWND - unacked + TWO_MSS - TLS_overhead;
             SSL_write(prepare_http2_frames(capacity));
         }

     [Diagram: HTTP/2 frames, TLS records and the TCP send buffer, with markers for CWND, unacked and the poll threshold; regions are labelled "sent immediately" and "not immediately sent".]
  7. Negative impact of additional delay
     - increased delay between receiving an ACK and sending data, since:
       - traditional approach: completes within the kernel
       - this approach: the application needs to be notified to generate new data
     - outcome:
       - increase of CWND becomes slower
       - leads to slower peak speed?
         - depends on how CWND at peak is calculated
       - does the kernel use the TCP timestamp for the matter?
  8. Countermeasures
     - optimize for responsiveness only when necessary
       - i.e. when RTT is big and CWND is small
       - impact of optimization is RTT * unsent_bytes / CWND
     - disable optimization if additional delay is significant
       - when epoll returns immediately, estimated additional delay is equal to the time spent by the loop
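As a worked example of the formula RTT * unsent_bytes / CWND, the sketch below reads it as the time a newly prioritized response would wait behind data already queued in the kernel. That interpretation, the function name `queued_delay_us`, and the choice of microseconds and octets as units are assumptions, not something the slide states.

```c
#include <stdint.h>

/* Hypothetical helper: RTT * unsent_bytes / CWND, with RTT in microseconds
 * and both sizes in octets. Returns 0 when CWND is 0 to avoid dividing
 * by zero. */
uint64_t queued_delay_us(uint64_t rtt_us, uint64_t unsent_bytes,
                         uint64_t cwnd_bytes)
{
    if (cwnd_bytes == 0)
        return 0;
    return rtt_us * unsent_bytes / cwnd_bytes;
}
```

With an RTT of 250 ms and a CWND of 14,600 octets, 29,200 unsent octets give 500 ms (two round trips), while at an RTT of 25 ms and 7,300 unsent octets the figure is only 12.5 ms. This is consistent with the advice to optimize only when RTT is big and CWND is small.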
  9. Configuration Directives
     - http2-latopt-min-rtt
       - minimum TCP RTT to enable the optimization
       - default: UINT_MAX (disabled)
     - http2-latopt-max-cwnd
       - maximum CWND to enable (in octets)
       - default: 65535
     - http2-max-additional-delay
       - max. additional delay (as the ratio to TCP RTT)
       - latopt disabled if the delay is greater
       - default: 0.1
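One way to picture how the three directives combine is the gating check below. It is a sketch, not H2O's implementation: the struct, its field names and `latopt_enabled` are hypothetical, and it assumes RTT and additional delay are passed in the same time unit.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical container for the three directives. */
struct latopt_conf {
    unsigned int min_rtt;        /* http2-latopt-min-rtt */
    unsigned int max_cwnd;       /* http2-latopt-max-cwnd, in octets */
    double max_additional_delay; /* http2-max-additional-delay, ratio to RTT */
};

/* Defaults as listed on the slide; min_rtt == UINT_MAX disables latopt. */
static const struct latopt_conf LATOPT_DEFAULTS = { UINT_MAX, 65535, 0.1 };

/* Returns true when the latency optimization should be applied. */
bool latopt_enabled(const struct latopt_conf *conf, unsigned int rtt,
                    unsigned int cwnd_octets, unsigned int additional_delay)
{
    if (rtt < conf->min_rtt)
        return false; /* RTT too small to benefit */
    if (cwnd_octets > conf->max_cwnd)
        return false; /* CWND already large */
    return (double)additional_delay <= conf->max_additional_delay * (double)rtt;
}
```

With the defaults the check always fails for realistic RTTs, which matches the "disabled" default; lowering `min_rtt` turns the optimization on for long-RTT connections.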
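  10. Pseudo-code

          /* fd, ssl, min_rtt, max_cwnd and UNKNOWN are defined elsewhere */
          size_t get_suggested_write_size() {
              struct tcp_info tcp_info;
              socklen_t len = sizeof(tcp_info);
              size_t tls_overhead, packets_sendable;

              /* the last argument of getsockopt is a pointer to the length */
              if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &tcp_info, &len) != 0)
                  return UNKNOWN;
              /* optimize only when RTT is big and CWND is small */
              if (tcp_info.tcpi_rtt < min_rtt || tcp_info.tcpi_snd_cwnd > max_cwnd)
                  return UNKNOWN;
              switch (SSL_CIPHER_get_id(SSL_get_current_cipher(ssl))) {
              case TLS1_CK_RSA_WITH_AES_128_GCM_SHA256:
              case …:
                  /* record header (5) + explicit nonce (8) + GCM tag (16) */
                  tls_overhead = 5 + 8 + 16;
                  break;
              default:
                  return UNKNOWN;
              }
              /* on Linux, CWND and unacked are counted in packets */
              packets_sendable = tcp_info.tcpi_snd_cwnd > tcp_info.tcpi_unacked ?
                  tcp_info.tcpi_snd_cwnd - tcp_info.tcpi_unacked : 0;
              return (packets_sendable + 2) * (tcp_info.tcpi_snd_mss - tls_overhead);
          }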
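To make the final arithmetic of the pseudo-code easy to check by hand, the sketch below isolates it as a pure function. The name `suggested_write_size` and the guard for an MSS not larger than the TLS overhead are additions for illustration; the formula itself is the one on the slide.

```c
#include <stddef.h>
#include <stdint.h>

/* (packets_sendable + 2) * (MSS - TLS overhead), where packets_sendable is
 * CWND - unacked clamped at zero. CWND and unacked are in packets, MSS and
 * the TLS overhead in octets. */
size_t suggested_write_size(uint32_t cwnd_pkts, uint32_t unacked_pkts,
                            uint32_t mss, uint32_t tls_overhead)
{
    uint32_t packets_sendable;

    if (mss <= tls_overhead)
        return 0; /* added guard: no room for payload in a segment */
    packets_sendable = cwnd_pkts > unacked_pkts ? cwnd_pkts - unacked_pkts : 0;
    return ((size_t)packets_sendable + 2) * (mss - tls_overhead);
}
```

For example, with CWND = 10 packets, 4 packets unacked, MSS = 1460 and an overhead of 29 octets, the function returns (6 + 2) * 1431 = 11448 octets.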
  11. Benchmark (1)
      - conditions:
        - server in Ireland, client in Tokyo (RTT 250ms)
        - load tiny JS at the top of a large HTML
      - result: delay decreased from 511ms to 250ms
        - i.e. JS fetch latency was 2 RTT, became 1 RTT
          - similar results in other environments
  12. Benchmark (2)
      - using same data as previous
      - server: Sakura VPS (Ishikari DC)
      [Bar chart: time in milliseconds (axis 0 to 300) for downloading HTML (and JS within) at RTT ~25ms; categories HTML and JS; series master and latopt.]
  13. Conclusion
      - near-optimal result can be achieved
        - by adjusting poll threshold and reading TCP states
        - 1-packet overhead due to restriction in Linux kernel
      - 1-RTT improvement in H2O
        - estimated 1-RTT improvement per the depth of the load graph
  14. Under the hood
  15. TCP_NOTSENT_LOWAT
      - supported by Linux, OS X
      - on Linux:
        - sysctl:
          - set to -1: use kernel default
          - set to 0: sshd hangs
          - set to positive int: override kernel default
        - setsockopt:
          - set to 0: use default (sysctl or kernel)
          - set to int: override default
  16. Unit of CWND
      - Linux: # of packets
        - if INITCWND is 10, you can send at most 10 packets at once, regardless of their size
      - BSD (incl. OS X): octets
        - you can send CWND*MSS octets, regardless of the number of packets
          - if CWND=10 and MSS=1460, it is possible to send 14,600 packets containing 1-octet payload
  17. Determining amount of data that can be sent immediately

      OS       MSS           CWND            inflight       send buffer (inflight + unsent)
      Linux    tcpi_snd_mss  tcpi_snd_cwnd*  tcpi_unacked*  ioctl(SIOCOUTQ)
      OS X**   tcpi_maxseg   tcpi_snd_cwnd   -              tcpi_snd_sbbytes
      FreeBSD  tcpi_snd_mss  tcpi_snd_cwnd   -              ioctl(FIONWRITE)
      NetBSD   tcpi_snd_mss  tcpi_snd_cwnd*  -              ioctl(FIONWRITE)

      *: units of values marked are packets, unmarked are octets
      **: sometimes the values of tcpi_* are returned as zeros

      - calculate either of:
        - CWND - inflight
        - max(CWND - (inflight + unsent), 0)
      - units used in the calculation must be the same
        - NetBSD: fail (CWND is in packets, but the send buffer size is in octets)
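For the Linux row of the table, the "CWND - inflight" variant might be sketched as follows. The function names are hypothetical, and multiplying the packet count by the MSS to obtain octets is an assumption that every segment is full-sized. The non-Linux branch is only a stub, since the other systems expose different fields.

```c
#define _DEFAULT_SOURCE /* glibc: expose struct tcp_info */
#include <stddef.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* max(CWND - inflight, 0) in packets, converted to octets via the MSS. */
size_t cwnd_room_octets(unsigned int cwnd_pkts, unsigned int inflight_pkts,
                        unsigned int mss)
{
    if (cwnd_pkts <= inflight_pkts)
        return 0;
    return (size_t)(cwnd_pkts - inflight_pkts) * mss;
}

/* Reads the Linux TCP_INFO fields named in the table and stores the result
 * in *room. Returns 0 on success, -1 on error or on non-Linux systems. */
int linux_cwnd_room(int fd, size_t *room)
{
#ifdef __linux__
    struct tcp_info info;
    socklen_t len = sizeof(info);

    if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &info, &len) != 0)
        return -1;
    *room = cwnd_room_octets(info.tcpi_snd_cwnd, info.tcpi_unacked,
                             info.tcpi_snd_mss);
    return 0;
#else
    (void)fd;
    (void)room;
    return -1;
#endif
}
```

Keeping the arithmetic in its own function makes the unit handling explicit: both inputs to the subtraction are packet counts, and the conversion to octets happens once, after clamping.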
