Dumb Ways To Die: How Not To Write TCP-based Network Applications

Thanks to http://www.youtube.com/watch?v=IJNR2EpS0jw :-)

Published in: Education

  1. HOW [NOT] TO Write TCP-based Network Applications
     Artyom Gavrichenkov
  2. Based on a True Story
     • NOT AN AD!
     • Qrator: distributed network
       ● Custom TCP/IP at the bottom
       ● Custom management protocol at the top
       ● Interacting with plenty of Web servers and Web browsers on a daily basis
       ● 2 years of continuous debug^W Product Improvement™
  3. Issue #1
     • Message delivery is unreliable in TCP.
  4. Issue #1
     • Message delivery is unreliable in TCP: there's no estimate of when (or whether) the message will arrive
     • Timeouts!
     • Limit all resources, including time
     • No action is itself an action
  5. Timeouts
     • Between recvfrom()
     • Between requests
     • Request timeout
     • Lifetime of a session
     • Lifetime of %OBJECTNAME%
     • Long polling may be a bad idea
  6. Ex. 1
     • Slowloris (Apache): DoS
       ● (not distributed, just denial of service)
     • Slow HTTP POST
       ● Apache, IIS, Lighttpd: DoS
       ● Nginx: DDoS with a botnet
  7. Ex. 2
     • 12 rpm AJAX page update
       ● Backup script switched the server off
  8. Content-Length
     – Limit resources for all actions
     – Custom protocol should define limits on the input length
  9. errno(3)
     – The connection may be closed for no good reason
     – Check errno after recvfrom(), sendto(), etc.
       ● ENOMEM
       ● ECONNRESET
       ● EANYTHING
  10. Ex. 3
      ● Internet Explorer: ECONNRESET means successful connection termination
        – Download status is being ignored
        – Content-Length is being ignored
  11. Memory limits
      – Resource limits:
        ● Maximum – ENOMEM
        ● Minimum – idle wait → ECONNRESET
  12. Ex. 4
      – DNS TTL
        ● Too big: days of downtime (continuous)
        ● Too small: days of downtime (total)
  13. Latency
      – 3-Way Handshake takes time
      – Do implement persistent connections!
        ● Do it from the very beginning
  14. They haven't listened to me!
      ● TCP – T/TCP
      ● HTTP/1.0 – HTTP/1.1
  15. Optimization
      – Measure!
      – Profile!
      – Emulate packet loss!
  16. Optimization
      – Text-based protocols are convenient to debug
        ● And you will debug
          – Maybe even in production
      – Making use of binary protocols is often a premature optimization
        ● BSON, Google Protocol Buffers
  17. Optimization
      ● TCP socket options:
        – TCP_NODELAY: disables Nagle's algorithm
          ● Speedup with small portions of data
        – TCP_CORK (Linux): multiple portions of data in a single TCP segment
        – "socket corking"
  18. Optimization
      ● TCP stack options:
        – Linux: /proc/sys/net/**
          ● net.ipv4.tcp_fin_timeout
          ● net.ipv4.tcp_{,r,w}mem
          ● net.core.{r,w}mem_max
        – Windows: HKLM\System\CurrentControlSet\Services\Tcpip\Parameters
  19. IPv6
      ● Accidental IPv6 deployment
  20. • SO_REUSEADDR
      • sendfile(2)
      • select(2)/poll(2)/epoll(7)
      • {n,h}to{n,h}{s,l}()
      • int64_t vs long
  21. This is it!
      Artyom Gavrichenkov <ximaera@highloadlab.com>
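
Slides 4, 5, 8 and 9 ask for the same discipline from different angles: bound the time between reads, bound the total time per request, bound the input length, and treat every socket error as a real outcome. A minimal Python sketch of a receive loop that does all four follows. The newline-terminated message format, the constant values, and the names `read_line` and `LimitExceeded` are assumptions made for illustration; none of them come from the slides.

```python
import socket
import time

MAX_MESSAGE_BYTES = 64 * 1024   # hypothetical cap on one message
IDLE_TIMEOUT_S = 5.0            # hypothetical max gap between two recv() calls
REQUEST_TIMEOUT_S = 30.0        # hypothetical budget for the whole message


class LimitExceeded(Exception):
    """The peer broke a limit, or the connection failed mid-message."""


def read_line(sock, max_bytes=MAX_MESSAGE_BYTES,
              idle_timeout=IDLE_TIMEOUT_S, request_timeout=REQUEST_TIMEOUT_S):
    """Read one newline-terminated message, enforcing size and time limits.

    Bytes received after the newline are dropped in this sketch.
    """
    deadline = time.monotonic() + request_timeout
    buf = bytearray()
    while True:
        newline_at = buf.find(b"\n")
        message_len = newline_at if newline_at != -1 else len(buf)
        if message_len > max_bytes:
            raise LimitExceeded("message too long")
        if newline_at != -1:
            return bytes(buf[:newline_at])

        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise LimitExceeded("request timeout")
        # Never wait longer than the idle limit or the remaining budget.
        sock.settimeout(min(idle_timeout, remaining))
        try:
            chunk = sock.recv(4096)
        except socket.timeout:
            raise LimitExceeded("timed out waiting for data")
        except OSError as exc:  # ECONNRESET and friends surface here
            raise LimitExceeded(f"recv failed: {exc}") from exc
        if not chunk:
            raise LimitExceeded("peer closed before the message was complete")
        buf += chunk
```

A caller would typically close the socket on `LimitExceeded`; doing nothing is itself a decision, as slide 4 puts it.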

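
Slide 20 names `{n,h}to{n,h}{s,l}()` and `int64_t vs long` without elaborating. One reading is that wire formats should fix both byte order and integer width explicitly, rather than inherit them from the platform. The sketch below shows that idea with Python's `struct` module; the header layout (a 2-byte message type followed by an 8-byte payload length) and the helper names are hypothetical.

```python
import struct

# Hypothetical frame header: 2-byte message type, 8-byte payload length.
# "!" selects network (big-endian) byte order with standard sizes and no
# padding, so "H" is always 2 bytes and "Q" is always 8 bytes, regardless
# of how wide the platform's native long happens to be.
HEADER = struct.Struct("!HQ")


def pack_header(msg_type, payload_len):
    """Encode the header in network byte order."""
    return HEADER.pack(msg_type, payload_len)


def unpack_header(data):
    """Decode (msg_type, payload_len) from the first HEADER.size bytes."""
    return HEADER.unpack(data[:HEADER.size])
```

In C the same effect comes from `htons()`/`htonl()` on fixed-width types such as `uint16_t` and `uint32_t`; the Python version is only meant to make the layout visible.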