(WEB401) Optimizing Your Web Server on AWS | AWS re:Invent 2014

Tuning your EC2 web server helps you improve application server throughput and cost-efficiency and reduce request latency. In this session we walk through tactics for identifying bottlenecks with tools such as CloudWatch in order to drive the appropriate allocation of EC2 and EBS resources. We also review performance optimizations and best practices for popular web servers such as Nginx and Apache that take advantage of the latest EC2 capabilities.

Published in: Technology

  1. 1. HTTP
  2. 2. •Optimize the web server stack
  3. 3. •Remember: optimizations by definition are app-specific
  4. 4. CloudWatch [chart: average request size over time, 10:00 to 10:15, with filters]
  5. 5. [charts: latency at percentile vs. average latency; latency histogram (frequency)]
  6. 6. [chart: response_processing_time, request_processing_time, backend_processing_time]
  7. 7. [charts: average latency by type (GET, POST); average latency (total)]
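
The latency slides contrast the average with latency at a given percentile. A minimal Python sketch of that comparison follows; the sample latencies are hypothetical, and the nearest-rank percentile method is an assumption, since the slides do not say how the percentiles were computed.

```python
import math
from statistics import mean


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical request latencies in milliseconds, with one slow outlier.
latencies_ms = [3, 3, 3, 4, 4, 4, 5, 5, 6, 210]

summary = {
    "mean": mean(latencies_ms),
    "p50": percentile(latencies_ms, 50),
    "p90": percentile(latencies_ms, 90),
    "p99": percentile(latencies_ms, 99),
}
print(summary)
```

In this made-up sample a single slow request pulls the mean well above the median, which is why looking at percentiles alongside the average can be more informative.
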
  8. 8. •Whatever makes most sense to you!
  9. 9. Justin Lintz
  10. 10. Who am I? •Senior Web Operations Engineer at Chartbeat •Previously worked at –Bitly –TheStreet.com –Corsis @lintzston justin@chartbeat.com
  11. 11. Chartbeat measures and monetizes attention on the web. Working with 80% of the top US news sites and global media sites in 50 countries, Chartbeat brings together editors and advertisers to identify in real time the active time an audience consumes articles, videos, paid content, and display advertising.
  12. 12. http://chartbeat.com/publishing/demo
  13. 13. •400–500 servers •Peak traffic: 275,000 requests/second •11–12 million concurrent users across all sites in our network
  14. 14. http://chartbeat.com/totaltotal
  15. 15. Traffic characteristics •Every 15 seconds •213-byte request + headers •43-byte response size
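
As a rough back-of-the-envelope check, the peak request rate and message sizes quoted in the slides can be turned into fleet-wide bandwidth. This is only a sketch: it ignores TCP/IP and TLS overhead and assumes every request at peak has the quoted sizes, neither of which the slides state.

```python
# Figures taken from the slides.
PEAK_REQUESTS_PER_SECOND = 275_000
REQUEST_BYTES = 213   # request + headers
RESPONSE_BYTES = 43   # response size


def megabits_per_second(requests_per_second, bytes_per_message):
    """Payload bandwidth in Mbit/s, ignoring protocol overhead (an assumption)."""
    return requests_per_second * bytes_per_message * 8 / 1_000_000


inbound_mbps = megabits_per_second(PEAK_REQUESTS_PER_SECOND, REQUEST_BYTES)
outbound_mbps = megabits_per_second(PEAK_REQUESTS_PER_SECOND, RESPONSE_BYTES)
print(f"inbound  ~{inbound_mbps:.1f} Mbit/s")
print(f"outbound ~{outbound_mbps:.1f} Mbit/s")
```

The payload is small relative to the request rate, which suggests that per-request overhead (connection handling, logging) matters more for this workload than raw bandwidth.
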
  16. 16. Logs
  17. 17. Logging is not “free” •Sequential writes are fast •Logs grow and then...
  18. 18. What do you do with them? •Rotate •Compress •Ship them elsewhere? All impact latency of your requests!
  19. 19. Gzip impact on request latency ●8 GB file ●Default GZIP compression settings ●EXT4 ●C3.xlarge on SSD ephemeral storage
  20. 20. Simple tweaks
  21. 21. Hourly rotate •Logrotate doesn’t support hourly rotation out of the box, so force it from cron: 0 * * * * /usr/sbin/logrotate -f /etc/logrotate.d/nginx > /dev/null 2>&1 •Goal: smaller latency spikes spread throughout the day
  22. 22. Avoid compression •But if you must, use –LZ4 –LZO –Snappy Order of magnitude faster than gzip or bzip2, fraction of the CPU
  23. 23. Extent-based file system EXT4 or XFS
  24. 24. SSD •GP2 Amazon EBS volumes •New generation Amazon EC2 instance types –C3 –M3 –R3 –I2
  25. 25. More involved tweaks
  26. 26. Stream logs via Syslog •Max 1 KB line length per RFC3164 •Only supported in Nginx 1.7.1+ •Apache supported via CustomLog piping to logger
  27. 27. Only log at load balancer •Only one side of picture •Can’t log custom headers or format logs •Logs are delayed
  28. 28. Pull node on rotate •Using prerotate/postrotate in logrotate –Pull node from ELB via API and place back on completion •Requires staggering nodes •Probably not worth the effort?
  29. 29. Sysctl tweaks
  30. 30. Listen queue backlog •net.core.somaxconn = 128 (should be larger) •Apache: ListenBackLog 511 •Nginx: listen backlog=511
  31. 31. man listen(2) If the backlog argument is greater than the value in /proc/sys/net/core/somaxconn, then it is silently truncated to that value; the default value in this file is 128. In kernels before 2.4.25, this limit was a hard-coded value, SOMAXCONN, with the value 128.
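
The truncation described in listen(2) can be sketched in a few lines of Python. The helper names are hypothetical; the 128 fallback is the default quoted in the man page excerpt and is used only when /proc is not readable (for example on a non-Linux machine).

```python
from pathlib import Path

SOMAXCONN_PATH = Path("/proc/sys/net/core/somaxconn")


def read_somaxconn(default=128):
    """Read the kernel limit; fall back to the documented default if /proc is unavailable."""
    try:
        return int(SOMAXCONN_PATH.read_text().strip())
    except (OSError, ValueError):
        return default


def effective_backlog(requested, somaxconn):
    """The backlog the kernel actually applies: the request, silently capped at somaxconn."""
    return min(requested, somaxconn)


# Both Apache and Nginx ask for 511 in the slides.
print(effective_backlog(511, read_somaxconn()))
```

With the limit at 128, a web server asking for 511 gets a queue of 128, so raising net.core.somaxconn is a prerequisite for the web server's own backlog setting to take effect.
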
  32. 32. Additional TCP backlog •net.core.netdev_max_backlog = 1000 –Per CPU backlog –Network frames •net.ipv4.tcp_max_syn_backlog = 128 –Half-open connections
  33. 33. Initial congestion window •TCP congestion window: initcwnd (initial) •Starting in kernel 2.6.39, set to 10 •Previous default was 3! http://research.google.com/pubs/pub36640.html •Older kernel? $ ip route change default via 192.168.1.1 dev eth0 proto static initcwnd 10
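
To see why the jump from 3 to 10 matters, here is a simplified slow-start model in Python. It is a sketch under stated assumptions, not a faithful TCP simulation: the congestion window doubles every round trip, there is no loss, the receiver window is never the limit, and the 1460-byte MSS is an assumed value typical of a 1500-byte MTU.

```python
def round_trips(response_bytes, initcwnd, mss=1460):
    """Round trips needed to deliver a response under idealized slow start."""
    segments = (response_bytes + mss - 1) // mss  # ceiling division
    cwnd, sent, trips = initcwnd, 0, 0
    while sent < segments:
        sent += cwnd   # send a full window this round trip
        cwnd *= 2      # window doubles after each acknowledged round trip
        trips += 1
    return trips


for size in (43, 14_600, 100_000):
    print(size, "bytes:", round_trips(size, 3), "RTTs at initcwnd 3,",
          round_trips(size, 10), "RTTs at initcwnd 10")
```

Under this model, responses that fit in a single segment gain nothing, while anything between 3 and 10 segments drops to a single round trip with the larger initial window.
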
  34. 34. net.ipv4.tcp_slow_start_after_idle •Set to 0 to ensure connections don’t fall back to the initial congestion window after being idle too long •Example: HTTP Keep-Alive
  35. 35. TIME_WAIT sockets
  36. 36. net.ipv4.tcp_max_tw_buckets •Max number of sockets in TIME_WAIT. We actually set this very high, because before we moved instances behind a load balancer it was normal to have 200K+ sockets in TIME_WAIT state. •Exceeding this leads to sockets being torn down until under limit
  37. 37. net.ipv4.tcp_fin_timeout •The time a connection should spend in FIN_WAIT_2 state. Default is 60 seconds; lowering this will free memory more quickly and transition the socket to TIME_WAIT. •This will NOT reduce the time a socket is in TIME_WAIT, which is set to 2 * MSL (maximum segment lifetime).
  38. 38. net.ipv4.tcp_fin_timeout continued... The TIME_WAIT length is hardcoded in the kernel at 60 seconds! https://github.com/torvalds/linux/blob/master/include/net/tcp.h#L115 #define TCP_TIMEWAIT_LEN (60*HZ) /* how long to wait to destroy TIME-WAIT * state, about 60 seconds */
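
Because each socket stays in TIME_WAIT for a fixed 60 seconds, the steady-state count can be estimated from the rate at which the server closes connections. The Python sketch below assumes the server is the side that closes first and that the close rate is constant; both are simplifications, and the function names are hypothetical.

```python
TIME_WAIT_SECONDS = 60  # TCP_TIMEWAIT_LEN, as quoted in the slides


def steady_state_time_wait(closes_per_second, time_wait_seconds=TIME_WAIT_SECONDS):
    """Approximate number of sockets sitting in TIME_WAIT at a constant close rate."""
    return closes_per_second * time_wait_seconds


def close_rate_for(time_wait_sockets, time_wait_seconds=TIME_WAIT_SECONDS):
    """Inverse: the close rate implied by an observed TIME_WAIT count."""
    return time_wait_sockets / time_wait_seconds


print(steady_state_time_wait(3_500))      # sockets in TIME_WAIT at 3,500 closes/s
print(round(close_rate_for(200_000)))     # closes/s implied by 200K TIME_WAIT sockets
```

By this estimate, the 200K+ TIME_WAIT sockets mentioned earlier correspond to a few thousand server-side closes per second on one host, which gives a rough sense of how high net.ipv4.tcp_max_tw_buckets would need to be.
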
  39. 39. “If it is on the Internet then it must be true, and you can’t question it” —Abraham Lincoln
  40. 40. net.ipv4.tcp_tw_recycle DANGEROUS •Clients behind NAT/stateful FW will get dropped •Should never be enabled 99.99999999% of the time* *Probably 100%, but there may be a valid case out there
  41. 41. net.ipv4.tcp_tw_reuse Makes a safer attempt at freeing sockets in TIME_WAIT state
  42. 42. Recycle vs. reuse deep dive http://bit.ly/tcp-time-wait
  43. 43. net.ipv4.tcp_rmem/wmem Format: min default max (in bytes) •The kernel will autotune the number of bytes to use for each socket based on these settings. It will start at default and work between the min and max
  44. 44. net.ipv4.tcp_mem Format: low pressure max (in pages!) •Below low, the kernel won’t put pressure on sockets to reduce memory usage. When pressure is hit, sockets reduce memory until low is hit. If max is hit, no new sockets.
  45. 45. Additional reading: https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt and man tcp(7)
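
Since net.ipv4.tcp_mem is expressed in pages while tcp_rmem/tcp_wmem are in bytes, it can help to convert before comparing them. A small Python sketch follows; the function name and the sample setting are hypothetical, and the page size is read from the running system unless one is passed in.

```python
import os


def tcp_mem_bytes(setting, page_size=None):
    """Convert a 'low pressure max' tcp_mem string from pages to bytes."""
    if page_size is None:
        page_size = os.sysconf("SC_PAGE_SIZE")  # commonly 4096, but not guaranteed
    low, pressure, maximum = (int(field) for field in setting.split())
    return {
        "low": low * page_size,
        "pressure": pressure * page_size,
        "max": maximum * page_size,
    }


# Hypothetical value; read the real one from /proc/sys/net/ipv4/tcp_mem.
for name, size in tcp_mem_bytes("185688 247584 371376").items():
    print(f"{name}: {size / 2**20:.0f} MiB")
```
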
  46. 46. Nginx/Apache
  47. 47. listen backlog Apache: ListenBackLog 511 Nginx: listen backlog=511 –limited by net.core.somaxconn
  48. 48. tcp_defer_accept Apache: AcceptFilter http data, AcceptFilter https data Nginx: listen [deferred] –Wait till we receive data packet before passing socket to server. Completing TCP handshake won’t trigger an accept()
  49. 49. sendfile Apache: EnableSendfile off Nginx: sendfile off –Saves context switching from userspace on read/write –“zero copy”; happens in kernel space
  50. 50. tcp_cork Apache: Enabled w/ sendfile Nginx: tcp_nopush off –aka TCP_CORK sockopt –allows application to control building of packet; e.g., pack a packet with full HTTP response –Only works with sendfile
  51. 51. tcp_nodelay (Nagle’s algo) Apache: On •No ability to turn off Nginx: tcp_nodelay on •Only affects keep-alive connections •Will add latency if turned off in favor of bandwidth
  52. 52. HTTP Keep-Alive Apache: KeepAlive On, KeepAliveTimeout 5, MaxKeepAliveRequests 100 Nginx: keepalive_timeout 75s, keepalive_requests 100 Note: If using ELB you must match the timeout to the ELB timeout setting
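
The ELB note can be turned into a simple configuration check. This Python sketch is an interpretation rather than the speaker's method: it treats a server keep-alive timeout equal to or longer than the load balancer's idle timeout as acceptable and flags anything shorter, on the assumption that the risk is the server closing an idle connection the ELB still considers open. The 60-second ELB idle timeout used in the example is an assumed value; check your own load balancer's setting.

```python
def keepalive_mismatch(server_timeout_s, elb_idle_timeout_s):
    """Return a warning string if the server may close idle connections before the ELB does, else None."""
    if server_timeout_s < elb_idle_timeout_s:
        return (
            f"server keep-alive timeout ({server_timeout_s}s) is shorter than "
            f"the ELB idle timeout ({elb_idle_timeout_s}s)"
        )
    return None


ELB_IDLE_TIMEOUT_S = 60  # assumed value for illustration

# Timeouts from the slide: Apache KeepAliveTimeout 5, Nginx keepalive_timeout 75s.
for name, timeout in (("Apache", 5), ("Nginx", 75)):
    print(name, "->", keepalive_mismatch(timeout, ELB_IDLE_TIMEOUT_S) or "ok")
```
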
  53. 53. HTTP Keep-Alive •Also enable on upstream proxies –Available since Nginx 1.1.4 proxy_http_version 1.1; proxy_set_header Connection ""; upstream foo { server 10.1.1.1; keepalive 1024; }
  54. 54. HTTP Keep-Alive
  55. 55. everythingyour quantifiablecontinuously
  56. 56. Please give us your feedback on this session. Complete session evaluations and earn re:Invent swag. http://bit.ly/awsevals
