Choosing A Proxy Server - Apachecon 2014
  • A reverse proxy, also known as a web accelerator, does not require the browser to cooperate in any special way. As far as the user (browser) is concerned, it is talking to any other HTTP web server on the internet. The reverse proxy server, on the other hand, must be explicitly configured for what traffic it should handle and how such requests are routed to the backend servers (the origin servers). Just as with a forward proxy, many reverse proxies are configured to cache content locally. A reverse proxy can also help with load balancing and redundancy for the origin servers, and help solve difficult problems like Ajax routing.
  • Before we go into details of what drives Traffic Server, and how we use it, let me briefly discuss the three most common proxy server configurations. In a forward proxy, the web browser has to be configured, manually or via proxy auto-config (PAC) files, to use a proxy server for all (or some) requests. The browser typically sends the “full” URL as part of the GET request. The forward proxy typically does not need to be configured with “allowed” destination addresses, but it can be configured with access control lists or blacklists controlling what requests are allowed, and by whom. A forward proxy is typically allowed to cache content, and a common use case is inside corporate firewalls.
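The “full URL” point above can be sketched in a few lines. This is a minimal illustration, not code from any of the proxies discussed; the helper name `request_line` and the example URL are assumptions made for the sketch. A client configured for a forward proxy puts the absolute URL in the request line, while a client talking directly to a server (or to a reverse proxy) sends only the path and query.

```python
from urllib.parse import urlsplit


def request_line(url: str, via_forward_proxy: bool) -> str:
    """Build the HTTP/1.1 request line for a GET (hypothetical helper)."""
    parts = urlsplit(url)
    if via_forward_proxy:
        # Absolute form: the proxy needs the full URL to find the destination.
        target = url
    else:
        # Origin form: only path and query; the host travels in the Host header.
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
    return f"GET {target} HTTP/1.1"


example = "http://www.example.com/index.html?x=1"  # made-up URL
print(request_line(example, via_forward_proxy=True))
print(request_line(example, via_forward_proxy=False))
```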
  • An intercepting proxy, also commonly called a transparent proxy, is very similar to a forward proxy, except that the client (browser) does not require any special configuration. As far as the user is concerned, the proxying happens completely transparently. A transparent proxy will intercept the HTTP requests, modify them accordingly, and typically “forge” the source IP before forwarding the request to the final destination. Transparent proxies usually also implement traffic filters and monitoring, allowing for strict control of what HTTP traffic passes through the mandatory proxy layer. Typical use cases include ISPs and very strictly controlled corporate firewalls. I’m very excited to announce that as of a few days ago, code for transparent proxy is available in the subversion tree.
  • Squid – SPDY not on roadmap - http://wiki.squid-cache.org/Squid-3.5 or in the bugs for 3.5 – no progress http://wiki.squid-cache.org/Features/HTTP2
    ESI – Edge Side Includes - http://en.wikipedia.org/wiki/Edge_Side_Includes
    ICP – Internet Cache Protocol - http://www.ietf.org/rfc/rfc2186.txt
    httpd – mod_spdy uses Chromium's SpdyFramer class to encode and decode SPDY frames.
  • https://istlsfastyet.com/ - Ilya Grigorik
  • NGiNX – doesn’t handle Accept-Encoding or Vary at all
  • Multithreading allows a process to split itself and run multiple tasks in “parallel”. Running threads has significantly less overhead than running individual processes, but threads are still not free: they need memory and incur context switches. It is a well-known approach to the concurrency problem, and many server implementations rely heavily on threads. Modern OSes have good thread support, and standard libraries are widely available.
  • Events are scheduled by the event loop, and event handlers execute specific code for specific events. This makes the server easier to code: handlers on a single event loop run one at a time, so there is no risk of deadlocks or race conditions between them. It can handle a good number of connections (but not unlimited). Squid is a good example of an event-driven server.
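The event model described in this note can be sketched as a toy loop. This is a minimal illustration under stated assumptions, not the design of Squid or any other server named here; the class `EventLoop`, its methods, and the event names are all hypothetical. Handlers are called one at a time from a single queue, and a handler may schedule further events.

```python
from collections import deque


class EventLoop:
    """Toy single-threaded event loop (illustrative only)."""

    def __init__(self):
        self.queue = deque()   # pending (event_name, payload) pairs
        self.handlers = {}     # event_name -> handler function

    def on(self, name, handler):
        self.handlers[name] = handler

    def emit(self, name, payload=None):
        self.queue.append((name, payload))

    def run(self):
        # Handlers run one at a time, so they never race with each other.
        while self.queue:
            name, payload = self.queue.popleft()
            self.handlers[name](self, payload)


log = []


def on_accept(loop, conn):
    log.append(f"accepted {conn}")
    loop.emit("read", conn)    # a handler can generate new events


def on_read(loop, conn):
    log.append(f"read request on {conn}")


loop = EventLoop()
loop.on("accept", on_accept)
loop.on("read", on_read)
loop.emit("accept", "conn-1")
loop.emit("accept", "conn-2")
loop.run()
print(log)
```

Because both connections are accepted before either is read, the log interleaves work for the two connections without any threads or locks.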
  • Squid – 72 or 104 bytes of metadata in memory for every object in your cache. http://wiki.squid-cache.org/SquidFaq/SquidMemory#Why_does_Squid_use_so_much_memory.21.3F
    ATS – 10 bytes
  • Squid – ufs (filesystem) – rock store (database style)
    Varnish – since it is an mmap cache and the index is part of the mmap, it has an in-memory index
    ATS – uses a “cyclone cache” similar to a log-based file system; it merges writes, so there is less seeking
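To make the per-object figures concrete, here is a rough back-of-the-envelope calculation. The bytes-per-object values are the ones quoted in the note; the object count of 100 million is an assumption chosen only for illustration, and the function name `index_memory_bytes` is hypothetical.

```python
# Bytes of in-memory index per cached object, as quoted in the note above.
BYTES_PER_OBJECT = {"Squid (72 B)": 72, "Squid (104 B)": 104, "ATS (10 B)": 10}

NUM_OBJECTS = 100_000_000  # assumed number of cached objects, for illustration


def index_memory_bytes(num_objects: int, bytes_per_object: int) -> int:
    """Memory needed just for the cache index, ignoring all other overhead."""
    return num_objects * bytes_per_object


for name, per_object in BYTES_PER_OBJECT.items():
    gib = index_memory_bytes(NUM_OBJECTS, per_object) / 2**30
    print(f"{name}: {gib:.1f} GiB of index for {NUM_OBJECTS:,} objects")
```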
  • ATS – should auto config accept threads
  • NGiNX – uses the least CPU, but has really bad latencies
    ATS – uses a lot less CPU than Squid, Varnish, and httpd
  • VCL - https://www.varnish-cache.org/trac/wiki/VCLExamples
  • httpd - mod_spdy uses Chromium's SpdyFramer class to encode and decode SPDY frames.

Presentation Transcript

  • Choosing A Proxy Server ApacheCon 2014 Bryan Call ATS Committer / Yahoo
  • About Me • Yahoo! Employee – WebRing, GeoCities, Personals, Tiger Team, Platform Architect, Edge Team, Research, ATS and HTTP (HTTP/2 and TLS at IETF) • Working on Traffic Server for 7 years – Since 2007 • Part of the team that open sourced it in 2009 • ATS Committer
  • Overview • Types of Proxies • Features • Architecture • Cache Architecture • Performance • Pros and Cons
  • How are you going to use a proxy server?
  • Reverse Proxy
  • Reverse Proxy • Proxy in front of your own web servers • Caching? • Geographic location? • Connection handling? • SSL termination? • SPDY support? • Adding business logic?
  • Forward Proxy
  • Intercepting Proxy
  • Forward / Intercepting Proxy • Proxy in front of the Internet • Configure clients to use proxy? • Caching? • SSL - CONNECT? • SSL - termination?
  • Choices
  • Plenty of Proxy Servers PerlBal
  • Plenty of Proxy Servers
  • Features And Options
  • Features
    Feature         ATS   NGiNX   Squid     Varnish   Apache httpd mod_proxy
    Reverse Proxy   Y     Y       Y         Y         Y
    Forward Proxy   Y     N       Y         N         Y
    Transp. Proxy   Y     N       Y         N         Y
    Plugin APIs     Y     Y       partial   Y         Y
    Cache           Y     Y       Y         Y         Y
    ESI             Y     N       Y         partial   N
    ICP             Y     N       Y         N         N
    SSL             Y     Y       Y         N         Y
    SPDY            Y*    Y       N         N         partial
    * 5.0.0 (May 2014)
  • SSL Features Source: https://istlsfastyet.com/ - Ilya Grigorik
  • What type of proxy do you need? • Of our candidates, only three fully support all proxy modes
  • HTTP/1.1 Compliance
  • HTTP/1.1 Compliance • Accept-Encoding - gzip • Vary • Age • If-None-Match
  • How things can go wrong: Vary
    $ curl -D - -o /dev/null -s --compressed http://10.118.73.168/
    HTTP/1.1 200 OK
    Server: nginx/1.3.9
    Date: Wed, 12 Dec 2012 18:00:48 GMT
    Content-Type: text/html; charset=utf-8
    Content-Length: 8051
    Connection: keep-alive
    Cache-Control: public, max-age=900
    Last-Modified: Wed, 12 Dec 2012 17:52:42 +0000
    Expires: Sun, 19 Nov 1978 05:00:00 GMT
    Vary: Cookie,Accept-Encoding
    Content-Encoding: gzip
  • How things can go wrong: Vary
    $ curl -D - -o /dev/null -s http://10.118.73.168/
    HTTP/1.1 200 OK
    Server: nginx/1.3.9
    Date: Wed, 12 Dec 2012 18:00:57 GMT
    Content-Type: text/html; charset=utf-8
    Content-Length: 8051
    Connection: keep-alive
    Cache-Control: public, max-age=900
    Last-Modified: Wed, 12 Dec 2012 17:52:42 +0000
    Expires: Sun, 19 Nov 1978 05:00:00 GMT
    Vary: Cookie,Accept-Encoding
    Content-Encoding: gzip
    EPIC FAIL! Note: this request did not ask for gzip, yet the response is gzip-encoded.
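One way a cache can avoid the failure shown in these two slides is to make the request headers named in `Vary` part of the cache key. The following is a minimal sketch under stated assumptions, not how any of the proxies compared here implement it; the helper `cache_key` and the in-memory `cache` dict are hypothetical, and a real cache would read the `Vary` value from the stored response rather than take it as an argument.

```python
def cache_key(url: str, request_headers: dict, vary: str) -> tuple:
    """Key a cached response on the URL plus every request header named in Vary."""
    lowered = {k.lower(): v for k, v in request_headers.items()}
    names = sorted(h.strip().lower() for h in vary.split(",") if h.strip())
    return (url,) + tuple((name, lowered.get(name, "")) for name in names)


cache = {}
vary = "Cookie,Accept-Encoding"  # as in the response headers on the slides

# First client asks for gzip; the gzip body is stored under its own key.
gzip_request = {"Accept-Encoding": "gzip"}
cache[cache_key("/", gzip_request, vary)] = "gzip-encoded body"

# Second client sends no Accept-Encoding, so its key differs and the lookup
# misses instead of returning the gzip body.
plain_request = {}
hit = cache.get(cache_key("/", plain_request, vary))
print(hit)
```

A miss here forces a fetch of the uncompressed variant from the origin, which is the behavior the second `curl` request on the slide should have seen.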
  • CoAdvisor HTTP protocol quality tests for reverse proxies [Chart: counts of Failures, Violations, and Success (axis 0 to 600) for ATS 3.3.1, Nginx 1.3.9, Squid 3.2.5, Varnish 3.0.3; percentage labels on the slide: 49%, 81%, 51%, 68%]
  • CoAdvisor HTTP protocol quality tests for reverse proxies [Chart: counts of Failures, Violations, and Success (axis 0 to 600) for ATS 3.3.1, Nginx 1.3.9, Squid 3.2.5, Varnish 3.0.3; percentage labels on the slide: 25%, 6%, 27%, 15%]
  • Architecture
  • Architecture And Process Models • Multithreading • Events • Process • Fibers – Co-operative multitasking, getcontext/setcontext
  • Threads [Diagram: Threads 1, 2, and 3 scheduled over time, first on a single CPU and then on a dual CPU]
  • Threads • Pros – Easy to share memory – Lightweight context switching • Cons – Easy to (accidentally) share memory • Overwriting another thread’s memory – Locking • Deadlocks, race conditions, starvation
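The trade-off on this slide can be illustrated with a small sketch: threads share memory easily, so shared state has to be protected with a lock. This is a generic example, not code from any of the servers discussed; the counter, the function name `handle_requests`, and the thread and iteration counts are all made up for illustration.

```python
import threading

counter = 0               # memory shared by every thread
lock = threading.Lock()   # protects the read-modify-write on counter


def handle_requests(n: int) -> None:
    """Pretend to serve n requests, counting each one in shared memory."""
    global counter
    for _ in range(n):
        # Without the lock, two threads could read the same old value and
        # one of the increments would be lost (a race condition).
        with lock:
            counter += 1


threads = [threading.Thread(target=handle_requests, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 40000
```

The lock makes the result deterministic, at the cost of serializing the increment, which is the kind of locking overhead the slide lists under cons.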
  • Event Processing [Diagram: an event loop takes scheduled events, network events, and disk I/O events from a queue and dispatches them to handlers (disk handler, HTTP state machine, accept handler); handlers can generate new events]
  • Problems with Event Processing • Doesn’t work well with blocking APIs – open(), locking • It doesn’t scale on SMP by itself
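A common workaround for the blocking-API problem is to hand blocking calls to a thread pool so the event loop itself never stalls. The sketch below uses Python's `asyncio` to show the idea; it is an illustration of the general technique, not a description of how ATS, NGiNX, or Squid handle it. The function `blocking_disk_read` is a hypothetical stand-in that simply sleeps.

```python
import asyncio
import time


def blocking_disk_read(name: str) -> str:
    """Stand-in for a blocking call such as open() or read() (hypothetical)."""
    time.sleep(0.05)
    return f"contents of {name}"


async def handle(name: str) -> str:
    loop = asyncio.get_running_loop()
    # Run the blocking call on the default thread pool so the event loop
    # stays free to service other connections in the meantime.
    return await loop.run_in_executor(None, blocking_disk_read, name)


async def main() -> list:
    # The three reads overlap instead of running back to back.
    return await asyncio.gather(handle("a"), handle("b"), handle("c"))


results = asyncio.run(main())
print(results)
```

Combining an event loop with a pool of worker threads is also one answer to the SMP point on the slide, since the threads can run on other cores.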
  • Process Model And Architecture
    Columns: ATS, NGiNX, Squid, Varnish, Apache httpd mod_proxy
    Threads: X X X
    Events: X X X partial X
    Processes: X X X
  • Caching Architecture
  • Cache • Mainly two types – File system – Database like • In memory index – Bytes per object • Minimize disk seeks and system calls
  • Cache
    Columns: ATS, NGiNX, Squid, Varnish, Apache httpd mod_cache
    File system: X X X
    mmap: X
    Raw disk/direct IO: X X
    Ram cache: X X
    Memory index: X X X*
    Persistent cache: X X X X
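The ideas of an in-memory index and of minimizing disk seeks can be illustrated with a toy ring-buffer cache: objects are written sequentially into a fixed region that wraps around, and a small dictionary records where each object lives. This is only a sketch of the general log-structured idea, not the actual cache format of ATS or any other server; the class `RingCache` and all of its fields are hypothetical, and a `bytearray` stands in for a raw disk.

```python
class RingCache:
    """Toy log-structured cache: sequential writes into a buffer that wraps."""

    def __init__(self, size: int):
        self.store = bytearray(size)  # stands in for a raw disk region
        self.size = size
        self.write_pos = 0
        self.index = {}               # key -> (offset, length), kept in memory

    def put(self, key: str, value: bytes) -> None:
        if len(value) > self.size:
            raise ValueError("object larger than cache")
        if self.write_pos + len(value) > self.size:
            self.write_pos = 0        # wrap around instead of searching for space
        start, end = self.write_pos, self.write_pos + len(value)
        # Drop index entries whose bytes are about to be overwritten.
        self.index = {k: (o, n) for k, (o, n) in self.index.items()
                      if o + n <= start or o >= end}
        self.store[start:end] = value
        self.index[key] = (start, len(value))
        self.write_pos = end

    def get(self, key: str):
        entry = self.index.get(key)
        if entry is None:
            return None               # miss decided from memory, no "disk" access
        offset, length = entry
        return bytes(self.store[offset:offset + length])


cache = RingCache(10)
cache.put("a", b"aaaa")   # occupies bytes 0..3
cache.put("b", b"bbbb")   # occupies bytes 4..7
cache.put("c", b"cccc")   # does not fit at 8, wraps to 0 and evicts "a"
print(cache.get("a"), cache.get("b"), cache.get("c"))
```

Writes always land at the current write position, so there is no search for free space, and a miss is answered from the index without touching the store.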
  • Performance Testing
  • ATS Configuration
    etc/trafficserver/remap.config:
      map / http://origin.example.com
    etc/trafficserver/records.config:
      CONFIG proxy.config.http.server_ports STRING 80
      CONFIG proxy.config.accept_threads INT 3
  • NGiNX Configuration
    worker_processes 24;
    access_log logs/access.log main;
    proxy_cache_path /mnt/nginx_cache levels=1:2 keys_zone=my-cache:8m max_size=16384m inactive=600m;
    proxy_temp_path /mnt/nginx_temp;
    server {
        # Normalize Accept-Encoding to either "gzip" or empty
        set $ae "";
        if ($http_accept_encoding ~* gzip) {
            set $ae "gzip";
        }
        location / {
            proxy_pass http://origin.example.com;
            proxy_cache my-cache;
            proxy_set_header If-None-Match "";
            proxy_set_header If-Modified-Since "";
            proxy_set_header Accept-Encoding $ae;
            # Include the normalized encoding in the cache key
            proxy_cache_key $uri$is_args$args$ae;
        }
        location ~ /purge_it(/.*) {
            # Purge key mirrors proxy_cache_key, in the my-cache zone
            proxy_cache_purge my-cache $1$is_args$args$ae;
        }
    }
  • Squid Configuration
    http_access allow all
    http_port 80 accel
    workers 24
    cache_mem 4096 MB
    memory_cache_shared on
    cache_dir rock /usr/local/squid/cache 1000 max-size=32768
    cache_peer origin.example.com parent 80 0 no-query originserver
  • Varnish Configuration
    backend default {
        .host = "origin.example.com";
        .port = "80";
    }
  • Varnish Configuration (Cont)
    sudo /usr/local/sbin/varnishd -f /usr/local/etc/varnish/default.vcl -p thread_pool_max=4000
    sudo /usr/local/sbin/varnishd -f /usr/local/etc/varnish/default.vcl -p thread_pool_max=2000 -p thread_pool_add_delay=2 -p thread_pool_min=200
    sudo /usr/local/sbin/varnishd -f /usr/local/etc/varnish/default.vcl -p thread_pool_max=2000 -p thread_pool_add_delay=2 -p thread_pool_min=1000 -p session_linger=0
    sudo /usr/local/sbin/varnishd -f /usr/local/etc/varnish/default.vcl -p thread_pool_max=2000 -p thread_pool_add_delay=2 -p thread_pool_min=1000 -p session_linger=10
  • Apache httpd Configuration
    LoadModule cache_module modules/mod_cache.so
    LoadModule cache_disk_module modules/mod_cache_disk.so
    LoadModule proxy_module modules/mod_proxy.so
    LoadModule proxy_http_module modules/mod_proxy_http.so
    Include conf/extra/httpd-mpm.conf
    ProxyPass / http://origin.example.com/
    <IfModule mod_cache_disk.c>
        CacheRoot /usr/local/apache2/cache
        CacheEnable disk /
        CacheDirLevels 5
        CacheDirLength 3
    </IfModule>
    MaxKeepAliveRequests 10000
  • Benchmark 1 • 1,000 clients • 8KB response • 100% cache hit • Keep-alive on • 100K rps rate limited
  • Benchmark 1 Results • Squid used the most CPU and had the worst median latency • 95th percentile latency with NGiNX, Squid and httpd [Charts for ATS, NGiNX, Squid, Varnish, httpd: RPS / CPU Usage; Requests Per Second; Latency (Median and 95th percentile)]
  • Benchmark 2 • 1,000 clients • 8KB response • 100% cache hit • Keep-alive off
  • Benchmark 2 Results • Squid used the most CPU again • NGiNX had latency issues • ATS had the most throughput [Charts for ATS, NGiNX, Squid, Varnish, httpd: RPS / CPU Usage; Requests Per Second; Latency (Median and 95th percentile)]
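Both benchmarks report latency as a median and a 95th percentile. As a reminder of why the two can differ so much, here is a small sketch using the nearest-rank method; the talk does not say which percentile method or tool was used, so that choice is an assumption, the helper `percentile` is hypothetical, and the sample values are made up rather than taken from the benchmark.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Made-up latencies in milliseconds; one slow outlier among fast responses.
latencies_ms = [1, 1, 1, 2, 2, 2, 2, 3, 3, 40]

print(statistics.median(latencies_ms))  # 2.0
print(percentile(latencies_ms, 95))     # 40
```

A single slow response barely moves the median but dominates the 95th percentile, which is why a server can look good on one measure and poor on the other.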
  • ATS • Pros – Scales well automatically, little config needed – Best cache implementation • Cons – Too many config files – Too many options in the default config files
  • NGiNX • Pros – Lots of plugins – FastCGI support • Cons – HTTP/1.1 compliance – Latency issues around accepting new connections – Rebuild server for new plugins
  • Squid • Pros – Best HTTP/1.1 compliance • Cons – Memory index for cache using 10x that of ATS – Least efficient with CPU – Worst median latency for keep-alive benchmarks
  • Varnish • Pros – VCL (Varnish Configuration Language) • Can do a lot without writing plugins • Cons – Thread per connection – mmap for cache • Persistence is experimental – No SSL or SPDY support
  • Apache httpd • Pros – Lots of plugins – Most used http server – Best 95th percentile latency for non-keep-alive • Cons – SPDY Support
  • Why ATS? • Scales well – CPU Usage, auto config • Cache scales well – Efficient memory index, minimizes seeks • Apache Community • Plugin support – Easy to port existing plugins over
  • References • ATS - http://trafficserver.apache.org/ • NGiNX - http://nginx.org/ • Squid - http://www.squid-cache.org/ • Varnish - https://www.varnish-cache.org/ • Apache httpd - http://httpd.apache.org/