Apache Traffic Server

Overview of Apache Traffic Server

  • In other words: it is finished when either you don’t need me or someone else is doing it. Also, turn laborious work into non-lousy work.
  • Not to say that those companies are using it (I don’t know if they are), but they employ committers to the project.
  • Try to use the latest and greatest in dev until your first release. Because it will front your site and likely handle all of your incoming traffic, it is a critical part of your infrastructure, and you want the latest features and bug fixes. Because it is so critical, my experience is that people are reluctant to upgrade once it is live, as it “just works”.
  • For storage, I say “if” you’re using caching. Understand that you really might not need to do this. In Y! US News we run it in proxy mode only and have had no traffic issues as of late. For the US Elections of 2008, we had a small “blip”, while other news and media organizations went down. Our “blip” was caused by keep-alives that were set too high unnecessarily. We adjusted this and were fine afterwards.
  • A client browser sends an HTTP request addressed to a host called www.host.com on port 80. Traffic Server receives the request because it is acting as the origin server (the origin server’s advertised hostname resolves to Traffic Server). Traffic Server locates a map rule in the remap.config file and remaps the request to the specified origin server (realhost.com). Traffic Server opens an HTTP connection to the origin server. If the request is a cache hit and the content is fresh, then Traffic Server sends the requested object to the client from the cache. Otherwise, Traffic Server obtains the requested object from the origin server, sends the object to the client, and saves a copy in its cache.
  • http://trafficserver.apache.org/docs/v2/admin/cache.htm
  • http://trafficserver.apache.org/docs/v2/admin/http.htm
  • This table is much too large to go through in detail, but it shows that there are a number of features to take into consideration when choosing an intermediary. This is not a complete list in any way; it is merely an example of the features you might want to consider when choosing a proxy.
  • http://en.wikipedia.org/wiki/Edge_Side_Includes http://www.w3.org/TR/esi-lang
  • http://www.w3.org/TR/esi-lang
  • http://trafficserver.apache.org/docs/v2/admin/ http://ostatic.com/blog/guest-post-yahoos-cloud-team-open-sources-traffic-server https://cwiki.apache.org/confluence/display/TS/RoadMap http://www.oscon.com/oscon2010/public/schedule/detail/13878

Apache Traffic Server Presentation Transcript

  • Increase uptime and performance with Apache Traffic Server Tom Melendez, Yahoo!
  • About Tom
    • Did Infrastructure development at Y! News for 3 years; have since moved on to do non-infra Hadoop stuff
    • General philosophy: My job isn’t done until I’m not doing it anymore
    • What I’ve learned: If Devs want to be happy, make Ops happy. Making Ops happy will ultimately make the business happy.
    • Primary motivation in life: I only do the fun stuff.
  • Agenda
    • What
    • Who
    • Where and When: The Internet. Now.
    • Why/How
  • What
    • Very efficient caching proxy.
    • Multi-threaded, event driven
    • I’ve seen it handle 30k rps per box (quad-core) without issue
    • I’ve seen lab data quoting 300k+ rps per box
    • Delivers 400 terabytes of data per day at Y!
    • Probably a configuration that can help your app.
  • Who (plus some history)
    • Open-Sourced by Yahoo! in November 2009
      • Yahoo! received it with the Inktomi acquisition
      • Yahoo! currently has a team of engs on it
      • We use it all over the place
    • Apache TLP
      • Core committers from Yahoo!, Google, Akamai and elsewhere
  • Where and When
    • The internet.
    • Now.
  • Why?
    • You want to build a CDN
    • You want a reverse proxy
    • You want a forward proxy
    • You want good response for your worldwide customers
    • If you’re like me, you don’t want to do a lot of work. 
  • How? - Requirements
    • Well, you need to be able to compile the source. (So, Linux, OSX or Solaris)
    • There are also instructions for installing on EC2 as well as AMIs available
  • How - installation
    • ./configure && make && make install
      • Depending on your time to market, consider working with the developer release; otherwise use stable
    • Start it up
      • sudo /usr/local/bin/trafficserver start
    • But you’ll probably want to configure it first. 
  • How - executables
    • Processes:
    • traffic_server – handles the transactions
    • traffic_manager – manages the traffic server
    • traffic_cop – health check for the above
    • Tools:
    • traffic_line – gets stats and (re)reads config
  • How – the config files
    • bypass.config - static bypass rules that Traffic Server uses in transparent proxy caching mode.
    • cache.config - how Traffic Server caches web objects.
    • congestion.config - enables you to configure Traffic Server to stop forwarding HTTP requests to origin servers when they become congested.
    • filter.config - enables you to deny or allow particular requests and strip header information from client requests.
    • hosting.config - assigns cache partitions to specific origin servers
    • icp.config - defines ICP peers (parent and sibling caches).
    • ip_allow.config - controls client access to the Traffic Server proxy cache.
  • How – the config files
    • logs.config - formats traditional custom transaction log files.
    • log_hosts.config - logs transactions for different origin servers in separate files
    • logs_xml.config - defines the custom log file formats, filters, and processing options.
    • parent.config - identifies the parent proxies
    • partition.config - enables you to manage your cache space by creating partitions
    • records.config - list of configurable variables used by the Traffic Server software.
  • How – the config files
    • remap.config - contains mapping rules that Traffic Server uses to perform various actions.
    • splitdns.config - enables you to specify the DNS server that Traffic Server should use for resolving hosts under specific conditions.
    • ssl_multicert.config - configure Traffic Server to use multiple SSL server certificates with the SSL termination option.
    • storage.config - lists all the files, directories, and/or hard disk partitions that make up the Traffic Server cache.
    • update.config - how Traffic Server performs a scheduled update of specific local cache content.
  • How - configuration
    • There are a lot of configuration files
    • I know what you must be thinking….
  • This is too complicated.
  • But remember…
    • I’m really lazy.
  • How - configuration
    • Never fear. You primarily deal with this one (routing rules): remap.config
    • You’ll tune this one when you’re first setting it up (system vars): records.config
    • If you’re using caching, you’ll set up this one: storage.config
  • How – config – Reverse Proxy
    • ATS acts as the origin server
    • IMO, this is the major use case
    • ATS can front one or more websites
  • How – config – Reverse Proxy
    • records.config:
    • CONFIG proxy.config.proxy_name STRING <hostname>
    • CONFIG proxy.config.http.server_port INT 80
    • CONFIG proxy.config.reverse_proxy.enabled INT 1
    • Depending on what you’re in front of:
    • CONFIG proxy.config.http.keep_alive_no_activity_timeout_in INT 60
    • CONFIG proxy.config.http.keep_alive_no_activity_timeout_out INT 1
    • CONFIG proxy.config.http.transaction_no_activity_timeout_in INT 15
    • CONFIG proxy.config.http.transaction_no_activity_timeout_out INT 30
    • CONFIG proxy.config.cache.ram_cache.size LLONG 1332735284
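
The records.config lines above all follow the shape `CONFIG <name> <TYPE> <value>`. As a minimal sketch for sanity-checking such a file before a reload, the Python below parses that shape into a dictionary. The function name `parse_records`, the treatment of `#` lines as comments, and the sample values are assumptions made for illustration; this is not how Traffic Server itself reads the file.

```python
from typing import Dict, Union

Value = Union[int, str]


def parse_records(text: str) -> Dict[str, Value]:
    """Parse 'CONFIG <name> <TYPE> <value>' lines into a dict (hypothetical helper)."""
    settings: Dict[str, Value] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and lines starting with '#' carry no settings.
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 3)
        if len(parts) != 4 or parts[0] != "CONFIG":
            continue
        _, name, kind, value = parts
        # INT and LLONG values are numeric; everything else is kept as a string.
        settings[name] = int(value) if kind in ("INT", "LLONG") else value
    return settings


SAMPLE = """\
# sample taken from the slide
CONFIG proxy.config.http.server_port INT 80
CONFIG proxy.config.reverse_proxy.enabled INT 1
CONFIG proxy.config.http.keep_alive_no_activity_timeout_in INT 60
CONFIG proxy.config.http.keep_alive_no_activity_timeout_out INT 1
CONFIG proxy.config.cache.ram_cache.size LLONG 1332735284
"""

settings = parse_records(SAMPLE)
print(settings["proxy.config.http.server_port"])
```

A check like this could run in a deploy script to catch, for example, a keep-alive timeout that is set higher than intended.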
  • How – config – Reverse Proxy
    • storage.config
    • /dev/raw_sdb 1078000000
    • NOTE: To use RAW disk you must use the '/usr/bin/raw' program to bind a raw device to an existing block device.
  • How – config – Reverse Proxy
    • remap.config
    • map http://www.example.com http://realsite.example.com
    • Now, reload traffic server:
    • sudo /usr/local/bin/traffic_line -x
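
To make the effect of a `map` rule concrete, here is a rough Python sketch of prefix-based remapping: a request URL that starts with the rule's source prefix has that prefix swapped for the target. The helpers `parse_map_rules` and `remap` are hypothetical names, and the matching is deliberately simplified; it only illustrates the idea and does not reproduce Traffic Server's actual matching logic.

```python
from typing import List, Optional, Tuple

Rule = Tuple[str, str]


def parse_map_rules(text: str) -> List[Rule]:
    """Collect (source, target) pairs from 'map <source> <target>' lines."""
    rules: List[Rule] = []
    for raw in text.splitlines():
        parts = raw.split()
        if len(parts) == 3 and parts[0] == "map":
            rules.append((parts[1].rstrip("/"), parts[2].rstrip("/")))
    return rules


def remap(url: str, rules: List[Rule]) -> Optional[str]:
    """Return the rewritten URL for the first matching rule, or None."""
    for source, target in rules:
        # Simplification: match the bare prefix or the prefix followed by a path.
        if url == source or url.startswith(source + "/"):
            return target + url[len(source):]
    return None


RULES = parse_map_rules("map http://www.example.com http://realsite.example.com\n")
print(remap("http://www.example.com/news/today.html", RULES))
```

With the rule from the slide, the client keeps asking for www.example.com while the request is actually served by realsite.example.com.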
  • How - quick notes about cache
    • You can purge objects from the cache with the PURGE method: curl -X PURGE -v http://example.com/remove_me.jpg
    • There is a UI to inspect and modify the cache. You can enable it with: CONFIG proxy.config.http_ui_enabled INT 1
    • (I came to learn very recently that the UI won’t be supported, so if you really want it, make a case for it)
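
If you purge from a script rather than by hand, the same PURGE request can be built with Python's standard library. This is a small sketch only: `build_purge_request` is a hypothetical helper, the URL is the placeholder from the slide, and the request is constructed but not sent.

```python
import urllib.request


def build_purge_request(url: str) -> urllib.request.Request:
    """Build an HTTP request that uses the PURGE method for the given URL."""
    return urllib.request.Request(url, method="PURGE")


req = build_purge_request("http://example.com/remove_me.jpg")
print(req.get_method(), req.full_url)

# To actually send it to a proxy you control (assumption: the proxy is
# configured to accept PURGE from your address):
# urllib.request.urlopen(req)
```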
  • How – config – General Proxy
    • Forward Proxy
    • Your FEs make requests to APIs (yours and partners/external)
    • You don’t need to keep a separate proxy technology around (e.g. Squid); ATS does this too.
  • How – config – General Proxy
    • You can do all the cool cache stuff
      • Configure how object freshness is computed
      • Push content into the cache
      • Pin contents in the cache for specific periods
      • Ignore no-cache headers
    • Translation of Squid directives to ATS: https://cwiki.apache.org/confluence/display/TS/SquidConfigTranslation
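
For readers new to cache freshness, the sketch below shows the basic idea in Python: an object counts as fresh while its age is below the `max-age` lifetime in its `Cache-Control` header, and a flag models the "ignore no-cache" option mentioned above. This is a simplified model of HTTP freshness; the function names and the `ignore_no_cache` parameter are illustrative assumptions, not ATS settings or its actual algorithm.

```python
import re
from typing import Optional


def max_age(cache_control: str) -> Optional[int]:
    """Extract the max-age lifetime (in seconds) from a Cache-Control value."""
    match = re.search(r"(?:^|,)\s*max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def is_fresh(cache_control: str, age_seconds: int, ignore_no_cache: bool = False) -> bool:
    """Return True if a cached object of the given age may be served as fresh."""
    directives = [d.strip().lower() for d in cache_control.split(",")]
    # no-cache forces revalidation unless the operator chose to ignore it.
    if "no-cache" in directives and not ignore_no_cache:
        return False
    lifetime = max_age(cache_control)
    return lifetime is not None and age_seconds < lifetime


print(is_fresh("max-age=300", 120))
print(is_fresh("no-cache, max-age=300", 120, ignore_no_cache=True))
```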
  • But, but, but….
    • “ Well, I don’t expect lots of traffic”
    • “ I already have VIPs/Load Balancers”
    • “ Do I really need a proxy server?”
    • “ Is this yet another thing I need Ops for?”
  • How – Usage
    • Even if you can handle your traffic:
      • change origin servers without changing DNS
      • redirect different URL paths, or rewrite URLs altogether
      • send traffic cross-colo during upgrades, for testing, etc.
      • use minimal server installations across the world, just put ATS there and have DNS resolution point there.
  • Extending How - Plugins
    • You can extend the functionality of ATS with plugins
    • There are plugins available (forthcoming) by Yahoo!
    • You can build your own plugins and there is lots of documentation available.
    • You are STRONGLY encouraged to contribute back!
  • Extending How - Plugins
  • Example functionality of plugins
    • Modify the response
    • Do a db lookup based on a header
    • append or remove query string params or headers before they are sent to the origin server
    • Respect custom headers
    • More advanced URL rewriting (more than regex)
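
ATS plugins are written against its C API, so the Python below is not plugin code. It is only a sketch of one transformation from the list above, removing and appending query string parameters before a request goes to the origin. The function `rewrite_query` and the parameter names `utm_source` and `via` are made up for illustration.

```python
from typing import Iterable, Tuple
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def rewrite_query(url: str,
                  remove: Iterable[str] = (),
                  append: Iterable[Tuple[str, str]] = ()) -> str:
    """Drop the named query parameters and append new ones to the URL."""
    blocked = set(remove)
    parts = urlsplit(url)
    # Keep the original parameter order, minus the blocked names.
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k not in blocked]
    kept.extend(append)
    return urlunsplit(parts._replace(query=urlencode(kept)))


rewritten = rewrite_query(
    "http://realsite.example.com/story?id=42&utm_source=mail",
    remove=["utm_source"],
    append=[("via", "ats")],
)
print(rewritten)
```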
  • Ok, nothing is that easy
    • Obviously, create the remap.config and rules
    • You may want to create your own plugins (probably not)
    • Code changes
      • If you plan on using the cache, your cache control headers should be configurable (you should be doing this anyway)
    • Process changes
      • Ops should own ATS; dev might need to set it up
      • QA needs to have ATS in their env; devs should have one too
    • Ops changes
      • If you don’t have monitoring, this would be a good time to add it.
  • Competitors
    • There are plenty of Open Source competitors
    • Not easy to make an apples-to-apples comparison
    • I don’t have any benchmarks
    • Look for the features you want and an active community.
  • Competitors: A stolen slide, probably not up-to-date!

                    ATS   HAproxy  nginx  Squid  Varnish  mod_proxy
    Worker Threads  Y     N        N      N      Y        Y
    Multi-Process   N     Y        Y      N      Y        Y
    Event-driven    Y     Y        Y      Y      N        N?
    Plugin APIs     Y     N        Y      part   Y        Y
    Forward Proxy   Y     N        N      Y      N        Y
    Reverse Proxy   Y     Y        Y      Y      Y        Y
    Transp. Proxy   Y     Y        N      Y      N        N
    Load Balancer   part  Y        Y      Y      Y        Y
    Cache           Y     N        Y      Y      Y        Y
    ESI             soon  N        N      Y      Y        N
    ICP             Y     N        N      Y      N        N
    Keep-Alive      Y     N        Y      Y      Y        Y
    SSL             Y     N        Y      Y      N        Y

  • What is ESI?
    • “ Edge Side Includes”
    • Think SSI on the “Edge”
    • Example in your markup (taken from Wikipedia)
      • <esi:include src="http://example.com/1.html" alt="http://bak.example.com/2.html" onerror="continue"/>
    • (Something to think about – ESI will change your development processes, testing, etc.)
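
To show what the edge does with an `<esi:include>` tag, here is a toy Python sketch: try `src`, fall back to `alt`, and if both fail with `onerror="continue"`, drop the tag silently. The `fetch` callback, the `FRAGMENTS` dictionary and the regex-based parsing are assumptions made for illustration; a real ESI processor is considerably more involved.

```python
import re
from typing import Callable, Optional

INCLUDE = re.compile(r"<esi:include\s+([^>]*?)/>")
ATTR = re.compile(r'(\w+)="([^"]*)"')


def expand_includes(markup: str, fetch: Callable[[str], Optional[str]]) -> str:
    """Replace each <esi:include/> with fetched content (src first, then alt)."""
    def replace(match: "re.Match[str]") -> str:
        attrs = dict(ATTR.findall(match.group(1)))
        for key in ("src", "alt"):
            url = attrs.get(key)
            if url:
                body = fetch(url)
                if body is not None:
                    return body
        # Both sources failed: onerror="continue" means drop the include quietly.
        if attrs.get("onerror") == "continue":
            return ""
        raise RuntimeError("ESI include failed: %s" % attrs.get("src"))

    return INCLUDE.sub(replace, markup)


# Hypothetical fragment store standing in for HTTP fetches; the primary is "down".
FRAGMENTS = {"http://bak.example.com/2.html": "<p>backup</p>"}
PAGE = ('<div><esi:include src="http://example.com/1.html" '
        'alt="http://bak.example.com/2.html" onerror="continue"/></div>')

print(expand_includes(PAGE, FRAGMENTS.get))
```

Because assembly happens at the edge, the page your app emits is no longer the page the user receives, which is why testing and development workflows change.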
  • ESI
    • Another Example
    • <esi:try>
    • <esi:attempt>
    • <esi:comment text="Include an ad"/>
    • <esi:include src="http://www.example.com/ad1.html"/>
    • </esi:attempt>
    • <esi:except>
    • <esi:comment text="Just write some HTML instead"/>
    • <a href=www.akamai.com>www.example.com</a>
    • </esi:except>
    • </esi:try>
    • Read the spec at: http://www.w3.org/TR/esi-lang
  • Questions?
    • http://trafficserver.apache.org/docs/v2/admin/
    • http://ostatic.com/blog/guest-post-yahoos-cloud-team-open-sources-traffic-server
    • https://cwiki.apache.org/confluence/display/TS/RoadMap
    • http://www.oscon.com/oscon2010/public/schedule/detail/13878
  • Questions for the Audience (you!)
    • I have some of my own:
      • Developing in something other than PHP?
      • Anyone moving away from PHP? If so, why?
      • Big challenges you’re facing right now?
      • Frameworks/CMSes – not a lot of talk about them this year
      • What technology changes are you investing in?
      • Are you doing unit/functional/perf testing regularly?