
Reflecting a Year After Migrating to Apache Traffic Server

LinkedIn started in May 2003; I started in August 2011. Over 8 years of cruft and confusion had piled up before we even considered moving to Apache Traffic Server. This talk focuses on the journey and what we learned along the way:
* What LinkedIn is doing with ATS to effect change across the entire stack with an infrastructure tier
* Building automation and tooling
* Bizarre ways users query the site
* Metrics and monitoring
* Patches we contributed

Published in: Technology


  1. ©2013 LinkedIn Corporation. All Rights Reserved. Reflecting a Year After Migrating to Apache Traffic Server
  2. Have You Looked At Your Access Logs Lately?
  3. Surviving by Proxy
  4. Even Your Registrar Breaks Sometimes
  5. How Apache Traffic Server Changed LinkedIn
  6. Hello!
  7. ATS: Apache Traffic Server
      • Fast, scalable and extensible HTTP/1.1 compliant caching proxy server
      • Single-process, multi-threaded
      • Asynchronous I/O
      • Plugin architecture
      • Written by Inktomi >10 years ago; Yahoo acquired Inktomi, found the code on a system collecting dust in a cardboard box, and open-sourced it in 2010
  8. ATS: Who’s using it?
  9. When we started…
      • 4,000 QPS to www.linkedin.com
      • 120M members
      • Citrix NetScaler used for all external load balancing (XLB)
        – Load balances requests based on path to frontends
        – SSL termination
        – Monitors health per frontend
      • Features were built as Tomcat filters
        – Tomcat required, no solution for alternates
        – >70 frontend services deployed across hundreds of hosts
  10. Outgrowing the existing solution
      • Need to support multiple frontend frameworks
        – DoS protection
        – Authentication
        – Optimizations
      • Complete control over features
        – Cookie manipulation
        – Advanced routing
      • Deployment delays: security-related fixes took days if not weeks
      • Even small changes required touching network gear
  11. How about an intelligent HTTP proxy layer?
      • Less (re)implementing features into multiple frameworks
      • Make decisions higher in the stack
        – Faster response time
        – Reduce work on the application stack
      • Rapid iteration
  12. Where to start?
      • Evaluated options
      • Requirements:
        – Mature
        – Scalable
        – Language we like
        – Plugin support with hooks and documentation, shared libraries a big plus
        – Shared runtime information between plugins
        – In-house knowledge is a plus
      • Apache Traffic Server matched our needs
  13. Preparation
      • 4 patches out of the gate
      • Audit traffic, build configs
      • Build metrics, dashboards and alerts
        – Huge blocker, new territory for non-Java @ LinkedIn
      • Migrate traffic, one service at a time
  14. Let’s migrate!
      • Started migration in October 2011
  15. Let’s migrate!
      • Started migration in October 2011
      • “We’ll be done by Christmas!” - everyone
  16. Original Plan
      [Diagram labels: XLB, L1 Proxy (ATS), VIP, Frontend]
  17. Month 1: Public Profile
  18. Request Rules Remap
      • Cookie-based routing, e.g. logged-in vs. logged-out
      [Diagram labels: www.linkedin.com; Los Angeles: XLB, L1 Proxy (ATS), VIP, Frontend; Chicago: XLB, L1 Proxy (ATS), VIP, Frontend]
  19. Request Rules Remap
          if (request_cookie["foo"] starts_with "bar")
              return "host:chicago.linkedin.com:8888";
          else
              return "host:losangeles.linkedin.com:8888";
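The routing rule on slide 19 can be sketched in Python to make the decision explicit. This is only an illustration of the cookie-prefix check, not the request-rules-remap plugin itself; the function name `pick_origin` and its parameters are hypothetical, while the cookie name, prefix and return strings are taken from the slide.

```python
from http.cookies import SimpleCookie

# Return values copied from the slide's rule.
CHICAGO = "host:chicago.linkedin.com:8888"
LOS_ANGELES = "host:losangeles.linkedin.com:8888"


def pick_origin(cookie_header: str, name: str = "foo", prefix: str = "bar") -> str:
    """Route to Chicago when cookie `name` starts with `prefix`, else Los Angeles.

    `pick_origin` is a hypothetical helper, not part of any ATS API.
    """
    cookies = SimpleCookie()
    cookies.load(cookie_header or "")
    morsel = cookies.get(name)
    if morsel is not None and morsel.value.startswith(prefix):
        return CHICAGO
    return LOS_ANGELES
```

A request carrying `Cookie: foo=barbaz` would be sent to the Chicago origin; a request with no cookie, or a cookie with a different prefix, falls through to Los Angeles.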
  20. Month 3: Sentinel (DoS protection)
      • Prevent abusive requests from reaching frontend
      [Diagram labels: XLB, L1 Proxy (ATS), VIP, Frontend]
  21. Month 4: Picking up momentum
      • Largest frontends of the site done
        – Homepage
        – Profile
        – Registration
      • New ATS tier, Fizzy!
  22. New ATS tier, Fizzy!
      • Edge Side Includes on steroids
      • UI content aggregator
      • Progressive Rendering
        – Browser deferred rendering
        – Browser deferred fetch
        – Server
      • Supports Server Side Rendering of JavaScript templates via V8
  23. 1342
  24. Now with Fizzy!
      [Diagram labels: XLB, L1 Proxy (ATS), VIP, VIP, Frontend (non-fizzy), Fizzy (ATS), VIP, Frontend]
  25. Month 6: Most frontends migrated
      • Config generators written
      • Caught the attention of other teams
        – New plugins developed
      • Another new tier, QD Proxy!
  26. Another new tier, QD Proxy!
      • Quick Deploy Proxy
        – Define profiles for dev instances to route to
        – Allows multiple users to use the same profile
        – Develop without running the entire stack
  27. Quick Deploy Proxy: Frontend
      [Diagram labels: XLB, L1 Proxy (ATS), Frontend, Fizzy (ATS), Backend, QD Proxy (ATS), MyFrontend]
  28. Quick Deploy Proxy: Backend
      [Diagram labels: XLB, L1 Proxy (ATS), Frontend, Fizzy (ATS), Backend, QD Proxy (ATS), MyBackend]
  29. Month 9: Ramping Fizzy to 100%
  30. Month 9: Ramping Fizzy to 100%
      • Broke the site
  31. Month 9: Ramping Fizzy to 100%
      • Broke the site
      • HA Proxy saves the day
        – “The Reliable, High Performance TCP/HTTP Load Balancer”
        – leverage the metadata in Range to generate configs
        – reduce network hops by avoiding hardware load balancer
        – deploy changes in minutes
  32. … and HA Proxy!
      [Diagram labels: XLB; L1 Proxy hosts (HAProxy, ATS); Fizzy hosts (HAProxy, ATS); Frontend; Frontend (non-fizzy)]
  33. After all that…
      • October 2011: 4,000 QPS, 120M members
      • August 2012: 15,000 QPS, 175M members
      • Now: 67,000 QPS, 225M members
      • Citrix NetScaler still in use
        – Load balancing L1 proxy
        – SSL termination
      • Features built as ATS plugins
        – Supports anything behind ATS tiers (L1 Proxy, Fizzy)
        – Quick to deploy
  34. Implementation
      • October 2011 - August 2012 (10 months)
  35. Implementation
      • October 2011 - August 2012
      • Unexpected surprises aka outages
      • Scope creep
        – New tiers and architecture: Fizzy, HA Proxy
        – Lots of new plugins
      • It takes time to build…
        – monitoring
        – tooling
        – configuration automation
  36. Outages
      • Hand edited configs with typos
      • Misbehaving node in rotation
      • Bad upgrade from 2.x to 3.x due to incompatible hostdb
      • Missing slash for a config, sent requests to wrong frontend
      • Bonus slash to a healthcheck taking all hosts down
      • SysOps re-imaged experimental hosts, broke 10% of Profile
      • Saturated load balancer due to additional ATS layer
      • Sticky cookie conflict between frontends
      • HA Proxy wasn’t started
      • Random ATS crashes
      • Coal in our stocking for Christmas
      • Multiple issues with multiple plugins
      • Log4cpp hard-coded to DEBUG at root level for one plugin, overwrote for all plugins
      • FD per-user limit unexpectedly changed
      • Keep-alive unexpectedly turned on with high timeouts
  37. Outages (>0.1% requests affected)
      [Chart: share of outages by cause (Plugin, ATS, Human) for 2011, 2012 and 2013, on a 0% to 100% axis]
  38. How did we improve?
  39. How did we improve?
      • Monitoring!
  40. Monitoring: traffic_logstats
      • per-origin breakdown:
        – status
        – method
        – QPS
        – bytes
        – etc.
      • Want JSON output? use -j
      • results are COUNTER, and GAUGE if the key ends in _pct
  41. Monitoring: traffic_logstats
      HTTP return codes                Count    Percent      Bytes    Percent
      ------------------------------------------------------------------------------
      100 Continue                         0      0.00%     0.00KB      0.00%
      200 OK                       1,383,361     93.57%     4.71GB     97.48%
      201 Created                      5,429      0.37%     3.28MB      0.07%
      202 Accepted                         0      0.00%     0.00KB      0.00%
      203 Non-Authoritative Info           0      0.00%     0.00KB      0.00%
      204 No content                      12      0.00%     5.63KB      0.00%
      205 Reset Content                    0      0.00%     0.00KB      0.00%
      206 Partial content                  0      0.00%     0.00KB      0.00%
      2xx Total                    1,388,802     93.94%     4.71GB     97.54%
      300 Multiple Choices                 0      0.00%     0.00KB      0.00%
      301 Moved permanently            3,360      0.23%     3.47MB      0.07%
      302 Found                       38,475      2.60%    35.09MB      0.71%
      303 See Other                       11      0.00%     3.87KB      0.00%
      304 Not modified                29,262      1.98%    12.20MB      0.25%
      305 Use Proxy                        0      0.00%     0.00KB      0.00%
      307 Temporary Redirect               0      0.00%     0.00KB      0.00%
      3xx Total                       71,108      4.81%    50.76MB      1.03%
      ...
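The COUNTER/GAUGE rule from slide 40 is small enough to sketch. The code below assumes the JSON from `traffic_logstats -j` has already been loaded into nested dictionaries; the exact key layout is not shown in the slides, so the sample structure in the usage note is hypothetical, as is the helper name `classify_metrics`.

```python
def classify_metrics(node: dict, prefix: str = "") -> dict:
    """Flatten nested stats into {dotted_name: (metric_type, value)}.

    Follows the slide's rule: a key ending in "_pct" is a GAUGE,
    everything else is a COUNTER. The nesting layout is an assumption.
    """
    result = {}
    for key, value in node.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into per-origin / per-category sections.
            result.update(classify_metrics(value, name))
        else:
            metric_type = "GAUGE" if key.endswith("_pct") else "COUNTER"
            result[name] = (metric_type, float(value))
    return result
```

For example, a hypothetical input of `{"origin.example": {"status.2xx": {"req": "1388802", "req_pct": "93.94"}}}` would yield one COUNTER (`req`) and one GAUGE (`req_pct`), ready to hand to a metrics system that distinguishes the two.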
  42. Monitoring: traffic_line
      • Swiss army knife for Traffic Server
      • executable to read variables
  43. Monitoring: {stat}
      • prefer HTTP over shell?
      records.config:
          CONFIG proxy.config.http_ui_enabled INT 2
      remap.config:
          map /_stat/ http://{stat} @action=allow @src_ip=127.0.0.1
  44. Monitoring: {stat}
      • proxy.node.restarts.manager.start_time
      • proxy.node.restarts.proxy.start_time
  46. Monitoring: {stat}
      • proxy.node.current_client_connections
      • proxy.node.current_server_connections
  47. Monitoring: {stat}
      • proxy.config.net.connections_throttle
        – limit before ATS starts to drop connections, based on the sum of client and server connections
      • proxy.process.net.connections_currently_open
        – client + server connections
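One way to use the two variables on slide 47 is to alert on how close the open-connection count is to the throttle. The sketch below assumes the stats have already been collected into a dictionary keyed by variable name; the function names and the 0.8 threshold are illustrative choices, not values from the talk.

```python
THROTTLE_KEY = "proxy.config.net.connections_throttle"
OPEN_KEY = "proxy.process.net.connections_currently_open"


def throttle_utilization(stats: dict) -> float:
    """Fraction of the connection throttle currently in use (client + server)."""
    limit = int(stats[THROTTLE_KEY])
    open_now = int(stats[OPEN_KEY])
    if limit <= 0:
        raise ValueError("connections_throttle must be positive")
    return open_now / limit


def should_alert(stats: dict, threshold: float = 0.8) -> bool:
    """True when utilization reaches the (hypothetical) alert threshold."""
    return throttle_utilization(stats) >= threshold
```

Alerting on the ratio rather than on the raw connection count means the alert keeps working if the throttle setting is changed later.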
  49. Monitoring: {stat}
      • Plugin specific
        – reviewed before plugins go to production
      • Examples
        – enforced vs. un-enforced DoS requests
        – track cookie usage for a migration
        – thread usage of a plugin
  50. Monitoring: outside the app
      • Core dump rate
        – generate crash reports with full stack trace
        – monitor the file system for core dumps from the last 24 hours
        – alert if > N
      • TCP
        – capture states from netstat
        – listen queue overflowing (net.core.somaxconn)
      • Proc
        – review /proc/pid/status
        – fetch VmSize and VmSwap
        – count # of files in /proc/pid/fd for FD usage
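The /proc and core-dump checks on slide 50 could look roughly like the following on Linux. This is a sketch, not LinkedIn's tooling: the helper names are hypothetical, and the assumption that core files have names starting with `core` depends on the host's `core_pattern` setting.

```python
import os
import time
from pathlib import Path


def parse_vm_fields(status_text: str) -> dict:
    """Extract VmSize and VmSwap (in kB) from the text of /proc/<pid>/status."""
    fields = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmSwap"):
            fields[key] = int(rest.split()[0])
    return fields


def fd_count(pid: int) -> int:
    """Count open file descriptors by listing /proc/<pid>/fd (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def recent_core_dumps(directory: str, max_age_hours: float = 24, now: float = None) -> list:
    """List files named core* in `directory` modified within the last max_age_hours."""
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.name.startswith("core") and p.stat().st_mtime >= cutoff
    )
```

An alert in the spirit of the slide would then fire when `len(recent_core_dumps(...))` exceeds whatever N the team chooses.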
  51. Monitoring: logs
      • I HATE dislike the stock logs
      • squid.log
        – mimics squid access log
        – more useful if you’re caching
      • common.log, extended.log, extended2.log
        – Netscape formats
        – not enough detail
      • custom logging!
  52. Custom Logging
      records.config:
          CONFIG proxy.config.log.custom_logs_enabled INT 1
      logs_xml.config:
          <LogFormat>
            <Name = "custom_access"/>
            <Format = "%<chi> %<{X-Real-Client-IP}cqh> - %<caun> [%<cqtn>] \"%<cqhm> %<cquuc> %<cqhv>\" %<pssc> %<pscl> \"%<{Referer}cqh>\" \"%<{User-Agent}cqh>\" %<ttms>ms %<cquc> %<{X-LI-UUID}psh>"/>
          </LogFormat>
          <LogObject>
            <Format = "custom_access"/>
            <Filename = "access"/>
          </LogObject>
  53. Custom logging
      %<chi>                          172.16.200.10
      %<{X-Real-Client-IP}cqh>        65.16.225.8
      %<caun>                         - (http authd username)
      [%<cqtn>]                       [01/Nov/2011:23:59:59 +0000]
      "%<cqhm> %<cquuc> %<cqhv>"      "GET /nhome/ HTTP/1.1"
      %<pssc>                         200
      %<pscl>                         34697
      %<{Referer}cqh>                 "http://www.linkedin.com/"
      %<{User-Agent}cqh>              "Mozilla/4.0 (compatible; ...)"
      %<ttms>                         327ms
      %<cqu>                          http://origin:port/nhome/
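A log line in the custom format above is easy to parse downstream. The regular expression below is a sketch that assumes the fields appear in the order given on slide 52, separated by single spaces; the group names are my own labels, and the UUID value used in the usage note is a made-up placeholder because the slides show no example for it.

```python
import re

# One named group per field of the custom_access format, in slide order.
LOG_RE = re.compile(
    r'(?P<client_ip>\S+) (?P<real_client_ip>\S+) - (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<version>[^"]+)" '
    r'(?P<status>\d{3}) (?P<length>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)" '
    r'(?P<duration_ms>\d+)ms (?P<origin_url>\S+) (?P<uuid>\S+)$'
)


def parse_access_line(line: str):
    """Return a dict of fields for one access-log line, or None if it does not match."""
    match = LOG_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["duration_ms"] = int(record["duration_ms"])
    return record
```

Feeding it the example values from slide 53, joined into one line with a placeholder UUID such as `abc123` appended, gives a record with `status` 200, `duration_ms` 327 and `path` `/nhome/`. Having the duration and origin URL in every line is what makes the per-path histograms mentioned later in the talk possible.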
  54. Dashboard: overview
      • Internal ATS:
        – client connections
        – server connections
        – traffic_cop uptime
        – traffic_server uptime
        – connection failed
        – invalid request
      • Logs:
        – 2xx status
        – 3xx status
        – 4xx status
        – 5xx status
        – HTTP methods
      • OS:
        – cpu usage
        – interface
        – tcp state distribution
        – # of core dumps
        – ATS memory usage
        – ATS swap usage
        – ATS file descriptor usage
  55. Dashboard: in-depth
      • plugin-specific
      • per-path histogram of request durations
      • per-origin HTTP status breakdown
      • HA Proxy
        – current sessions
        – denied requests
        – error requests
        – server status
  56. How did we improve? Automation!
      • Configs are generated, not hand maintained
        – Details about a service are stored in metadata store
        – YAML configs supplement missing data
      • Deployment done by Salt
        – All deployment actions and verifications are integrated with Informed
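The "generated, not hand maintained" idea from slide 56 can be illustrated with a tiny generator. Everything about the data model here is hypothetical (the field names `path`, `public_host`, `origin_host` and `origin_port`, and the example hostnames); the talk does not describe the real metadata schema. The sketch only shows the shape: metadata is authoritative, YAML-derived values fill the gaps, and rendering refuses paths without a trailing slash, the kind of typo listed under Outages.

```python
def merge_service(metadata: dict, yaml_overrides: dict) -> dict:
    """Metadata wins; values parsed from YAML only supply what metadata lacks."""
    merged = dict(yaml_overrides)
    merged.update({k: v for k, v in metadata.items() if v is not None})
    return merged


def render_remap(services: list) -> str:
    """Render one remap.config `map` rule per service, sorted by path."""
    lines = []
    for svc in sorted(services, key=lambda s: s["path"]):
        path = svc["path"]
        if not (path.startswith("/") and path.endswith("/")):
            raise ValueError(f"path must start and end with '/': {path!r}")
        lines.append(
            f"map http://{svc['public_host']}{path} "
            f"http://{svc['origin_host']}:{svc['origin_port']}{path}"
        )
    return "\n".join(lines) + "\n"
```

Because validation happens at generation time, a bad entry stops the build instead of reaching a production proxy.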
  57. Informed
  58. Plugins!
      • header-rewrite
      • request-rules-remap
      • sentinel
      • lix-remap
      • host_override
      • postbuffer
      • mobileredirect
      • correctcookiedomain
      • qdproxy
      • boom
      • pagespeed
      • contentsecurityheader
      • authfilter
      • oauth-rewrite
      • stickyrouting
  59. Plugins: header-rewrite
      • Manipulate headers at any point in the request lifecycle
        – read request
        – send request
        – read response
        – send response
      • Can use as a remap plugin
        – change path, destination, port
      • Patched to include variables
  60. Plugins: header-rewrite
          cond %{READ_REQUEST_HDR_HOOK} [AND]
          cond %{ACCESS:/var/healthcheck} [NOT]
          rm-header Connection
          add-header Connection "close"
  61. Plugins: header-rewrite
          cond %{SEND_RESPONSE_HDR_HOOK} [AND]
          cond %{PATH} "/foo.js"
          add-header Content-Type "text/javascript"
  62. Plugins: lix-remap
      • Uses LinkedIn Experiments infrastructure (A/B testing) to make routing decisions
      • Enable NOC to easily send traffic to another data center
      • Route specific users, LinkedIn employees or % of users to experimental tiers
      • Used for red-line performance testing of frontends
  63. Plugins: Boom
      • We don’t want to show users this…
  64. Plugins: Boom
      • … but based on status code, we can replace it with this:
  65. Plugins: Host Override
      • Direct your request to a specific host through any ATS tier
  66. Plugins: PageSpeed
      • Support on-the-fly operations before sending the response
  67. Plugins: PageSpeed
      • HTML minification
        – How many empty new lines are on Profile?
  68. Plugins: PageSpeed
      • HTML minification
        – How many empty new lines are on Profile? 2703
  69. Plugins: PageSpeed
      • HTML minification
        – How many empty new lines are on Profile? 2703
        – How many empty new lines are on Homepage?
  70. Plugins: PageSpeed
      • HTML minification
        – How many empty new lines are on Profile? 2703
        – How many empty new lines are on Homepage? 9205
  71. Plugins: PageSpeed
      • HTML minification
        – Homepage: 78%
        – Profile: 72%
  72. Plugins: PageSpeed
      • HTML minification
        – Homepage: 10%
        – Profile: 17%
      [Chart: compressed bytes for Homepage and Profile, axis from 0 to 40000]
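The empty-line counts on slides 68 to 70 are straightforward to reproduce for any page. The sketch below is not the PageSpeed plugin; it only counts blank lines, strips them, and reports the byte savings. The transcript does not say what the percentages on slides 71 and 72 measure, so treating savings as a fraction of the original size is an assumption, and the helper names are hypothetical.

```python
def count_empty_lines(html: str) -> int:
    """Number of lines that are empty or contain only whitespace."""
    return sum(1 for line in html.splitlines() if not line.strip())


def strip_empty_lines(html: str) -> str:
    """Drop blank lines; a very small subset of what HTML minification does."""
    kept = [line for line in html.splitlines() if line.strip()]
    return "\n".join(kept) + ("\n" if kept else "")


def savings(original: str, minified: str) -> float:
    """Fraction of bytes saved, measured on UTF-8 encoded text."""
    before = len(original.encode("utf-8"))
    after = len(minified.encode("utf-8"))
    return 0.0 if before == 0 else 1 - after / before
```

Running the same comparison on gzip-compressed bytes would show a smaller gain, since runs of blank lines compress well on their own; that is consistent with the chart on slide 72 being labelled "Compressed bytes".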
  73. Plugins: PageSpeed
      • Lazy loading of images below the fold
  74. The awesome patches
  75. The awesome patches
      • traffic_server gets restarted if FD > 32
  76. The awesome patches
      • traffic_server gets restarted if FD > 32
      • infinite emergency throttle
  77. The awesome patches
      • traffic_server gets restarted if FD > 32
      • infinite emergency throttle
      • buffer overflow in the stats system
  78. Contributions back
      • 28 fixes committed back to open-source
      • 19 more pending
      • LinkedIn ATS committer, Brian Geffon
  79. ATS C++ API
      • Simplifies the process of writing ATS plugins
      • https://github.com/linkedin/atscppapi
      • “I wrote a transformation plugin that would probably have taken me weeks, struggling with virtual I/O buffers, in just a few hours. Now that I’ve done it once, it would be even faster.” Doug Young, Sr. Staff Software Engineer
  80. Almost forgot… Media Cache!
      • Serves profile pictures, cached external content
      • Pre-ATS
        – NetApp filer CPU >50%
        – Expected an outage during NetApp failover
  81. Almost forgot… Media Cache!
      • Serves profile pictures, cached external content
      • Pre-ATS
        – NetApp filer CPU >50%
        – Expected an outage during NetApp failover
      • Post-ATS
        – 98% cache hit rate
        – $30,000 in gear, saved $400,000
        – Bought us time to re-architect the service
  82. So what are the takeaways?
      • ATS is a bad ass HTTP proxy
      • Small details matter, fight for the users
      • HA Proxy is a silver bullet
      • Slow down, learn from your mistakes
      • Don’t just use open-source, contribute
  83. Meet the team
      • Manjesh Nilange
      • Brian Geffon
      • Thomas Jackson
      • Nick Berry
      • Office hours @ 1:15 PM, Exhibit Hall (Table 2)
  84. Links
      • This talk:
      • Apache Traffic Server:
        – http://trafficserver.apache.org
      • ATS C++ API:
        – https://github.com/linkedin/atscppapi
      • New plugins:
        – https://github.com/linkedin/ -- coming soon!
  85. Goodbye!
