CI_CONF 2012: Scaling - Chris Miller


Presentation by Chris Miller on scaling for CI_CONF 2012 San Francisco. August 12, 2012.


  1. Going Big: Scalability
  2. Who am I?
     • Chris Miller
     • Huffington Post - Senior Developer
     • CMS platform and API
     • Started in systems/network admin before code
  3. What is Huffington Post?
     • #87 most popular site in the world (Alexa)
     • #3 most popular news site in the world (Alexa)
     • #19 most popular US site (Alexa)
     • More traffic than
  4. Our Platform: Today
     • Everything! No, really.
     • Perl: CMS core
     • PHP “layer” integrated on top of Perl code
     • MySQL data storage
     • MongoDB for comments storage
     • Hadoop for internal statistical analysis
     • Memcache for lightweight caching
     • Redis for more structured data types
     • Varnish for caching!
  5. Our Platform: Tomorrow
     • Re-think tools and platform from the ground up
     • Building new API
       – Yes, OAuth 2.0!
       – Complete REST approach
       – Will be public!
     • We can’t re-write everything at once, so the API build has 4 phases:
       – Build “bridge” middleware to allow access to existing functionality
       – Refactor backend edit/admin tools
       – Refactor frontend to use API
       – Transparently, and calmly, refactor old code while maintaining API interfaces
  6. So what about CI?
     • New API is built on CodeIgniter
       – Using Phil’s REST library as a starting point
         • Thanks Phil!
     • Backend editorial tools are being built on CI
     • We love CI
       – But it isn’t our only framework
       – Different tools work better for different teams
       – We use what works. You should too.
  7. How we scale
     • CDN: Akamai
       • 80%+ hit rate
       • Amazon S3 for origin of static files
     • Basic page layout/content is generated to flat file
       • These contain some dynamic content, in PHP
       • By having the basic page as a flat file, it’s less overhead to load
       • It also means for certain changes, we have to "regenerate" the page. Ugh.
  8. Varnish
     • HTTP caching reverse proxy (“HTTP Accelerator”)
     • Caching layer in front of your web server
     • Stores complete responses in memory
     • If a cached response exists for the request, serves it from memory
       – Otherwise, forwards the request to the web server, and then caches the response
     • Works nicely with the Linux kernel to delegate memory allocation and management to the OS, where it belongs
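The hit/miss behavior on the Varnish slide can be modeled in a few lines. This is only a toy sketch of the idea, not how Varnish is implemented: `TinyCache`, its `backend` callable, and the injectable `clock` are hypothetical names introduced here for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class TinyCache:
    """Toy model of a caching reverse proxy: URL -> (expiry time, response body)."""

    def __init__(self, backend: Callable[[str], str], ttl: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.backend = backend   # stands in for the web server behind the cache
        self.ttl = ttl           # seconds a stored response stays fresh
        self.clock = clock       # injectable so expiry can be tested without sleeping
        self.store: Dict[str, Tuple[float, str]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> str:
        entry = self.store.get(url)
        if entry is not None and entry[0] > self.clock():
            # Fresh copy in memory: serve it without touching the backend.
            self.hits += 1
            return entry[1]
        # Missing or expired: forward to the backend, then cache the response.
        self.misses += 1
        body = self.backend(url)
        self.store[url] = (self.clock() + self.ttl, body)
        return body
```

With a high hit rate, most calls to `get` never reach `backend`, which is the whole point of putting the cache in front of the web server.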
  9. Controlling Varnish
     • Set custom TTLs for content:

       if (beresp.http.X-HP-Cache-Control ~ "s-maxage") {
           // Reduce the header to just the s-maxage value, in seconds.
           set beresp.http.X-HP-Cache-Control = regsub(beresp.http.X-HP-Cache-Control, "^.*s-maxage=([0-9]+).*", "\1");
           // set the ttl.
           C{
               char *ttl;
               /* "\023" is the octal length (19) of the header name, colon included. */
               ttl = VRT_GetHdr(sp, HDR_BERESP, "\023X-HP-Cache-Control:");
               VRT_l_beresp_ttl(sp, atoi(ttl));
           }C
           set beresp.http.X-Cacheable = "CUSTOM: " + beresp.ttl;
       } elsif (beresp.http.X-HP-Cache-Control ~ "(no-cache|private)" || beresp.http.pragma ~ "no-cache") {
           // Explicitly uncacheable responses get a zero TTL.
           set beresp.ttl = 0s;
           set beresp.http.X-Cacheable = "NO-CACHE";
       } else {
           // Everything else falls back to a short default TTL.
           set beresp.http.X-Cacheable = "DEFAULT: 30s";
           set beresp.ttl = 30s;
       }
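The three branches of the TTL logic above are easier to check outside Varnish. The sketch below mirrors them in Python under stated assumptions: `choose_ttl` is a hypothetical helper, and the `CUSTOM:` label is simplified (Varnish would format the duration itself).

```python
import re
from typing import Optional, Tuple

DEFAULT_TTL = 30  # seconds, matching the slide's fallback branch


def choose_ttl(hp_cache_control: Optional[str],
               pragma: Optional[str] = None) -> Tuple[int, str]:
    """Return (ttl_seconds, X-Cacheable label) for a backend response."""
    cc = hp_cache_control or ""
    if "s-maxage" in cc:
        # Same pattern as the VCL regsub: capture the digits after s-maxage=.
        match = re.match(r"^.*s-maxage=([0-9]+).*", cc)
        # If no number follows, fall back to 0, as atoi() would on non-numeric text.
        ttl = int(match.group(1)) if match else 0
        return ttl, "CUSTOM: %d" % ttl
    if re.search(r"(no-cache|private)", cc) or "no-cache" in (pragma or ""):
        return 0, "NO-CACHE"
    return DEFAULT_TTL, "DEFAULT: 30s"
```

Note that, as in the VCL, the `s-maxage` branch is tested first, so a header carrying both `s-maxage` and `private` gets the custom TTL.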
  10. Controlling Varnish
      • Refreshing content:

        sub process_refresh_requests {
            if (req.request == "REFRESH") {
                // Treat REFRESH as a GET that skips the cache lookup,
                // so a fresh copy is fetched from the backend and cached.
                set req.request = "GET";
                set req.hash_always_miss = true;
            }
        }

      • This is invoked early in the vcl_recv method
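From the client side, triggering that refresh is just an HTTP request with a custom verb. A minimal sketch, assuming a hypothetical cache host `cache.example.com` and a hypothetical helper name; the slides do not say which tool HuffPost uses to send these requests.

```python
import urllib.request


def build_refresh_request(url: str) -> urllib.request.Request:
    """Build (but do not send) a request using the custom REFRESH verb."""
    return urllib.request.Request(url, method="REFRESH")


if __name__ == "__main__":
    # Hypothetical URL, for illustration only.
    req = build_refresh_request("http://cache.example.com/2012/08/12/some-article")
    print(req.get_method(), req.full_url)
    # Sending it would be: urllib.request.urlopen(req)
```

In practice the cache should restrict who may send REFRESH, in the same way the PURGE example in this deck checks the client against an ACL.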
  11. Edge Side Includes
      • Include cached content blocks into pages

        <html>
        <body>
        <esi:include src="" alt="" onerror="continue"/>
        </body>
        </html>
  12. Edge Side Includes
      • How to use ESI:
        – Make complicated blocks independently accessible URIs
        – Create a “template” file with ESI includes to bring the page together
      • Why this is powerful
        – If multiple pages use different combinations of page components, some may already be cached
        – Reduces the number of times the entire page must be served; serve only the components needed
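The payoff described on the ESI slides, shared components being fetched once and reused across pages, can be sketched as follows. This is a simplified model and not a real ESI processor: `assemble`, `fetch`, and the fragment paths are hypothetical, and only the `src` attribute is honored.

```python
import re
from typing import Callable, Dict, Match

# Matches self-closing <esi:include .../> tags and captures the src attribute.
ESI_TAG = re.compile(r'<esi:include\s+src="([^"]+)"[^>]*/>')


def assemble(template: str, fetch: Callable[[str], str],
             cache: Dict[str, str]) -> str:
    """Replace each ESI include with its fragment, fetching only on a cache miss."""

    def render(match: Match[str]) -> str:
        src = match.group(1)
        if src not in cache:
            cache[src] = fetch(src)  # only uncached components hit the backend
        return cache[src]

    return ESI_TAG.sub(render, template)
```

If a home page and an article page both include a `/header` fragment, the second page reuses the cached header and only its own components are fetched.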
  13. Varnish Tricks
      • Intelligently purge the cache when your content changes
        – Allows you to increase TTL without fear of caching outdated content

        if (req.request == "PURGE") {
            // Only clients matching the "purgers" ACL may purge.
            if (!client.ip ~ purgers) {
                error 405 "Method not allowed";
            }
            return (lookup);
        }
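The purge check above boils down to "is this client on the allow list, and if so, drop the cached object". A minimal Python model follows. `handle_purge` is a hypothetical helper, and the 200/404 responses for a cached versus uncached URL are an assumption, since the slide only shows the ACL check and the lookup.

```python
import ipaddress
from typing import Dict, Iterable, Tuple


def handle_purge(cache: Dict[str, str], url: str, client_ip: str,
                 purgers: Iterable[str]) -> Tuple[int, str]:
    """Model a PURGE request: ACL check first, then evict the cached object."""
    addr = ipaddress.ip_address(client_ip)
    allowed = any(addr in ipaddress.ip_network(net) for net in purgers)
    if not allowed:
        # Mirrors the slide's: error 405 "Method not allowed"
        return 405, "Method not allowed"
    if cache.pop(url, None) is not None:
        return 200, "Purged"        # assumed response for a cached object
    return 404, "Not in cache"      # assumed response when nothing was cached
```

Because stale objects can be evicted the moment content changes, the TTLs on everything else can be raised safely.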
  14. Other Scaling Tips
      • Hardware SSL offloading is your friend
      • Consider mod_php
        – CGI has huge overhead
        – CGI/SuExec has huge security advantages
        – FastCGI is a happy medium for some
  15. Other Scaling Tips
      • Don’t try to do everything on one server/cluster
        – Splitting your application is ok
        – 1 cluster for frontend, 1 server/cluster for backend, etc.
      • Keep an open mind about technologies, platforms, and tools
  16. One More Thing… (sorry, I couldn’t resist)
  17. Guilds!
      • What a guild is:
        – Groups of people around a topic
        – Membership/participation is encouraged, but not required
        – Think of it as an internal Meetup
      • Join to learn new things
      • Join to talk about things you are interested in
      • Examples: PHP, Front End, Python, Ruby, Management, Platform/Architecture, Big Data, etc…
  18. Guilds!
      • Experts to solve technology-specific problems
        – Example: Front-end swat team to improve page load time due to slow/too much JS
      • Collectively give back to the community around your technology
      • Help others learn, and learn from others
      • Meet people on other teams
  19. Guilds!
      • Try it out
  20. ¿Preguntas? Questions? Perguntas?
  21. Chris @ee99ee (P.S. – We’re hiring in NYC)