HAProxy tech talk


Published in: Technology

  1. I ❤ HAProxy
  2. National Airspace System - FAA
  3. Simplified Web Architecture
     - Clients: iPhones, Androids, Browsers
     - Web Server: Nginx, Apache, lighttpd
     - Dynamic: PHP, Ruby, Perl, Python, Node.js
     - “Data”: Memcache, PostgreSQL, MySQL, Mongo, CouchDB, Redis, Oracle
  4. ChOP Architecture
     - Clients: iPhone, Android, Desktop
     - Web Server: Nginx
     - Dynamic: PHP5-FPM
     - “Data”: Memcache, MySQL, Redis
     - Chat
  5. YouVersion Architecture
     - Clients: iPhone, Android, Desktop
     - Web Server: Nginx
     - Dynamic: PHP5-FPM, Ruby (coming soon)
     - “Data”: Memcache, PostgreSQL, Mongo, Oracle
  6. HAProxy
     - High Availability Proxy
     - TCP load balancing proxy with awesome health checking built in
     - Fast
     - Scalable
     - Makes non-HA services HA
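
The simplest form of the health checking mentioned on this slide is a plain TCP connect check: if a connection to the server's port succeeds, the server is treated as up. The sketch below illustrates that idea only; it is not HAProxy's implementation, and the names `tcp_check` and `healthy_servers` are hypothetical.

```python
import socket


def tcp_check(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat the server as down.
        return False


def healthy_servers(servers, timeout=1.0):
    """Keep only the (host, port) pairs that currently accept connections."""
    return [(host, port) for host, port in servers if tcp_check(host, port, timeout)]
```

A load balancer would run a check like this on an interval and stop routing to servers that fail it, which is how a non-HA service behind the proxy ends up looking HA from the client side.
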
  7. How I Love Thee, Let Me Count The Ways…
     - Rock solid
     - Dead simple to run and configure
     - Comprehensive Health Checking
     - Lots of statistics
  8. HAProxy Uses
     - Not really a service unto itself
     - Fits into the gaps between layers well
     - Issue: Becomes a single point of failure itself
     - Diagram: HAProxy sits in the gaps between the Clients, Web Server, Dynamic Engine, and “Data” layers (HAProxy, HAProxy*, HAProxy*; * = potential future use)
  9. Eliminating SPOFs
     - Two types of HAProxy SPOFs:
       - Service Outage (hardware failure or HAProxy service failure)
       - HAProxy Limit Outage / Upstream Outage (hit some arbitrary limit we defined somewhere or ran out of some slots somewhere)
  10. Service Outage
      - HAProxy service crashes or dies for some reason (has never happened, knock on wood)
      - Hardware / Network Failure
  11. Service Outage: Solution
      - Corosync & Pacemaker
      - Hard to configure at first, but don’t really need to touch it later
      - Pretty much magic
      - Two Corosync HAProxy clusters: DFW and SAN
      - Setup is blogged about here: http://itand.me/41901523
  12. HAProxy Limit Outage / Upstream Outage
      - Usually because of an outage further upstream at the Dynamic or “Data” layer
      - Completely Hypothetical Situation: Mongo slows down, causing PHP processes to back up, causing the connection limit to go through the roof, causing total outage
  13. What it looks like on the graph (Yesterday), OR: WHY WE MUST MOVE MONGO STAT!
  14. For ChOP (Chat), it’s a little different…
  15. Upstream Outage
      - Usually the result of running out of PHP processes.
      - Normally each PHP process can process hundreds of req/s
      - Something slows them down (Mongo, Postgres, et al.) so a process can only process a smaller number of req/s (or, worse, seconds/req)
      - Inevitably, these requests take all PHP processes, nothing else can run and HAProxy fails all health checks and shows you Binary Jesus
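
The arithmetic behind this slide can be sketched as follows. All numbers are hypothetical (the talk gives only "hundreds of req/s" per process), and the function names are illustrative, not taken from any tool.

```python
def capacity_rps(processes, per_process_rps):
    """Requests per second the whole PHP pool can finish."""
    return processes * per_process_rps


def backlog_growth_rps(arrival_rps, processes, per_process_rps):
    """How fast unfinished requests pile up when arrivals exceed capacity."""
    return max(0.0, arrival_rps - capacity_rps(processes, per_process_rps))


def seconds_until_full(queue_slots, arrival_rps, processes, per_process_rps):
    """Seconds until queue_slots waiting requests accumulate, or None if never."""
    growth = backlog_growth_rps(arrival_rps, processes, per_process_rps)
    if growth == 0:
        return None
    return queue_slots / growth


# Hypothetical pool: 50 processes receiving 1000 req/s.
healthy = capacity_rps(50, 200.0)  # each process finishes 200 req/s
slow = capacity_rps(50, 0.5)       # each request now takes 2 seconds
```

With these assumed figures the healthy pool has ample headroom, while the slowed pool finishes far fewer requests than arrive, so every process is soon occupied and nothing is left to answer health checks.
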
  16. “Solutions”
      - Start hashing URLs to avoid upstream failures
        - Want to send all requests for a given URL to the same app server, so that if that URL is slow only that app server goes down
        - Some benefit to caching as well
        - Challenge: want to hash only part of a URL
        - Challenge: need to separate app servers into “availability groups”
        - Challenge: deployments, monitoring, alerting, all that crap…
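
A minimal sketch of the hashing idea, assuming the "part of a URL" to hash is the first few path segments and that "availability groups" are simply lists of app servers. The function names, the `depth` parameter, and the choice of MD5 are all assumptions made for illustration, not the speaker's implementation.

```python
import hashlib
from urllib.parse import urlsplit


def hash_key(url, depth=2):
    """Keep only the first `depth` path segments; drop host and query string."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    return "/" + "/".join(segments[:depth])


def pick_group(url, groups, depth=2):
    """Map a URL to one availability group, stable for the same hash key."""
    key = hash_key(url, depth).encode("utf-8")
    digest = hashlib.md5(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(groups)
    return groups[index]
```

Because only the leading path segments feed the hash, requests such as `/bible/verse/1` and `/bible/verse/2?lang=en` land on the same group, so a slow endpoint can only tie up that group's servers.
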
  17. HAProxy Limit Outage
      - We set limits on all HAProxy backends and frontends and servers to ensure they don’t get overwhelmed
      - Sometimes these limits are too low
      - Solution: Raise them
      - Challenge: Raise them too high without regard for the backend, and you could cause more harm than good (Stampeding Herd)
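
The trade-off on this slide can be illustrated with a toy model: a connection limit decides how many incoming connections are passed to a server and how many wait, and a limit above what the backend can actually serve just moves the pile-up onto the backend. The names and numbers below are hypothetical.

```python
def admit(in_flight, incoming, maxconn):
    """Split incoming connections into (passed to the server, left waiting)."""
    free_slots = max(0, maxconn - in_flight)
    passed = min(incoming, free_slots)
    return passed, incoming - passed


def excess_over_workers(connections, backend_workers):
    """Connections the backend receives beyond what its workers can serve."""
    return max(0, connections - backend_workers)


# Hypothetical backend with 50 workers and a burst of 100 connections.
low_limit = admit(0, 100, 40)    # limit below worker count: 60 wait at the proxy
high_limit = admit(0, 100, 500)  # limit far above worker count: all 100 pass
```

In the second case the backend gets 50 more connections than it has workers, which is the "more harm than good" scenario: the limit was raised without regard for what sits behind it.
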
  18. Q&A