
1. Scaling PHP/MySQL...Presentation from Flickr

Published in: Technology

  1. Hardware Layouts for LAMP Installations
       John Allspaw, Flickr Plumbr
       Flickr (Yahoo) [email_address]
       October 18, 2005
  2. Hardware requirements for LAMP installs have to do with:
       • A decent amount about the actual hardware (“in-box” stuff)
       • A bit more about the hardware architecture
       • Which should complement the application architecture
  3. What we’ll talk about here:
       • Database (MySQL) layouts and considerations
       • Some miscellaneous/esoteric stuff (lessons learned)
       • Caching content and considerations
  4. Growing Up, “One Box” solution
       • Basic web application (discussion board, etc.)
       • Low traffic
       • Apache/PHP/MySQL on one machine
       • Bottlenecks will start showing up:
           • Most likely database before Apache/PHP
           • Disk I/O (InnoDB) or locking wait states (MyISAM)
           • Context switching between memory work (Apache) and CPU work (MySQL)
  5. ONE BOX
  6. Growing Up, “Two Box” solution
       • Higher traffic application (more demand)
       • Apache/PHP on box A, MySQL on box B
       • Same network = bad (*or is it?), separate network = good
       • Bottlenecks will start to be:
           • Disk I/O on MySQL machine (InnoDB)
           • Locking on MyISAM tables
           • Network I/O
  7. TWO BOX
  8. Growing Up, “Many Boxes with Replication” solution
       • Yet even higher traffic
       • Writes are separated from reads (master gets INSERT/UPDATE/DELETE, slaves get SELECTs)
       • Diminishes network bottlenecks, disk I/O, and other “in-box” issues
       • SELECTs and INSERT/UPDATE/DELETE can be routed within the application,
       • OR…
       • Load balancing can be used
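A minimal sketch of the "specified within the application" option, assuming a router that looks only at the first SQL keyword. The class name `QueryRouter` and the host names are hypothetical and do not come from the slides; anything that is not a plain SELECT is sent to the master to stay on the safe side.

```python
import itertools


class QueryRouter:
    """Send SELECTs to the slaves in turn, everything else to the master."""

    def __init__(self, master, slaves):
        self.master = master
        # cycle() yields the slaves one after another, forever
        self._slaves = itertools.cycle(slaves)

    def pick(self, sql):
        # First keyword of the statement, upper-cased for comparison
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb == "SELECT":
            return next(self._slaves)
        return self.master


# Hypothetical host names, for illustration only
router = QueryRouter("db-master", ["db-slave1", "db-slave2"])
```

In use, `router.pick(sql)` returns the host to connect to for that statement. A real application would also have to decide where to send reads that must see a write made moments earlier.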
  9. MANY BOX
  10. Slave Lag
       • When slaves can’t keep up with replication
       • They’re too busy:
           • Reading (production traffic)
           • Writing (replication)
       • Manifests as:
           • Comments/photos/any user-entered data don’t show up on the site right away
           • So users repeat the action, thinking that it didn’t “take” the first time, which makes the situation worse
  11. Insert funny photo here about slave lag*
       * Slave lag isn’t funny
  12. Hardware Load Balancing MySQL
  13. How It’s Usually Done
       • Standard MySQL master/slave replication
       • All writes (inserts/updates/deletes) from the application go to the master
       • All reads (selects) from the application go to a load-balanced VIP (virtual IP), spreading load across all slaves
  15. What Is Good About Load Balancing
       • You can add/remove slaves without affecting the application, since queries are atomic (sorta/kinda)
       • Additional monitoring point and some automatic failure handling
       • You can treat your whole slave pool as one resource, which makes capacity planning a lot easier if you know the ceiling of each slave
  16. How do you know the ceiling (maximum QPS capacity) of each slave?
       • First make a guess based on benchmarking (or look up some bench results from Tom’s Hardware, anandtech.com, etc.)
       • Then get more machines than that :)
       • Scary: in production during a lull in traffic, remove machines from the pool until you detect lag
       • The QPS you saw right before slave lag set in:
       • THAT is your ceiling
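The "QPS right before lag set in" rule can be sketched as a small function. This assumes you have recorded per-slave QPS and replication lag at intervals while removing machines from the pool; the function name, the sample format and the lag threshold are all hypothetical.

```python
def qps_ceiling(samples, max_lag_seconds=0):
    """Return the QPS observed just before lag first exceeded the threshold.

    samples: list of (qps_per_slave, lag_seconds) tuples in time order.
    Returns None if lag never set in, or if the very first sample already lags.
    """
    previous_qps = None
    for qps, lag in samples:
        if lag > max_lag_seconds:
            return previous_qps
        previous_qps = qps
    return None  # lag never appeared, so the ceiling was not reached
```

For example, the samples `[(400, 0), (520, 0), (650, 0), (800, 12)]` give a ceiling of 650 QPS per slave.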
  18. What Can Be Bad/Tough About Load Balancing:
       • Not all load balancers are created equal, and not all load-balancing companies expect this use of their product, so support may still be thin
       • Not that many people are doing it in high-volume situations yet, so support from the community isn’t large either
       • Gotchas:
           • port exhaustion,
           • health checks,
           • and balance algorithms
  19. Port Exhaustion
       • PROBLEM:
           • LB is basically a traffic cop, nothing more
           • Side effect of having a lot of connections: only ~64,511 ports per IP (VIP) to use
           • 64,511 ports / 120 sec per port…
           • ~535 max concurrent connections per IP*
       * Not really, but close to it: tcp_tw_recycle and tcp_tw_reuse
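The arithmetic on this slide can be checked with a short sketch. It assumes that the 64,511 figure is 65,535 minus the 1,024 reserved low ports, and that each port stays unusable for 120 seconds after its connection closes; read that way, the result is a sustained rate of new connections per second, and the exact quotient is a little above the slide's rounded ~535. The function name is hypothetical.

```python
# Assumption: usable ports = all 16-bit ports minus the reserved range below 1024
EPHEMERAL_PORTS = 65535 - 1024  # 64,511
# Assumption: a closed connection holds its port for 120 seconds
TIME_WAIT_SECONDS = 120


def max_new_connections_per_second(ips=1, ports=EPHEMERAL_PORTS,
                                   time_wait=TIME_WAIT_SECONDS):
    """Sustained new connections per second before ports run out."""
    return ips * ports / time_wait


print(round(max_new_connections_per_second(), 1))  # 537.6
```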
  21. Port Exhaustion (cont’d)
       • SOLUTION:
           • Use a pool of IPs on the database slave/farm side (Netscaler calls these “subnet IPs”, Alteon calls them “PiPs”)
           • Monitor port/connection usage, and know when it’s time to add more
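A rough sizing sketch for the IP pool, under the same assumptions as the port arithmetic (64,511 ports per IP, 120 seconds per port). The function names, the headroom factor and the alert threshold are hypothetical choices, not values from the slides.

```python
import math


def source_ips_needed(peak_connections_per_second, ports_per_ip=64511,
                      time_wait_seconds=120, headroom=0.5):
    """Number of pool IPs needed so each runs at `headroom` of its limit."""
    per_ip = ports_per_ip / time_wait_seconds * headroom
    return math.ceil(peak_connections_per_second / per_ip)


def port_usage_alert(ports_in_use, pool_size, ports_per_ip=64511,
                     threshold=0.8):
    """True when the pool's port usage has reached the alert threshold."""
    return ports_in_use / (pool_size * ports_per_ip) >= threshold
```

With these assumptions, a peak of 2,000 new connections per second at 50% headroom needs 8 pool IPs. How `ports_in_use` is collected depends on the load balancer and is left out here.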
  22. Health checks
       • LB won’t know anything about how well each MySQL slave is doing, and will pass traffic as long as port 3306 is answering
       • Load balancers don’t talk SQL, only things like plain old TCP, HTTP/S, maybe FTP
  23. Health checks (cont’d)
       • Two options:
           • 1. Dirty, but workable:
               • Have each server monitor itself, and shut off/firewall its own port 3306, even if MySQL is still running
  24. Health checks (cont’d)
       • 2. Cleaner, but a bit more work:
           • Have each server monitor itself, and run a check via xinetd (for example, a nagios monitor)
           • So the LB can tickle that port, and expect back an “OK” string. If not, it’ll automatically take that server out of the pool
           • Good for detecting and counteracting isolated incidents of ‘slave lag’ and automatically handling it
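The decision behind the “OK” string in option 2 might look like the sketch below. It assumes the check script has already read `Seconds_Behind_Master` from `SHOW SLAVE STATUS` (which is NULL when replication is not running) and passes it in; the function name, the failure strings and the 30-second threshold are hypothetical.

```python
def health_response(seconds_behind_master, max_lag_seconds=30):
    """Return the string the check would hand back to the load balancer.

    seconds_behind_master: the replication lag in seconds, or None when
    MySQL reports NULL because replication is stopped.
    """
    if seconds_behind_master is None:
        return "FAIL replication stopped"
    if seconds_behind_master > max_lag_seconds:
        return "FAIL lag=%d" % seconds_behind_master
    return "OK"
```

The load balancer only needs to match on “OK”; anything else takes the slave out of the pool until it catches up. Wiring the script to xinetd is left out of the sketch.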
  25. Health Checks
  26. Balancing Algorithms
       • Load balancers know HTTP, FTP, basic TCP, but not SQL
       • Two things to care about:
           • Should the server still be in the pool? (health checks)
           • How should load get balanced?
               • “least connections” or “least bandwidth” or “least anything” = BAD
               • Because not all SQL queries are created equal
               • Use “round-robin” or “random”
               • What happens if you don’t: Evil Favoritism™
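For illustration, the two recommended policies are simple enough to sketch in a few lines. On a real load balancer these are configuration options rather than code; the function names here are hypothetical.

```python
import itertools
import random


def round_robin(servers):
    """Endless iterator that hands out the servers in fixed rotation."""
    return itertools.cycle(servers)


def pick_random(servers, rng=random):
    """Pick one server uniformly at random."""
    return rng.choice(servers)
```

Neither policy looks at connection counts or bandwidth, which matches the slide's point: one open connection may carry a trivial lookup or a very expensive query, so "least connections" tells the balancer little about actual load.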
  27. Evil Favoritism
  29. Meanwhile… “in-the-box” considerations
       • Interleaving memory *does* make a difference
       • Always RAID10 (or RAID0 if you’re crazy*) but NEVER RAID5 (for InnoDB, anyway)
       • RAID10 has much more read capacity; it does carry a write penalty, but a smaller one than RAID5
       • Always have battery backup for HW RAID write caching
       • Or, don’t use write caching at all
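The RAID10 versus RAID5 write penalty can be made concrete with the commonly cited rule of thumb: each logical write costs 2 disk I/Os on RAID10 and 4 on RAID5 (read data, read parity, write data, write parity). These penalty values and the formula are a textbook approximation, not figures from the slides, and the function name is hypothetical.

```python
# Assumed rule-of-thumb disk I/Os per logical write
WRITE_PENALTY = {"RAID0": 1, "RAID10": 2, "RAID5": 4}


def effective_iops(raw_iops, write_fraction, level):
    """Approximate host-visible IOPS for a given read/write mix.

    raw_iops: combined IOPS of all drives in the array.
    write_fraction: share of the workload that is writes (0.0 to 1.0).
    """
    penalty = WRITE_PENALTY[level]
    return raw_iops / ((1 - write_fraction) + write_fraction * penalty)
```

Under these assumptions, an array with 1,000 raw IOPS and a 50% write mix delivers about 667 IOPS as RAID10 and 400 as RAID5.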
  30. “In-the-box” considerations (cont’d)
       • Always have proper monitoring (nagios, etc.) for failed/rebuilding drives
       • SATA or SCSI? SCSI! It’s worth it!
       • 10k or 15k RPM SCSI? 15k! It’s worth it!
       • (~20% performance increase when you’re disk bound)
       • For 64-bit Linux (AMD64 or EM64T):
           • Crank up the RAM for InnoDB’s buffer pool
           • Swapping = very, very bad. Either:
               • Turn it off (slightly scary)
               • Leave it on and set /proc/sys/vm/swappiness = 0
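A small monitoring sketch for the swappiness setting, assuming a Linux host where `/proc/sys/vm/swappiness` is readable. The function names are hypothetical; the path is passed as a parameter so the check can be pointed at a test file.

```python
def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Read the current swappiness value as an integer."""
    with open(path) as handle:
        return int(handle.read().strip())


def swappiness_ok(value, wanted=0):
    """True when the value matches the one recommended on the slide."""
    return value == wanted
```

A check like this could run alongside the drive monitoring mentioned above, so that a reboot or a config change that resets the value gets noticed.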
  31. 10k versus 15k drives?
       • Does it really matter that much?
       • Some in-the-wild proof…
  32. Slave lag in production: 10K drives vs. 15K drives
  33. Using MySQL with a SAN (Storage Area Network)
       • Do lay out storage the same as if it were local
       • Do make sure that the HBA (fiber card) driver is well supported by Linux
       • Don’t share volumes across databases
       • Don’t forget to correctly tune Queue Depth Size, which should increase from server HBA -> switch -> storage
  34. Caching your static content
  35. Caching Static Content
       • SQUID = good
       • Relieve your front-end PHP machines from looking up data that will never (or rarely) change
       • Generate static pages, and cache them in squid, along with your images
  36. Caching Static Content (cont’d)
       • Use SQUID to accelerate plain-old origin webservers, also known as “reverse-proxy” HTTP acceleration
       • Described here and elsewhere:
       • http://www.squid-cache.org/Doc/FAQ/FAQ-20.html
  37. Basic SQUID layout
       • squid accepts requests on 80
       • passes on cache misses to apache on 81
       • apache uses as its docroot an NFS mounted dir
       • should be on local subnet, or dedicated net
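The hit/miss flow of this layout can be sketched as a toy in-memory cache sitting in front of an origin fetch function. This only illustrates the control flow; it is not how squid is implemented, and it has no expiry or eviction. The names `ReverseProxyCache` and `fake_origin` are hypothetical.

```python
class ReverseProxyCache:
    """Answer from cache when possible, otherwise ask the origin once."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, path):
        if path in self._store:
            self.hits += 1           # served from cache (the port 80 side)
            return self._store[path]
        self.misses += 1             # passed on to the origin (apache on 81)
        body = self._fetch(path)
        self._store[path] = body
        return body


origin_calls = []


def fake_origin(path):
    """Stand-in for the origin webserver; records each request it receives."""
    origin_calls.append(path)
    return "contents of " + path


cache = ReverseProxyCache(fake_origin)
```

Requesting the same path twice reaches `fake_origin` only once; the second request is a cache hit.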
  38. Good HW layout for high-volume SQUIDing
       • Do use SCSI, and many spindles for disk cache dirs
       • Don’t use RAID
       • Do use network attached storage, or place the origin servers on separate machines
       • Do use ext3 with noatime for disk cache dirs
       • Do monitor squid stats
  39. Flickr: How We Roll
  40. Yummy SQUID stats:
       • >2800 images/sec, ~75-80% are cache hits
       • ~10 million photos cached at any time
       • 1.5 million cached in memory
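As a back-of-the-envelope reading of these stats, the sketch below estimates how many requests per second still reach the origin servers. It treats 2,800 images/sec as the total request rate (the slide gives it as a lower bound); the function name is hypothetical.

```python
def origin_requests_per_second(total_rps, hit_ratio):
    """Requests per second that miss the cache and go to the origin."""
    return total_rps * (1 - hit_ratio)


low = round(origin_requests_per_second(2800, 0.80))   # 560
high = round(origin_requests_per_second(2800, 0.75))  # 700
memory_share = 1.5 / 10  # 0.15 of cached photos are held in memory
```

So at the quoted hit rates the origin sees roughly 560 to 700 requests per second instead of 2,800, and about 15% of the cached photos are served from memory.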
  41. The End