Making performant sites

How a developer and a hoster should work together to achieve good uptime and a performant site, and to be prepared to scale

Published in: Technology

  1. 1. Performant sites You and your hoster want your site to perform well
  2. 2. me • Bernard Grymonpon - Wonko • bernard@openminds.be • Partner in Openminds & Metatale • Sysadmin - Web-engineer • Openminds offers high-quality, high-performance internet solutions
  3. 3. Today's talk
  4. 4. What a good site needs • Performance • Availability • Scaling (if needed) • Bandwidth
  5. 5. Problems: 1.0 to 2.0 • Serving HTML was easy, but... • A lot of hits (blame Web 2.0) • Each hit: processing PHP/Rails/Django • Each hit: reading and writing like crazy • Server response speed is driving the User Experience
  6. 6. More problems! • Ajax requests • RSS polls • Including content • User content everywhere
  7. 7. Fast sites • Why • Basic system administration • Some cases • Working together on Uptime, Performance and scaling • Larger scaling & some examples
  8. 8. One common goal Your hoster wants what you want... Getting the content where it should be, ASAP!
  9. 9. Different reasons
  10. 10. Your reasons • You paid for it • Your site is important • You promised it to a client • Sleep confident at night
  11. 11. Your hoster's reasons • Offer a stable service, for everyone • Clear the way for other requests • Time to invest in other projects • Sleep at night • Profit
  12. 12. Fast sites • Why • Basic system administration • Some cases • Working together on Uptime, Performance and scaling • Larger scaling & some examples
  13. 13. Hosting your site Getting to the “performance” part...
  14. 14. We use servers • Processors (php, rails, django, OS) • Storage (files, database, logging) • Memory (needed for speed) • Casing, some fans, circuitry... (because putting it all in a cardboard box doesn’t work)
  17. 17. My site is slow Put a faster processor in the server, ASAP Guess again... Stay focused, technical stuff coming up...
  21. 21. CPU can be a problem
      procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
       r  b  swpd   free   buff   cache   si   so    bi    bo   in    cs us sy id wa
       7  0 21092 165200  37316 1585388    0    0     1     0    0     1 50  5 45  0
       6  0 21092 163664  37316 1585388    0    0     0     0  142   279 73 26  1  0
       6  0 21092 168840  37316 1585392    0    0     0     0  148   330 75 25  0  0
       7  0 21092 165264  37316 1585392    0    0     0   316  235   245 75 25  0  0
       6  0 21092 160828  37316 1585392    0    0     0     0  153   277 73 27  0  0
       6  0 21092 168688  37316 1585396    0    0     0     0  149   383 78 22  0  0
       6  0 21092 165040  37316 1585396    0    0     0     0  141   179 76 24  0  0
       6  0 21092 169188  37316 1585396    0    0     0     0  143   264 77 23  0  0
       6  0 21092 168376  37316 1585396    0    0     0   360  264   221 75 25  0  0
      Callouts on the slide: running • heaps of free memory • no I/O • working like crazy
  22. 22. I/O is slowing down
  29. 29. I/O is slowing down
      procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
       r  b    swpd   free  buff  cache   si   so    bi    bo   in    cs us sy id wa
       0  0 2034316 204240   376 518220    0    0   448     0  532  3208  0  0 99  0
       0  0 2034316 203792   376 518668    0    0   448    72  576  3214  0  1 99  0
       0  0 2034316 203172   376 519156    0    0   488     0  617  3328  0  1 99  0
       0  2 2034300 193644   376 520076    0    0   860     0  608  3452 29  3 67  0
       2  0 2034300 185996   376 521448    0    0  1032     4  624  3955 33  5 62  0
       1  0 2034300 185944   376 521476    0    0     0    24  296  3600 14  1 84  0
       1  0 2034300 185924   376 521496    0    0     0    72  559  3648 10  3 87  0
       0  0 2034300 185928   376 521496    0    0     0     0  233  3221  3  1 96  0
       1  1 2034300 177192   376 521560    0    0    64     8  253  3658 25  4 71  0
       0  0 2034300 177664   376 521572    0    0    12    12  249  3415 12  2 86  0
      Callouts on the slide: running • buffering in action • I/O • cpu is not that stressed
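Reading vmstat rows like the ones on these slides can be automated. The following is a minimal sketch, assuming the 16-column layout shown in the slide header; the function names, the 90% threshold and the three labels are illustrative assumptions, not something from the talk.

```python
# Column names as printed in the vmstat header on the slides.
COLUMNS = ["r", "b", "swpd", "free", "buff", "cache", "si", "so",
           "bi", "bo", "in", "cs", "us", "sy", "id", "wa"]


def parse_vmstat_row(line):
    """Turn one vmstat data row into a dict of column name -> int."""
    values = [int(token) for token in line.split()]
    if len(values) != len(COLUMNS):
        raise ValueError("expected %d columns, got %d" % (len(COLUMNS), len(values)))
    return dict(zip(COLUMNS, values))


def describe(sample, busy_threshold=90):
    """Rough, hypothetical classification of a single sample."""
    cpu_busy = sample["us"] + sample["sy"]      # user + system CPU percentage
    if cpu_busy >= busy_threshold:
        return "cpu-bound"
    if sample["bi"] + sample["bo"] > 0:         # blocks read in / written out
        return "doing block I/O"
    return "mostly idle"


# One row from each capture in the slides.
cpu_row = parse_vmstat_row("6 0 21092 163664 37316 1585388 0 0 0 0 142 279 73 26 1 0")
io_row = parse_vmstat_row("0 2 2034300 193644 376 520076 0 0 860 0 608 3452 29 3 67 0")
print(describe(cpu_row))
print(describe(io_row))
```

A single sample says little on its own; in practice one would look at a run of samples, as the slides do.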
  33. 33. I/O is always slow keep this in mind • Filesystem, reading and writing files • Database, reading and writing data • Services, logging and “doing their thing”
  34. 34. Fast sites • Why • Basic system administration • Some cases • Working together on Uptime, Performance and scaling • Larger scaling & some examples
  35. 35. Performance Getting there at last...
  36. 36. Main problems we see • Bad database design, puts load on I/O • Bad code, puts load on processor • A lot of hits at once, puts load on both • Insecure sites, contact forms, ...
  37. 37. A first example Databases need love and care
  38. 38. Databases • Main page took 30+ seconds to load • Developer used unrealistically small datasets • The index page was the “heaviest” page
  39. 39. Solution: Indexes • Site on a single box, getting slow, using a database? CHECK THE INDEXES! • Solved the problem (0.5 seconds load time) • Easy concept, like an index in a book • Only where needed...
  40. 40. Where?
      SELECT text FROM articles WHERE category = 5
      SELECT text FROM articles JOIN authors ON authors.id = articles.author_id WHERE articles.category = 5
  41. 41.
      mysql> explain select * from articles where status = 2;
      +----+-------------+----------+------+---------------+------+---------+------+-------+-------------+
      | id | select_type | table    | type | possible_keys | key  | key_len | ref  | rows  | Extra       |
      +----+-------------+----------+------+---------------+------+---------+------+-------+-------------+
      |  1 | SIMPLE      | articles | ALL  | NULL          | NULL | NULL    | NULL | 50000 | Using where |
      +----+-------------+----------+------+---------------+------+---------+------+-------+-------------+
      1 row in set (0.00 sec)

      mysql> create index status_idx on articles(status);
      Query OK, 50000 rows affected (0.31 sec)
      Records: 50000  Duplicates: 0  Warnings: 0

      mysql> explain select * from articles where status = 2;
      +----+-------------+----------+------+---------------+------------+---------+-------+------+-------------+
      | id | select_type | table    | type | possible_keys | key        | key_len | ref   | rows | Extra       |
      +----+-------------+----------+------+---------------+------------+---------+-------+------+-------------+
      |  1 | SIMPLE      | articles | ref  | status_idx    | status_idx | 2       | const | 4757 | Using where |
      +----+-------------+----------+------+---------------+------------+---------+-------+------+-------------+
      1 row in set (0.00 sec)
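The before/after effect of the index can be reproduced in miniature. The sketch below uses Python's built-in sqlite3 module rather than MySQL, so the plan wording differs from the EXPLAIN output on the slide; the table contents are made up and the helper name `query_plan` is an assumption for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, status INTEGER, text TEXT)")
conn.executemany(
    "INSERT INTO articles (status, text) VALUES (?, ?)",
    [(i % 10, "article %d" % i) for i in range(50000)],   # made-up rows
)


def query_plan(connection, sql):
    """Return the human-readable last column of EXPLAIN QUERY PLAN."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " / ".join(row[-1] for row in rows)


sql = "SELECT * FROM articles WHERE status = 2"
plan_before = query_plan(conn, sql)    # no index yet: the whole table is scanned
conn.execute("CREATE INDEX status_idx ON articles(status)")
plan_after = query_plan(conn, sql)     # the planner can now search via status_idx

print(plan_before)
print(plan_after)
```

The exact plan text depends on the SQLite version, so the sketch prints it rather than asserting a fixed string.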
  42. 42. More to databases • MySQL - PostgreSQL - Oracle - ... • MySQL: storage engine choice • Replication • ...
  43. 43. A second case: The mystery of the spike in the graphs
  44. 44. The flow [diagram build across slides 44 to 49; surviving captions: *beep* *beep*, wtf!, Fixed!]
  50. 50. The problem • Client has no knowledge of the impact this has on the equipment • Massive hits on the servers (web and db) • Needed fast response from us
  51. 51. What could be done? • Mails could be sent out in batches • Application could be tuned • Server could be “prepared” • Monitoring would be guaranteed • We share the same goal, contact us
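Sending mail in batches, the first suggestion on this slide, could look roughly like the sketch below. It is only an illustration: `send` is a hypothetical callable standing in for whatever actually delivers the mail, and the batch size and pause are arbitrary assumptions.

```python
import time


def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def send_in_batches(recipients, send, size=100, pause_seconds=0.0):
    """Call `send(batch)` once per batch, pausing after each batch.

    The pause spreads the resulting hits on the web and database servers
    over time instead of triggering them all at once.
    """
    sent = 0
    for batch in batches(recipients, size):
        send(batch)
        sent += len(batch)
        time.sleep(pause_seconds)
    return sent


delivered = []   # records each batch instead of really sending mail
total = send_in_batches(["user%d@example.com" % i for i in range(250)],
                        delivered.append, size=100)
print(total, [len(batch) for batch in delivered])
```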
  52. 52. Case 3: Blame the media Advertisements on TV are killing servers
  54. 54. Good case • Server was tuned before the hit • LLMP stack (lighttpd instead of apache) • Nothing happened...
  60. 60. Another case: Bad code kills “Dear support, I measured it, the query takes 0.001 seconds. Executing such a fast query 500,000 times in a row can’t be that hard on your server. I see no point in changing my scripts”... *sigh*
  63. 63. The query: SELECT child FROM menu WHERE parent = 21 Result: 21 A perfect loop
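Even at 0.001 seconds each, 500,000 executions add up to 500 seconds of query time. The loop arises because menu item 21 is listed as its own child. A minimal sketch of a guard against that, assuming the menu has already been loaded into a dict (the name `children_of` and the sample data are hypothetical):

```python
def descendants(children_of, root):
    """Collect every item below `root`, visiting each item at most once.

    `children_of` maps a parent id to a list of child ids and stands in for
    the result of: SELECT child FROM menu WHERE parent = ?
    """
    seen = set()
    stack = [root]
    while stack:
        parent = stack.pop()
        for child in children_of.get(parent, []):
            if child == root or child in seen:
                continue   # already visited: following it again would loop
            seen.add(child)
            stack.append(child)
    return sorted(seen)


menu = {1: [20, 21], 21: [21, 22]}   # item 21 lists itself as a child
print(descendants(menu, 1))
```

Without the `seen` check, the self-reference on item 21 would keep the traversal running forever.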
  64. 64. Fast sites • Why • Basic system administration • Some cases • Working together on Uptime, Performance and scaling • Larger scaling & some examples
  65. 65. What should we do A common goal, remember?
  66. 66. Uptime, your part • Write secure code • Don’t do include($_GET['p']); • Develop locally
  67. 67. Uptime, your hoster • Monitoring! Alerting! Spare parts! Spare servers! • Redundancy, things should fail over easily • Backups, and tested restore procedures • Invest in new technology • Invest in people
  68. 68. Performance & you • Write good and efficient code • Remember, I/O is slow • Spend time on DB optimization • Test your application before launch, with normal datasets
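Including a file named directly by a request parameter lets the visitor choose what gets loaded. One common alternative is an allowlist, sketched here in Python rather than PHP; the page names and template paths are hypothetical.

```python
# Hypothetical mapping from the public "p" parameter to template files.
ALLOWED_PAGES = {
    "home": "templates/home.html",
    "contact": "templates/contact.html",
}


def resolve_page(requested, default="home"):
    """Return a template path for `requested`, never a caller-supplied path."""
    return ALLOWED_PAGES.get(requested, ALLOWED_PAGES[default])


print(resolve_page("contact"))
print(resolve_page("../../etc/passwd"))               # falls back to the default page
print(resolve_page("http://evil.example/shell.txt"))  # falls back to the default page
```

The request value is only ever used as a dictionary key, so it cannot reach the filesystem or a remote URL.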
  69. 69. Performance & hosting • Monitoring & tuning systems • An ongoing task! • Activate caching where possible • Filesystem level (memory) • Database tuning ...
  70. 70. Prepare to scale • If you start a new project, split read/write operations • Be prepared to partition your application • Normalize like crazy, denormalize when needed • Implement, or at least consider, caching
  71. 71. Scaling and hosting • Have spare CPU/IO power available • Invest time in testing setups • Test setups, both on I/O and CPU performance • Be creative: shared storage, replication, storing sessions, special setups...
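Splitting read and write operations early means the application already knows which statements could go to a replica later. A minimal sketch under simplifying assumptions: the class name and server names are hypothetical, plain strings stand in for database connections, and only statements starting with SELECT count as reads.

```python
import itertools


class ReadWriteRouter:
    """Pick a database handle depending on whether a statement writes."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Round-robin over replicas; fall back to the primary if there are none.
        self._replicas = itertools.cycle(replicas or [primary])

    def for_statement(self, sql):
        # Simplification: only plain SELECT statements count as reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._replicas) if is_read else self.primary


router = ReadWriteRouter("primary-db", ["replica-1", "replica-2"])
print(router.for_statement("SELECT text FROM articles WHERE category = 5"))
print(router.for_statement("UPDATE articles SET status = 2 WHERE id = 7"))
```

With replication, a replica may lag behind the primary, so reads that must see a just-written value may still need to go to the primary.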
  72. 72. A summary for you • Make good applications • Test your applications • Optimize your application in bad conditions • Talk to your hoster • Find bottlenecks before they show up • Normalize first, denormalize only when needed
  73. 73. Check your hoster • Knowledge • Monitoring • Redundancy • Support • Report back to the users
  74. 74. Fast sites • Why • Basic system administration • Some cases • Working together on Uptime, Performance and scaling • Larger scaling & some examples
  75. 75. A quick word on perspective Don’t go nuts
  76. 76. Don’t exaggerate • Wikipedia, 200+ servers • Twitter, 8 servers • LifeBook, 100 servers • Digg, approx. 100 servers • Google, 450,000 servers, 5 sites (??)
  77. 77. Scaling in the real world A quick overview
  78. 78. Common scaling [diagram build across slides 78 to 81: one Web server and one SQL server grow into progressively more Web and SQL servers; the final slide adds the labels X, Y, Y’ and Z]
  82. 82. Common solutions • Partitioning the problem • No single bottleneck (no “master server”) • Cache like crazy (memcached) • Redundancy • Balancing
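"Partitioning the problem" usually means deciding, from a key alone, which server owns a piece of data. The sketch below shows one simple way to do that, hash-based partitioning; the function name and the shard names are hypothetical, and this is not necessarily the scheme any of the sites mentioned in the talk use.

```python
import hashlib


def shard_for(key, shards):
    """Map a key to one shard name, the same one on every call."""
    digest = hashlib.sha256(str(key).encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


SHARDS = ["db-0", "db-1", "db-2", "db-3"]   # hypothetical server names
placement = {user_id: shard_for(user_id, SHARDS) for user_id in range(1000)}
print(shard_for(42, SHARDS) == shard_for(42, SHARDS))   # stable for a given key
```

A caveat of this modulo approach: changing the number of shards moves most keys to a different shard, so real setups often add a lookup table or consistent hashing.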
  83. 83. A big project? • Find and hire people with knowledge • Hire/buy the needed equipment • Test-drive the application before going live • Don’t be ashamed to ask for help
  84. 84. LiveJournal • 90+ servers • 50M+ hits per day, 1k+ per second at peak • typical road • partitioning • created memcached
  85. 85. Twitter • Built with Rails, running on 8 boxes • Their observations: • indexes • denormalisation • caching, caching, caching, caching • your application should be partitionable
  86. 86. Flickr • Massive storage (remember I/O is slow?) • Wrote own FS, partitioned • Made to scale, separate reading/writing • Cache invalidation is hard
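The simplest form of cache invalidation is to drop the cached copy whenever the underlying value is written. A minimal single-process sketch, in which plain dicts stand in for a database and a cache server such as memcached; the class name and the sample key are hypothetical.

```python
class CachedStore:
    """Cache-aside reads with invalidation on write."""

    def __init__(self, backend):
        self.backend = backend       # stands in for the slow storage
        self.cache = {}              # stands in for the cache server
        self.backend_reads = 0

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        self.backend_reads += 1      # the slow path
        value = self.backend[key]
        self.cache[key] = value
        return value

    def put(self, key, value):
        self.backend[key] = value
        self.cache.pop(key, None)    # drop the stale copy


store = CachedStore({"photo:1": "sunset.jpg"})
store.get("photo:1")                 # miss: read from the backend
store.get("photo:1")                 # hit: served from the cache
store.put("photo:1", "sunrise.jpg")  # write invalidates the cached copy
print(store.get("photo:1"), store.backend_reads)
```

This stays easy only because there is one cache and one writer; with several cache servers and concurrent writers, every copy has to be invalidated, which is where it gets hard.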
  87. 87. Wikipedia • Serving is the bottleneck • Squid reverse proxies to the rescue • Memcached to the rescue • Partitioned (per language) • Cache invalidation through multicast! • They develop the MediaWiki software to their needs...
  91. 91. Conclusions • Talk to your hoster; they should be a companion, not a distant enemy • Performance and scaling demand effort and knowledge • A good site is a combination of many factors (application, code, servers, OS settings, tuning...)
  92. 92. Q&A Discussion Bernard Grymonpon - www.openminds.be
