You can't optimize what you cannot measure - Lone Star PHP

Published in: Technology

  1. You Can't Optimize What You Can't Measure // Juozas Kaziukėnas // juokaz.com // @juokaz
  2. Juozas Kaziukėnas, Lithuanian. You can call me Joe. More info: http://juokaz.com
  3. Why?
  4. Data
  5. Looking for lies
  6. Looking for lies
  7. Looking for lies
  8. Data
      • Metrics
      • Removes subjective decisions
      • Can be aggregated and related
      • If it’s 0, it is 0
      • Tesla was probably right
  9. Debugging production
  10. Debugging production
      • Behavioral patterns
      • When something changes - something is not right
      • You better notice it
      • Facebook deployment process
      • What caused it?
  11. What happened?
  12. What happened?
      • Previous and current state
      • Events
      • Correlated events
      • State information
  13. What happened here?
  14. What happened here?
  15. Nothing happened here
  16. Something happened here
  17. Logs suck
  18. Logs suck
      • Someone needs to be checking them
      • Need to be aggregated
      • Need to be visualized
      • High I/O to write
      • Distributed logs?
  19. Logs don’t suck
      • Logging exceptions
      • Detailed information about what happened
      • If used effectively (processing)
  20. Logs don’t suck
  21. I want to sleep
  22. I want to sleep
      • Call me when things go wrong
      • Otherwise everything is working
      • Things don’t break silently anymore
  23. Business problems
  24. Business problems
      • Detecting when business tools stop working
      • No PHP errors, no database errors
      • Failures of APIs, empty responses, invalid data
      • Things stop working silently
  25. Counting and timing
  26. Counting and timing
      • Record when something happens
      • Record how long it takes for something to happen
      • Use this to know how many things are happening
  27. The solution
  28. Statsd
  29. Statsd
      • Counters and timing
      • No need to initialize or set up counters
      • Non-blocking writes
      • Originally written by Etsy.com
      • Just works
      • https://github.com/etsy/statsd/
  30. StatsD::increment("coffee.melbourne");
  31. $start = microtime(true);
      have_a_coffee();
      $spent = (microtime(true) - $start) * 1000;
      StatsD::timing("coffee.timespent", $spent);
  32. How it works
  33. Graphite
  34. Graphite
      • Real-time charts
      • Data collection
      • Data aggregation
      • Specialized database
      • http://graphite.wikidot.com/
  35. How it works
  36. Lobster
  37. Logster
  38. Logster
      • Parse log files
      • Send data to graphite
      • Integration with existing applications easier
      • Also from Etsy.com
      • https://github.com/etsy/logster
  39. DataDogHQ.com
  40. DataDogHQ.com
  41. DataDogHQ.com
      • Hosted solution
      • Collect data from statsd
      • Store and aggregate from multiple servers
      • Chart combining any data
      • Real time charts
      • Alerts
  42. Amazon outage
  43. Amazon outage
  44. Amazon outage
  45. Amazon outage
  46. How I use this
  47. Web spiders
  48. Web spiders
      • ~250 nodes
      • Couple thousand requests per second
      • Increasing throughput - main goal
      • Increasing reliability - secondary goal
      • Metrics for: request time, error rate, error types, proxy failures, unknown responses, etc.
  49. Web spiders
      • Performance increased 1000% in 3 months
      • Reliability increased to being 24/7 stable
      • I can sleep
  50. Wrapping up
  51. Wrapping up
      • Measure things
      • Use statsd to collect data
      • Graph it
      • Sleep
  52. THANKS! Juozas Kaziukėnas @juokaz
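
The counter and timing calls on slides 30 and 31 depend on a small client that sends one UDP datagram per metric, which is what makes the writes non-blocking. The following is a minimal sketch of such a client, not the one used in the talk: the class name MiniStatsD, the host default and the format() helper are hypothetical names chosen for illustration. The `name:value|c` and `name:value|ms` line formats and UDP port 8125 are StatsD's documented defaults. The sketch assumes PHP 7.4 or newer.

```php
<?php
// Minimal StatsD client sketch. MiniStatsD, format() and the host default are
// assumptions made for illustration; they are not the client from the talk.
final class MiniStatsD
{
    public static string $host = '127.0.0.1'; // assumed local StatsD daemon
    public static int $port = 8125;           // StatsD's default UDP port

    // Build one datagram in the StatsD line format: <name>:<value>|<type>
    public static function format(string $name, $value, string $type): string
    {
        return sprintf('%s:%s|%s', $name, $value, $type);
    }

    // Count an event. The daemon creates the counter on first use.
    public static function increment(string $name, int $by = 1): void
    {
        self::send(self::format($name, $by, 'c'));
    }

    // Record a duration in milliseconds, rounded to a whole number.
    public static function timing(string $name, float $ms): void
    {
        self::send(self::format($name, (int) round($ms), 'ms'));
    }

    // Fire-and-forget over UDP: no reply is awaited, and failures are
    // swallowed so that metrics can never break the application.
    private static function send(string $payload): void
    {
        $socket = @fsockopen('udp://' . self::$host, self::$port, $errno, $errstr, 1.0);
        if ($socket === false) {
            return;
        }
        @fwrite($socket, $payload);
        fclose($socket);
    }
}

// Usage mirroring slides 30 and 31; usleep() stands in for have_a_coffee().
MiniStatsD::increment('coffee.melbourne');
$start = microtime(true);
usleep(1000);
MiniStatsD::timing('coffee.timespent', (microtime(true) - $start) * 1000);
```

Keeping the datagram formatting in its own function makes the wire format easy to check without a running daemon. Swallowing send errors is a deliberate trade-off: a lost metric is cheaper than a failed request.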
