Metrics driven engineering (velocity 2011)
Transcript

  • 1. METRICS-DRIVEN ENGINEERING at Etsy. Kellan Elliott-McCrea, VP of Eng. kellan@etsy.com @kellan (Tuesday, June 5, 12)
  • 4. What is Etsy?
  • 5. 8.5+ million items in the marketplace
  • 6. 400,000+ active
  • 7. $300+ million in sales in 2010 ~$41 million/month
  • 8. > $1000 / minute
  • 9. > 1 billion page views / month
  • 10. business in over 150 countries
  • 11. deploy the site, every ~20 minutes
  • 12. engineering team grew ~4x in 2010
  • 13. Metrics?
  • 14. Logs, Graphs, Trends, and Correlations
  • 15. Metrics Driven?
  • 16. Making Decisions
  • 17. How many visitors are using this thing?
  • 18. Can we deploy that to 100% of our visitors?
  • 19. Did we make it faster?
  • 20. Did I just break something?
  • 21. Q. WHO MAKES THESE GRAPHS? A. Well, the Ops team racks the servers, manages the network, installed monitoring tools, wears the pagers, blah, blah, blah...
  • 22. but... Engineers build the application.
  • 23. Dev + Ops
  • 24. ACCESS
  • 25. Yes! No.
  • 26. “Engineers are too busy!”
  • 27. Here’s the BIG SECRET...
  • 28. ... MAKE IT EASY!
  • 29. Simple, open source tools
  • 30. Cacti (network, SNMP); Ganglia (machines); Graphite (application); Splunk (log analysis, nightly reports); Nagios (alerting)
  • 31. Ganglia ★ cluster oriented ★ huge community contributed recipes ★ 2.0 released today (including several Flickr and Etsy patches!) ★ gmetad makes it easy to track custom metrics
  • 33. Graphite ★ super flexible collection and display ★ per metrics buckets ★ single instance ★ super easy to write and use custom display functions
  • 34. Logging
  • 35. Logger::log_error("User login failed. Reason: $msg for $username", "login");
  • 36. web0054 [Fri Mar 04 16:27:48 2011] [error] [login] [14531658] User login failed. Reason: wrong password for ...
  • 37. web0054 [Fri Mar 04 16:27:48 2011] [error] [login] [14531658] User login failed. Reason: wrong password for ...
  • 38. web0054 [Fri Mar 04 16:27:48 2011] [error] [login] [14531658] User login failed. Reason: wrong password for ...
  • 39. web0054 [Fri Mar 04 16:27:48 2011] [info] [login] [14531658] User login failed. Reason: wrong password for ...
  • 40. web0054 [Fri Mar 04 16:27:48 2011] [info] [login] [14531658] User login failed. Reason: wrong password for ...
  • 41. web0054 [Fri Mar 04 16:27:48 2011] [info] [login] [14531658] User login failed. Reason: wrong password for ...
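
The log lines on slides 36 to 41 follow a fixed bracketed layout, which is what makes them easy to count and graph later. As a rough sketch (not Etsy's own tooling), the Python below splits one such line into fields. The field names `host`, `timestamp`, `level`, `namespace` and `token` are labels chosen here for illustration; in particular, the slides do not say what the number in the fourth bracket means.

```python
import re
from typing import Optional

# Layout assumed from the slides: host [timestamp] [level] [namespace] [number] message
# The group names are illustrative labels, not names taken from the talk.
LINE_RE = re.compile(
    r"^(?P<host>\S+) "
    r"\[(?P<timestamp>[^\]]+)\] "
    r"\[(?P<level>[^\]]+)\] "
    r"\[(?P<namespace>[^\]]+)\] "
    r"\[(?P<token>[^\]]+)\] "
    r"(?P<message>.*)$"
)


def parse_line(line: str) -> Optional[dict]:
    """Return the fields of one application log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


sample = (
    "web0054 [Fri Mar 04 16:27:48 2011] [error] [login] [14531658] "
    "User login failed. Reason: wrong password for ..."
)
record = parse_line(sample)
print(record["level"], record["namespace"])
```

Once lines are structured like this, counting by `level` or `namespace` is a dictionary lookup rather than ad hoc grepping.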
  • 42. Counting and Timing http://code.flickr.com/blog/2008/10/27/counting-timing/
  • 43. Logster
  • 44. Logster https://github.com/etsy/logster
  • 45. Forked from ganglia-logtailer: - Daemon mode (only cron mode) + Support for Graphite + Simplified parsing scripts
  • 46. (log excerpt)
      web0001 [04:28:54 2011] [warning] [client 10.101.x.x] Gaaaaahhh!
      web0001 [04:28:54 2011] [error] [client 10.101.x.x] Help me, Rhonda.
      web0001 [04:28:54 2011] [error] [client 10.101.x.x] Oh noooooo!
      web0001 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!
      web0001 [04:28:54 2011] [error] [client 10.101.x.x] Heeeeeeellllllllllllllppppp!
      web0001 [04:28:54 2011] [error] [client 10.101.x.x] Oh noooooo!
      web0001 [04:28:54 2011] [fatal] [client 10.101.x.x] Gaaaaahhh!
      web0201 [04:28:54 2011] [warning] [client 10.101.x.x] Gaaaaahhh!
      web0034 [04:28:54 2011] [warning] [client 10.101.x.x] Oh nooooooooooo
      web0001 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
      web1101 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
      web0201 [04:28:54 2011] [error] [client 10.101.x.x] Youve been eaten by a grue.
      web0055 [04:28:54 2011] [fatal] [client 10.101.x.x] Gaaaaahhh!!!
      web0002 [04:28:54 2011] [warning] [client 10.101.x.x] Sky is falling.
      web0089 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
      web0020 [04:28:54 2011] [error] [client 10.101.x.x] Sky is falling.
      web1101 [04:28:54 2011] [fatal] [client 10.101.x.x] Gaaaaahhh!
      web0055 [04:28:54 2011] [warning] [client 10.101.x.x] Gaaaaahhh!
      web0001 [04:28:54 2011] [warning] [client 10.101.x.x] Oh nooooooooooo
      web0001 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
      web0034 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
      web0087 [04:28:54 2011] [fatal] [client 10.101.x.x] Sky is falling.
      web0002 [04:28:54 2011] [error] [client 10.101.x.x] Oh noooooo!
      web0201 [04:28:54 2011] [fatal] [client 10.101.x.x] Gaaaaahhh!
      web0077 [04:28:54 2011] [warning] [client 10.101.x.x] Gaaaaahhh!
      web0355 [04:28:54 2011] [warning] [client 10.101.x.x] Oh nooooooooooo
      web0052 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
      web0001 [04:28:54 2011] [error] [client 10.101.x.x] Gaaaaahhh!!!
      web0003 [04:28:54 2011] [error] [client 10.101.x.x] Youve been eaten by a grue.
      web0066 [04:28:54 2011] [fatal] [client 10.101.x.x] Gaaaaahhh!!!
      web0001 [04:28:54 2011] [warning] [client 10.101.x.x] Sky is falling
  • 47. Fatals, Errors, Warnings
  • 48. ★ runs out of cron ★ maintains a cursor into log files ★ supports ganglia and graphite ★ custom parsers much easier to write than gmetad
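
Slides 46 to 48 describe turning a noisy error log into three numbers: fatals, errors and warnings. The sketch below shows the general shape of such a parser in plain Python. It imitates the "feed lines in, read metrics out" style but is a standalone illustration, not the actual Logster parser API; the class name `SeverityCounter`, its method names and the metric names are all made up here. It also omits the file cursor that Logster maintains between cron runs.

```python
import re
from collections import Counter

# Match the bracketed severity tag as it appears in the slide-46 log excerpt.
LEVEL_RE = re.compile(r"\[(warning|error|fatal)\]")
LEVELS = ("warning", "error", "fatal")


class SeverityCounter:
    """Count log lines per severity and report them as per-second rates."""

    def __init__(self) -> None:
        self.counts = Counter()

    def parse_line(self, line: str) -> None:
        """Inspect one log line and bump the counter for its severity, if any."""
        match = LEVEL_RE.search(line)
        if match:
            self.counts[match.group(1)] += 1

    def get_state(self, duration_seconds: float) -> dict:
        """Return hypothetical metric names mapped to events per second."""
        return {
            f"log.{level}_per_sec": self.counts[level] / duration_seconds
            for level in LEVELS
        }


counter = SeverityCounter()
for log_line in (
    "web0001 [04:28:54 2011] [warning] [client 10.101.x.x] Gaaaaahhh!",
    "web0001 [04:28:54 2011] [error] [client 10.101.x.x] Help me, Rhonda.",
    "web0001 [04:28:54 2011] [error] [client 10.101.x.x] Oh noooooo!",
    "web0001 [04:28:54 2011] [fatal] [client 10.101.x.x] Gaaaaahhh!",
):
    counter.parse_line(log_line)

print(counter.get_state(duration_seconds=60))
```

Reporting rates rather than raw counts keeps the graph comparable when the interval between runs changes.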
  • 49. Apache access logs
  • 50. LogFormat "%h %l %u %t \"%r\" %>s %b" common
  • 51. LogFormat "%{X-Forwarded-For}i %{True-Client-IP}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %{etsy_shop_id}n %{etsy_uaid}n %V %{etsy_ab_selections}n %{etsy_request_uuid}n %{etsy_api_consumer_key}n %{etsy_api_method_name}n %{php_memory_usage_bytes}n %{php_time_microsec}n %D" combined
  • 52. %{etsy_ab_selections}n
  • 53. %{etsy_uaid}n
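
Because the extended `LogFormat` on slide 51 puts the numeric fields at the end of each line, they can be read without parsing the quoted request and user-agent strings. The following is a minimal sketch under stated assumptions: that the last three fields are `php_memory_usage_bytes`, `php_time_microsec` and `%D` (request time in microseconds) in that order, that they are logged as integers, and that Apache writes `-` when a note is unset. The sample line and its values are invented for illustration.

```python
from typing import Optional, Tuple


def _to_int(field: str) -> Optional[int]:
    """Convert a numeric log field, treating Apache's '-' placeholder as missing."""
    return None if field == "-" else int(field)


def trailing_fields(line: str) -> Tuple[Optional[int], Optional[int], Optional[int]]:
    """Return (php_memory_bytes, php_time_usec, request_time_usec) from one line."""
    parts = line.rsplit(None, 3)  # split only the last three fields off the end
    if len(parts) != 4:
        raise ValueError("expected at least four whitespace-separated fields")
    _, memory_bytes, php_usec, total_usec = parts
    return _to_int(memory_bytes), _to_int(php_usec), _to_int(total_usec)


# Invented sample line shaped like the slide-51 format.
sample = (
    '203.0.113.7 - - - [04/Mar/2011:16:27:48 +0000] "GET /listing/1 HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0" - - www.example.com - - - - 8388608 41250 43817'
)
memory_bytes, php_usec, total_usec = trailing_fields(sample)
print(memory_bytes, php_usec, total_usec)
```

Splitting from the right sidesteps the spaces inside the timestamp and quoted fields; a full parser would be needed to reach fields such as `etsy_ab_selections` in the middle of the line.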
  • 54. Graphs
  • 55. “If Engineering at Etsy has a religion, it’s the Church of Graphs. If it moves, we track it.” - Erik Kastner http://codeascraft.etsy.com/2011/02/15/measure-anything-measure-everything/
  • 57. StatsD
  • 58. StatsD https://github.com/etsy/statsd/
  • 59. StatsD::increment("logins.success"); StatsD::timing("gearman.time", $msec);
  • 60. 90th pct, average, lower: StatsD::timing("gearman.time", $msec);
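
The two PHP calls on slide 59 boil down to small text datagrams sent over UDP. As a hedged sketch of what a client does, the Python below formats a counter and a timer in the StatsD line format (`name:value|c` and `name:value|ms`) and sends them fire-and-forget. The function names are invented here, and the host and port defaults (`127.0.0.1`, `8125`) are assumptions about a local StatsD daemon, not details from the talk.

```python
import socket


def format_counter(name: str, value: int = 1) -> str:
    """Build a StatsD counter line, e.g. 'logins.success:1|c'."""
    return f"{name}:{value}|c"


def format_timing(name: str, milliseconds: int) -> str:
    """Build a StatsD timer line, e.g. 'gearman.time:320|ms'."""
    return f"{name}:{milliseconds}|ms"


def send(payload: str, host: str = "127.0.0.1", port: int = 8125) -> int:
    """Send one metric as a UDP datagram and return the number of bytes sent.

    UDP is fire-and-forget: nothing is raised if no daemon is listening,
    so instrumenting a code path cannot slow it down or break it.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload.encode("ascii"), (host, port))


# Usage (left as comments so the example has no network side effects):
#   send(format_counter("logins.success"))
#   send(format_timing("gearman.time", 320))
print(format_counter("logins.success"), format_timing("gearman.time", 320))
```

The daemon aggregates timers over each flush interval, which is where per-interval series such as the 90th percentile, average and lower bound on slide 60 come from.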
  • 61. Ad hoc: name value timestamp
  • 62. echo "events.deploy.site 1 `date +%s`" | nc graphite.etsycorp.com 2003
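
The `echo ... | nc` one-liner on slide 62 uses Graphite's plaintext protocol: one `name value timestamp` line per data point, sent over TCP to port 2003. A rough Python equivalent is sketched below. The function names are chosen here for illustration, and the host in the usage comment is a placeholder rather than a real server.

```python
import socket
import time
from typing import Optional


def format_metric(name: str, value: float, timestamp: Optional[int] = None) -> str:
    """Build one plaintext-protocol line: 'name value timestamp' plus a newline."""
    if timestamp is None:
        timestamp = int(time.time())  # same role as `date +%s` in the shell version
    return f"{name} {value} {int(timestamp)}\n"


def send_metric(line: str, host: str, port: int = 2003, timeout: float = 5.0) -> None:
    """Open a TCP connection to the Graphite listener and write one metric line."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(line.encode("ascii"))


# Usage, with a placeholder hostname:
#   send_metric(format_metric("events.deploy.site", 1), "graphite.example.com")
print(format_metric("events.deploy.site", 1, 1299256068), end="")
```

Recording a deploy as a data point with value 1 is what later lets it be drawn as a vertical line on top of any other graph.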
  • 63. Correlations
  • 64. echo "events.deploy.site 1 `date +%s`" | nc graphite.etsycorp.com 2003
  • 65. Trends + Events target=drawAsInfinite(events.deploy.site)
  • 66. What Happened?
  • 67. Holt-Winters
  • 68. "Forecasting Sales by Exponentially Weighted Moving Averages". Peter
  • 69. "Aberrant Behavior Detection in Time Series for Network Monitoring".
  • 70. "Holt-Winters Forecasting Applied to Poisson Processes in Real-Time".
  • 71. holtWintersConfidence(Upper|Lower)
  • 72. holtWintersAberration
  • 73. business metrics with confidence bands == alertable business metrics
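
To make the "confidence bands" idea on slides 67 to 73 concrete, here is a small sketch of additive Holt-Winters forecasting with a smoothed per-season deviation, in the spirit of the aberrant-behavior paper cited on slide 69. It is an illustration only and not Graphite's implementation: the initialisation (mean of the first period, zero trend) and the default values for `alpha`, `beta`, `gamma` and `delta` are assumptions made for this example.

```python
from typing import List, Tuple


def holt_winters_bands(
    series: List[float],
    period: int,
    alpha: float = 0.5,   # level smoothing (assumed default)
    beta: float = 0.1,    # trend smoothing (assumed default)
    gamma: float = 0.3,   # seasonal and deviation smoothing (assumed default)
    delta: float = 3.0,   # band width in units of smoothed deviation
) -> List[Tuple[float, float, float]]:
    """Return (forecast, lower, upper) for every point after the first period."""
    if len(series) < 2 * period:
        raise ValueError("need at least two full periods of data")

    # Simple initialisation from the first period.
    level = sum(series[:period]) / period
    trend = 0.0
    seasonal = [y - level for y in series[:period]]
    deviation = [0.0] * period

    bands = []
    for t in range(period, len(series)):
        y = series[t]
        i = t % period
        # One-step-ahead forecast made before seeing y.
        forecast = level + trend + seasonal[i]
        width = delta * deviation[i]
        bands.append((forecast, forecast - width, forecast + width))

        # Update level, trend, seasonal component and smoothed absolute error.
        previous_level = level
        level = alpha * (y - seasonal[i]) + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
        seasonal[i] = gamma * (y - level) + (1 - gamma) * seasonal[i]
        deviation[i] = gamma * abs(y - forecast) + (1 - gamma) * deviation[i]
    return bands


# A repeating pattern with one injected spike at index 9.
series = [10, 20, 30, 20] * 3
series[9] = 120
bands = holt_winters_bands(series, period=4)
forecast, lower, upper = bands[9 - 4]
print("spike outside band:", series[9] > upper)
```

A point that falls outside its band is what makes a business metric alertable: the threshold follows the daily or weekly rhythm of the data instead of being a fixed number.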
  • 74. 16,000 metrics in GRAPHITE (plus 32,000 metrics in GANGLIA)
  • 75. 16,000 metrics in GRAPHITE (plus 32,000 metrics in GANGLIA)
  • 76. Dashboards
  • 77. Dashboards
  • 78. Dashboards
  • 79. Hard <a href="http://graphite.etsycorp.com/render?from=-1hours&width=800&height=600&title=File+or+Script+Not+Found&yMin=0&target=webs.errorLog.notExist&target=drawAsInfinite%28deploys.config.production%29&target=drawAsInfinite%28deploys.web.production%29&target=drawAsInfinite%28deploys.search.production%29&target=drawAsInfinite%28deploys.imagestorage.other%29&colorList=%2300cc00,%230000ff,%23ff0000,%23006633,%23cc6600"> <img src="http://graphite.etsycorp.com/render?from=-1hours&width=280&height=220&title=File+or+Script+Not+Found&hideLegend=1&yMin=0&target=webs.errorLog.notExist&target=drawAsInfinite%28deploys.config.production%29&target=drawAsInfinite%28deploys.web.production%29&target=drawAsInfinite%28deploys.search.production%29&target=drawAsInfinite%28deploys.imagestorage.other%29&colorList=%2300cc00,%230000ff,%23ff0000,%23006633,%23cc6600"> </a>
  • 80. Easy! $g = new Graphite($time); $g->setTitle('File Not Found'); $g->addMetric('webs.errorLog.notExist', '#00cc00'); $g->showDeploys(true); echo $g->getDashboardHTML(280, 220);
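
The contrast between slides 79 and 80 is that a small helper can hide the long render URL. The sketch below shows one way such a helper might assemble the URL in Python. It is not the PHP `Graphite` class from the slide; the function name, its parameters and the `graphite.example.com` host are assumptions for illustration, while the query parameters (`from`, `width`, `height`, `title`, `yMin`, `target`) and the `drawAsInfinite` wrapper are the ones visible on slide 79.

```python
from typing import Iterable
from urllib.parse import urlencode


def render_url(
    base: str,
    metric: str,
    deploy_events: Iterable[str],
    title: str,
    width: int = 280,
    height: int = 220,
    hours: int = 1,
) -> str:
    """Build a Graphite render URL for one metric with deploy markers overlaid."""
    params = [
        ("from", f"-{hours}hours"),
        ("width", width),
        ("height", height),
        ("title", title),
        ("yMin", 0),
        ("target", metric),
    ]
    # Each deploy event series is drawn as a vertical line across the graph.
    params += [("target", f"drawAsInfinite({event})") for event in deploy_events]
    return f"{base}/render?{urlencode(params)}"


url = render_url(
    "http://graphite.example.com",  # placeholder host
    "webs.errorLog.notExist",
    ["deploys.config.production", "deploys.web.production"],
    title="File Not Found",
)
print(url)
```

Wrapping the URL construction this way is what lets an engineer add a graph to a dashboard in a few lines instead of hand-editing query strings.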
  • 81. 48 dashboards by 32 engineers
  • 82. Application health
  • 83. High-level visibility
  • 84. Low MTTD
  • 85. Confidence
  • 86. Make metrics
  • 87. Make metrics
  • 88. Make metrics
  • 89. Not that much
  • 90. codeascraft.etsy.com github.com/etsy/statsd github.com/etsy/logster bitbucket.org/maplebed/ganglia-logtailer
  • 91. Questions?
