Trending with Purpose

This talk evaluates some easy ways to extract useful trending and capacity planning out of your existing monitoring investment. Using Nagios performance data, we examine simple behaviors with PNP4Nagios and graduate on to more insightful analytics with Graphite. With metrics in hand we look at the questions that IT /should/ be asking, such as:

* What sort of data should I trend?
* Why do I need to trend it?
* How do Operational or Engineering trends relate to Business or Transactional monitoring?
* How does this data impact our customer relationship and/or their bottom-line?

Finally, we look at creative ways to get profiling data out of your production systems with a minimum amount of effort from your development team.

Published in: Technology

  • Transcript

    • 1. Trending with Purpose Jason Dixon
    • 2. Trending
    • 3. A general direction in which something is developing or changing.
    • 4. Why do we trend?
    • 5. Planning for growth.
    • 6. Predicting or diagnosing failure.
    • 7. ... instead of finding out from your customer.
    • 8. Operational Questions
    • 9. Just because a host or service responds, how do you know it’s working?
    • 10. If you haven’t measured good, how will you recognize bad?
    • 11. You don’t know what might break, so collect everything now.
    • 12. "Those who cannot remember the past are condemned to repeat it" George Santayana
    • 13. "I don’t care if my servers are on fire as long as they’re making me money" Business Owner
    • 14. How can we add Business Value?
    • 15. Start simple.
    • 16. Dance with the one who brought you.
    • 17. Nagios
    • 22. Nagios
        • Fault Detection
        • Notifications
        • Escalations
        • Acknowledgements/Downtime
        • http://www.nagios.org/
    • 23. Nagios
    • 30. Nagios
        • What it does well:
            • Free
            • Extensible
            • Plugins
            • Configuration templates
            • Popular (lesser of all free evils)
            • Log metrics (“performance data”)
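Nagios "performance data" is the part of a plugin's output that carries metrics, conventionally written as space-separated `label=value[UOM];warn;crit;min;max` items. The following is a minimal Python sketch of turning such a string into Graphite-style metric lines. The function names, the `nagios.web01.http` prefix, and the sample string are assumptions made for illustration, not something shown in the talk, and the regular expression only covers simple numeric values.

```python
import re

# One perfdata item: label=value[UOM];warn;crit;min;max
# The label may be wrapped in single quotes when it contains spaces.
PERF_RE = re.compile(r"(?:'([^']+)'|([^\s=']+))=([-+]?\d*\.?\d+)([^;\s]*)")


def parse_perfdata(perfdata):
    """Return {label: (value, unit)} from a Nagios performance-data string."""
    metrics = {}
    for quoted, bare, value, unit in PERF_RE.findall(perfdata):
        metrics[quoted or bare] = (float(value), unit)
    return metrics


def to_graphite_lines(prefix, perfdata, epoch):
    """Format each perfdata item as 'path value epoch' (hypothetical naming scheme)."""
    lines = []
    for label, (value, _unit) in sorted(parse_perfdata(perfdata).items()):
        # Replace characters that would be awkward in a dotted metric path.
        safe = re.sub(r"[^A-Za-z0-9_-]", "_", label)
        lines.append(f"{prefix}.{safe} {value} {epoch}")
    return lines


# Illustrative input only; real plugin output varies.
sample = "time=0.012s;1.000;2.000;0.000 size=1024B;;;0"
lines = to_graphite_lines("nagios.web01.http", sample, 1300000000)
```

Warning and critical thresholds are deliberately ignored here, since only the measured value is useful for trending.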
    • 31. Nagios
    • 38. Nagios
        • Where it sucks:
            • Interface
            • (Lack of) Scalability
            • Promotes bad habits
            • Acknowledgements never expire
            • Configuration (over-)flexibility
            • Flapping
    • 39. Nagios
    • 40. PNP4Nagios
    • 45. PNP4Nagios
        • Basic graphing & dashboard capabilities
        • Retrieves Nagios performance data
        • Creates graphs with RRD
        • Limited introspection/correlation
        • http://www.pnp4nagios.org/
    • 46. PNP4Nagios
    • 47. Graphite
    • 52. Graphite
        • Metric storage
        • Complex graph creation
        • Web and “CLI” interfaces
        • Created and released by Orbitz.com
        • http://graphite.wikidot.com/
    • 53. Graphite
    • 54. Graphite
    • 62. Graphite
        • The good stuff:
            • Horizontally scalable
            • Rapid graph prototyping (CLI)
            • Graph disparate data points
            • Numerous formulas available
                • derive, transform, average, sum, etc...
            • Share graphs with other users
            • Supports existing RRD databases
    • 63. Graphite
    • 67. Graphite
        • The not-so-good stuff:
            • Not a dashboard (well, sorta)
            • No hover details
            • Single y-axis
    • 68. Graphite (diagram labels: Web, Carbon, Metrics, Whisper)
    • 69. Carbon
    • 73. Carbon
        • agent - starts other daemons, receives metrics and pipelines them to cache
        • aggregator - aggregate and transform your data before storage
        • cache - caches metrics for real-time graphing, pipelines them to persister
        • persister - writes persistent data to disk
    • 74. Whisper
    • 78. Whisper
        • Metrics database format
        • Supplanted RRDtool
        • Accepts out-of-order data
        • Supports pipelining of data in a single operation (multiplexing)
    • 79. Coming Soon - Ceres
    • 82. Coming Soon - Ceres
        • Smaller files - doesn’t pad missing datapoints
        • Doesn’t store timestamps, calculates them
        • Federates individual datapoints
    • 83. Graphite Web
    • 87. Graphite Web
        • Traditional web interface
        • Javascript CLI
        • Rudimentary dashboard
        • Django application
    • 88. Sending metrics to Graphite
    • 93. Sending metrics to Graphite
        • Connect to Carbon socket (tcp/2003)
        • Send your data
            use IO::Socket::INET;
            my $sock = IO::Socket::INET->new('127.0.0.1:2003');
            $sock->send("endpoint.app.metric $value $epoch\n");
        • ???
        • Profit!
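The same two steps (connect to Carbon on tcp/2003, then send `path value epoch` followed by a newline) can be sketched in Python. This is a minimal illustration rather than an official client: the helper names and the default host are assumptions, and error handling and reconnection are left out.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one line of Carbon's plaintext protocol: 'path value epoch\\n'."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metrics(lines, host="127.0.0.1", port=2003, timeout=5.0):
    """Send already-formatted lines to Carbon over a single TCP connection."""
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)


line = format_metric("endpoint.app.metric", 42, 1300000000)
# send_metrics([line])  # uncomment when a Carbon listener is reachable
```

Batching several lines into one connection keeps the per-metric overhead low when a collector reports many values at once.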
    • 94. What should we collect?
    • 95. App/DB Profiling
    • 99. App/DB Profiling
        • How fast is our:
            • function foo() for each iteration
            • SQL query
            • 3rd party API service (e.g. payment gateway, social media)
    • 100. App/DB Profiling
    • 104. App/DB Profiling
        • How many times do we:
            • call function foo()
            • register a new user
            • chargeback a sale
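Both profiling questions ("how fast" and "how many times") can be answered with one small wrapper around the code of interest. The sketch below is an assumption about how this might look in Python. It keeps results in memory only, and the `profiled` helper and the `app.foo` metric name are hypothetical.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

counters = defaultdict(int)   # how many times did we ...?
timings = defaultdict(list)   # how fast is our ...? (milliseconds per call)


@contextmanager
def profiled(name):
    """Count one execution of the wrapped block and record its duration."""
    start = time.perf_counter()
    try:
        yield
    finally:
        counters[name] += 1
        timings[name].append((time.perf_counter() - start) * 1000.0)


def foo():
    # Stand-in for real work such as a SQL query or a third-party API call.
    with profiled("app.foo"):
        return sum(range(1000))


for _ in range(3):
    foo()
```

In practice the two dictionaries would be flushed periodically to a metrics backend instead of growing without bound.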
    • 105. IT exists to support Business.
    • 106. Not the other way around.
    • 107. Business Profiling
    • 111. Business Profiling
        • How many sales did we generate this hour:
            • per sku
            • from the recent ad campaign
            • from users in West Bumble, Arkansas
    • 112. Last week?
    • 113. Last month?
    • 114. Last year?
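Comparing a current series against the same series from an earlier period is something Graphite can do at query time. The sketch below builds a render URL that requests a summed series alongside a copy shifted back seven days, using Graphite's `sumSeries` and `timeShift` functions. The host name, the metric path `endpoint.*.sales.count`, and the `render_url` helper are assumptions for illustration.

```python
from urllib.parse import urlencode


def render_url(base, targets, since="-1h", fmt="json"):
    """Build a Graphite render URL with one 'target' parameter per series."""
    params = [("target", t) for t in targets]
    params += [("from", since), ("format", fmt)]
    return f"{base}/render?{urlencode(params)}"


# Hypothetical metric path; substitute whatever your collectors emit.
metric = "sumSeries(endpoint.*.sales.count)"
url = render_url(
    "http://graphite.example.com",
    [metric, f'timeShift({metric}, "7d")'],
    since="-24h",
)
```

Changing `"7d"` to a longer offset gives the month-over-month or year-over-year view, provided the retention configured for the metric reaches back that far.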
    • 115. Be Creative.
    • 116. More Data > Less Data
    • 117. You probably won’t know what metrics you might need until it’s too late.
    • 118. Don’t be that guy.
    • 119. Storage is Cheap.
    • 120. Data Sources
        • Nagios
        • Munin
        • SNMP
        • Ganglia
        • collectd
        • SQL
        • Logs
        • sar
        • /proc
        • REST APIs
    • 121. You can’t be serious!
    • 125. You can’t be serious!
        • You’re thinking to yourself...
            • This sounds like a lot of work.
            • Our developers will never buy in.
            • Is there an easier way?
    • 126. Duh.
    • 127. StatsD
    • 134. StatsD
        • Created and released by Etsy
        • "Measure Anything, Measure Everything"
        • Aggregate counters and timers
        • Pipeline to Graphite
        • Fire-and-forget (UDP)
        • Perl, PHP, Python and Java clients
        • https://github.com/etsy/statsd
    • 135. StatsD
    • 139. StatsD
        • https://github.com/sivy/statsd-client
            use Net::StatsD::Client;
            my $c = Net::StatsD::Client->new();
            $c->increment('endpoint.customer.app.metric');
            $c->timing('endpoint.customer.app.foo', 200);
        • Too much activity? Sample it!
            # sample 10%, StatsD will multiply it up
            $c->increment('endpoint.customer.app.metric', 0.1);
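Under the client library, StatsD traffic is plain text in UDP datagrams: `bucket:value|c` for counters, `bucket:value|ms` for timers, and an optional `|@rate` suffix when the sender samples. A minimal Python sketch of that wire format follows. The function names are assumptions, and port 8125 is StatsD's usual default rather than something stated in the slides.

```python
import random
import socket


def counter_line(bucket, delta=1, sample_rate=1.0):
    """Format a counter update, tagging it with the sample rate if below 1."""
    line = f"{bucket}:{delta}|c"
    if sample_rate < 1.0:
        line += f"|@{sample_rate}"
    return line


def timer_line(bucket, millis):
    """Format a timing measurement in milliseconds."""
    return f"{bucket}:{millis}|ms"


def send(line, host="127.0.0.1", port=8125):
    """Fire-and-forget: one UDP datagram, no reply expected."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(line.encode("ascii"), (host, port))
    finally:
        sock.close()


def increment(bucket, sample_rate=1.0, **kwargs):
    """Send only a fraction of increments; StatsD scales the count back up."""
    if sample_rate >= 1.0 or random.random() < sample_rate:
        send(counter_line(bucket, 1, sample_rate), **kwargs)
```

Because the sender never waits for an acknowledgement, a slow or absent StatsD daemon does not block the application, which is the main argument for UDP here.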
    • 140. That’s enough?
    • 141. Too much awesome?
    • 142. Too Bad!
    • 143. Logster
    • 147. Logster
        • Yet Another Etsy Project (YAEP)
        • Rips metrics from log files
        • https://github.com/etsy/logster
            # /usr/sbin/logster --dry-run --output=graphite --graphite-host=graphite.example.com:2003 SampleLogster /var/log/httpd/access_log
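The idea of ripping metrics from log files can be illustrated without Logster itself. The following Python sketch counts HTTP responses per status class from access-log lines. It does not use Logster's parser interface, the function name is hypothetical, and the sample lines are invented for the example.

```python
import re
from collections import Counter

# The status code follows the quoted request line in Common Log Format.
STATUS_RE = re.compile(r'"\s(\d{3})\s')


def count_status_classes(lines):
    """Return a Counter such as {'http_2xx': 2, 'http_4xx': 1}."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[f"http_{match.group(1)[0]}xx"] += 1
    return counts


# Made-up log lines, for illustration only.
sample = [
    '127.0.0.1 - - [10/Oct/2011:13:55:36 -0700] "GET / HTTP/1.1" 200 2326',
    '127.0.0.1 - - [10/Oct/2011:13:55:37 -0700] "GET /missing HTTP/1.1" 404 209',
    '127.0.0.1 - - [10/Oct/2011:13:55:38 -0700] "POST /buy HTTP/1.1" 500 0',
    '127.0.0.1 - - [10/Oct/2011:13:55:39 -0700] "GET /a HTTP/1.1" 200 11',
]
counts = count_status_classes(sample)
```

Each resulting count could then be emitted as one metric per reporting interval, which is the kind of aggregation a log-scraping tool performs on a schedule.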
    • 148. Questions?
    • 149. Thank you