Graphite at CityGrid - LA DevOps April 2014

High-level description of CityGrid's use of Graphite for collecting/displaying metrics, along with some interesting use-cases.

Published in: Technology, Business

Transcript

  • 1. Graphite at CityGrid: if you can’t measure it, you can’t fix it. Wil Heitritter, Director, Tech Ops. Los Angeles DevOps, 2014/04/28
  • 2. Magnum esse solem philosophus probabit, quantus sit mathematicus (“The philosopher will prove that the sun is large; the mathematician, how large it is”) - Seneca
  • 3. Objectives: introduce Graphite to new users; show what we like, what we hate; present some interesting use-cases; generate discussion
  • 4. Before Graphite: Ganglia. Predictable interface; text “metrics” to store versions; slow; couldn’t pick and choose metrics to see
  • 5. Why Ganglia sucked: clusters had to be pre-configured; multicast vs. unicast; data retention; static web interface (can’t pick and choose); static host list
  • 6. What did we think we wanted? Ease of adding metrics; ease of sending metrics; powerful metric display; retain Ganglia-style cluster dashboards; long-term configurable metric retention
  • 7. Graphite!
  • 8. What is Graphite? A highly scalable real-time graphing system that collects numeric time-series data, which is managed by carbon, stored as whisper files, and visualized through web interfaces or queried via the API. http://graphite.wikidot.com/
  • 9. Graphite: what we like. Sending metrics is simple; retrieving metrics is simple; dashboard creation and sharing… is simple; many functions(); 120MM+ metric values received daily; backfilling past metrics is simple; expandable (different frontends)
  • 10. Graphite: what sucks. Dashboard ownership/promotion; no Ganglia-like standard dashboard; data retention… is NOT as simple as we thought
  • 11. CityGrid’s Graphite Implementation
  • 12. Metric Naming: Business Metrics - These are metrics that are not tied to any one server - Format: business.${hierarchical}.${path}.${here}.${metric} - Example: business.ec2.testaccount.us-east-1a.OnDemand.running.m2.4xlarge
  • 13. Metric Naming: Server Metrics - These metrics are specific to a particular server (just like Ganglia) - Format: servers.${class}.${f_q_d_n}.${metric} - Example: servers.rvw.aws1prdrvw1_subdom_cityg_com.LW_api_reviews_QPS
  • 14. Sending metrics. Sending directly from metric scripts: /etc/graphite.conf; may need to spread out sending if in volume. Collecting from gmond every minute: metrics are spread out to prevent spiking; false data (gmond acts as a cache)
  • 15. Impact of staggered sending
  • 16. Sending is simply... echo $metric $value $timestamp | nc $relay $port
  • 17. Performance carbon-cache/carbon-relay SSD replication within minutes
  • 18. Maintenance. Changing retention: whisper-auto-resize.py. Filling holes: whisper-fill $source $destination. Backups: dashboards, metrics
  • 19. Graphite Use-Cases
  • 20. Single Metric
  • 21. Combined Metrics
  • 22. Key Metrics Dashboard. Examples of key metrics: QPS; processing time (max/mean/distribution); metrics about sub-requests; network usage; CPU/load
  • 23. Key Metrics Dashboard
  • 24. Nagios Integration check_graphite_target!highestMax( servers.mai.@HOSTNAME@.LW_map_return_code_5*_ratio, 1 )!5!10
  • 25. How about Pie Charts?
  • 26. Ad-Hoc Dashboards Demo
  • 27. What NOT to do
  • 28. Trying it out for yourself
  • 29. Quick Setup: Install & Start # pip install https://github.com/graphite-project/ceres/tarball/master # pip install whisper # pip install carbon # pip install graphite-web Start it up... Send it a metric: echo business.test.metric1 1 `date "+%s"` | nc localhost 2003 OK, it’s almost that easy...
  • 30. Discussion
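
The server-metric naming format from slide 13 and the one-line send from slides 16 and 29 can be sketched in Python roughly as follows. This is a minimal illustration, not CityGrid's actual tooling. The function names are hypothetical, as is the dotted hostname (guessed from the underscore form on the slide). Replacing the dots in the FQDN with underscores is inferred from the slide 13 example. Port 2003 is the one used on slide 29.

```python
import socket
import time


def server_metric_path(server_class, fqdn, metric):
    """Build a path in the servers.${class}.${f_q_d_n}.${metric} format.

    Dots in the FQDN become underscores so the hostname stays a single
    path component (an inference from the slide 13 example).
    """
    return "servers.{}.{}.{}".format(server_class, fqdn.replace(".", "_"), metric)


def plaintext_line(path, value, timestamp=None):
    """Format one 'path value timestamp' line, newline-terminated."""
    if timestamp is None:
        timestamp = int(time.time())  # same idea as `date "+%s"`
    return "{} {} {}\n".format(path, value, int(timestamp))


def send_metric(line, relay_host="localhost", relay_port=2003, timeout=5.0):
    """Send one formatted line over TCP, like `echo ... | nc $relay $port`."""
    with socket.create_connection((relay_host, relay_port), timeout=timeout) as sock:
        sock.sendall(line.encode("ascii"))


if __name__ == "__main__":
    # Hypothetical host name; only the underscore form appears on the slide.
    path = server_metric_path("rvw", "aws1prdrvw1.subdom.cityg.com", "LW_api_reviews_QPS")
    # Print the line instead of sending it, so this runs without a relay.
    print(plaintext_line(path, 42, 1398643200), end="")
```

Calling send_metric(plaintext_line(path, value)) would deliver the value to a carbon relay or cache listening on the given host and port. When many metrics are sent at once, the calls could be spaced out over the minute, in the spirit of the staggered sending described on slides 14 and 15.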