Graphite, an introduction
An introduction to graphite and why it's so great.

Usage Rights

© All Rights Reserved

Graphite, an introduction Presentation Transcript

  • 1. Graphite: An Introduction. Scaling real-time monitoring
  • 2. The purpose today
  • 3. What is graphite
  • 4. Why it’s so great
  • 5. How to graph (It’s really easy!)
  • 6. How we use graphite
  • 7. First, a definition
  • 8. Alerts + Metrics = Monitoring (diagram). Metrics: Graphite, Cacti, Munin. Alerting: Nagios, Icinga. Both: Zenoss, Hyperic, Zabbix, PNP4Nagios.
  • 9. What is graphite
  • 10. About graphite: Django web application consisting of 3 parts: carbon (relays, caches, aggregates metrics); whisper (graphite’s equivalent of RRD files); Web UI (graph composer, simple dashboard)
  • 11. Why graphite?
  • 12. Why graphing? Discover trends and patterns: What time of the day do we get the most users? When x happened, what was the effect on y? How many hits am I getting per hour? How does this compare to last week? Last month? Predict future events: When will we need to add more servers? Databases? Negative feedback: Did the release into production fix problem x?
  • 13. Cacti SUCKS. A few reasons: ancient user interface (no JavaScript/AJAX), terrible workflow, cannot push metrics, no formulas, no graph introspection, cannot feed out-of-sequence metrics, ugly graphs, no API, must expose system/OS metrics on host via SNMP, no graph composer, no custom graphs, must predefine metrics, must predefine graphs, static polling interval, unscalable, tons of work to create one graph, no 3rd party ecosystem, etc.
  • 14. Graphite ++
  • 15. Simple
  • 16. Powerful
  • 17. Functions (sum, derivatives, integrals, timeshift, mostDeviant, scale, averages, etc.)
  • 18. API (Nagios integration, 3rd party custom dashboards)
  • 19. Scalable
  • 20. Easy to feed data
  • 21. Wide ecosystem of 3rd party tools and dashboards http://graphite.readthedocs.org/en/latest/tools.html
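
To make slides 17 and 18 concrete, here is a minimal sketch of building a render API URL that applies functions to a metric. The host name and the metric names (`web.*.requests`) are made up for illustration. `sumSeries` and `timeShift` are standard Graphite functions, and `/render` with `target`, `from` and `format` parameters is the usual entry point, but check the names against your own installation.

```python
from urllib.parse import urlencode


def render_url(base_url, targets, start="-24h", fmt="json"):
    """Build a Graphite render API URL for one or more target expressions."""
    params = [("target", t) for t in targets]
    params += [("from", start), ("format", fmt)]
    return base_url.rstrip("/") + "/render?" + urlencode(params)


# Hypothetical host and metric names, for illustration only.
url = render_url(
    "http://graphite.example.com",
    [
        # Total requests across all web servers.
        "sumSeries(web.*.requests)",
        # The same series shifted by a week, to compare with last week.
        'timeShift(sumSeries(web.*.requests), "7d")',
    ],
)
print(url)
```

Fetching that URL with `format=json` is how a custom dashboard or a Nagios plugin would typically read the data back.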
  • 22. Tools
  • 23. StatsD
  • 24. Logster
  • 25. Skyline
  • 26. Collectd
  • 27. Dashboards
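
As an illustration of the StatsD slide, here is a rough sketch of what a StatsD client puts on the wire: a small UDP datagram of the form `name:value|type`. The metric names are invented, and the default address (`127.0.0.1:8125`) is only the conventional StatsD port, so treat both as assumptions about your setup.

```python
import socket


def statsd_counter(name, count=1):
    """Format a StatsD counter increment, e.g. b'emails.sent:1|c'."""
    return f"{name}:{count}|c".encode("ascii")


def statsd_timer(name, millis):
    """Format a StatsD timing sample in milliseconds."""
    return f"{name}:{millis}|ms".encode("ascii")


def send_statsd(payload, host="127.0.0.1", port=8125):
    """Fire-and-forget a single datagram to a StatsD daemon (assumed address)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(payload, (host, port))
    finally:
        sock.close()


if __name__ == "__main__":
    # Hypothetical metric names.
    send_statsd(statsd_counter("emails.sent"))
    send_statsd(statsd_timer("heavy.task", 320))
```

Because it is UDP, the application never blocks on the metrics pipeline. StatsD aggregates the samples and flushes them to carbon on an interval.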
  • 28. Graphite --
  • 29. No poller
  • 30. No all in one solution
  • 31. No easy backups
  • 32. It probably will become business critical
  • 33. How to graph
  • 34. There are tons of ways to feed graphite your data
  • 35. Bash
        #!/bin/bash
        timestamp=$(date +%s)
        value=10
        echo "dot.delimited.metric.name $value $timestamp" | nc -w 1 graphite.host.name 2003
    Python
        import socket
        def send_msg(message, HOST, PORT):
            # message is one "metric value timestamp" line ending in "\n"
            sock = socket.create_connection((HOST, PORT))
            sock.sendall(message.encode())
            sock.close()
    Python using graphite-pymetrics
        from metrics import timing
        @timing("heavy.task")
        def heavy_task(x, y, z):
            pass  # do heavy stuff here
  • 36. Ruby
        require 'socket'
        Host = 'somegraphitehost'
        conn = TCPSocket.new Host, 2003
        conn.puts 'Metrics value timestamp'
        conn.close
    Java
        import java.io.DataOutputStream;
        import java.net.Socket;
        Socket conn = new Socket("somegraphitehost", 2003);
        DataOutputStream dos = new DataOutputStream(conn.getOutputStream());
        // each metric line must end with a newline
        dos.writeBytes("metrics value timestamp\n");
        conn.close();
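
All of the snippets on slides 35 and 36 speak the same plaintext protocol: one `path value timestamp` line per metric, newline-terminated, over TCP to carbon on port 2003. A small sketch of sending several metrics over one connection follows. The function names and the host are assumptions for illustration, not part of Graphite.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Return one plaintext-protocol line: 'path value timestamp\\n'."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metrics(lines, host, port=2003):
    """Send a batch of already formatted lines over a single TCP connection."""
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)


# Hypothetical metric names; nothing is sent until send_metrics() is called.
batch = [
    format_metric("servers.web01.load", 0.42, 1400000000),
    format_metric("servers.web01.users", 17, 1400000000),
]
# send_metrics(batch, "graphite.host.name")
```

Batching lines like this avoids opening a new connection per data point, which matters once a script emits more than a handful of metrics.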
  • 37. How we use graphite
  • 38. 700K + metrics per minute
  • 39. A Common Graphite Stack (diagram components: Graphite-web, Collectd, Poller(s), Applications, Carbon, Whisper, Dashboards, Statsd, Scripts, Nagios)
  • 40. Collectd: agent for system/hardware level metrics. Growing repository of plugins for a wide variety of applications: disk I/O, disk space, CPU, memory, MySQL, JMX, Java, Redis, file sizes, load, etc. https://collectd.org/wiki/index.php/Table_of_Plugins Write your custom plugin in Python.
  • 41. Nagios integration: You can write Nagios plugins that alert on metric values. Nagios can also feed graphite performance data, events (e.g. update a counter each time an email is sent), etc.
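
A Nagios plugin that alerts on a Graphite metric can be quite small. The sketch below covers only the decision logic, assuming the datapoints were already fetched from the render API as JSON, where each series carries `[value, timestamp]` pairs with null for missing samples. The thresholds, message format and function names are illustrative assumptions. The exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) are the standard Nagios plugin convention.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def latest_value(datapoints):
    """Return the most recent non-null value, or None if there is none."""
    for value, _timestamp in reversed(datapoints):
        if value is not None:
            return value
    return None


def check_metric(datapoints, warn, crit):
    """Map the latest value to a Nagios status code and a status line."""
    value = latest_value(datapoints)
    if value is None:
        return UNKNOWN, "UNKNOWN - no datapoints"
    if value >= crit:
        return CRITICAL, f"CRITICAL - value={value}"
    if value >= warn:
        return WARNING, f"WARNING - value={value}"
    return OK, f"OK - value={value}"


# Sample datapoints in render-API shape; the newest sample is still null.
sample = [[12.0, 1400000000], [48.0, 1400000060], [None, 1400000120]]
status, message = check_metric(sample, warn=40, crit=80)
print(message)
```

A real plugin would print the message and call `sys.exit(status)` so Nagios can pick up the state.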
  • 42. What to collect?
  • 43. Hardware/OS metrics
  • 44. Load
  • 45. Disk space
  • 46. Disk I/O
  • 47. Network data
  • 48. Application metrics
  • 49. How often function x is called
  • 50. Average value of function x
  • 51. Average running time of function x
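
One way to collect the application metrics on slides 49 to 51 is a timing decorator, similar in spirit to the `@timing` example on slide 35. This is a generic sketch rather than the graphite-pymetrics implementation. The `emit` callback is a hypothetical hook that you would wire to StatsD or carbon.

```python
import functools
import time


def timed(metric_name, emit):
    """Decorator: report each call's wall-clock duration in ms via emit()."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs even if func raises, so failures are timed too.
                elapsed_ms = (time.perf_counter() - start) * 1000.0
                emit(metric_name, elapsed_ms)
        return wrapper
    return decorator


# Illustration: collect samples in a list instead of sending them anywhere.
samples = []


@timed("heavy.task", lambda name, ms: samples.append((name, ms)))
def heavy_task(x, y, z):
    return x + y + z


heavy_task(1, 2, 3)
```

Every emitted sample also counts as one call, so the call rate and the average running time both fall out of the same stream once it is aggregated.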
  • 52. Database/Datastore
  • 53. performance metrics
  • 54. number of records with value == ?
  • 55. number of slow queries
  • 56. Events
  • 57. Deployments
  • 58. send a 1, draw with drawAsInfinite()
  • 59. Log files
  • 60. http access logs (2xx, 3xx, 4xx, 5xx)
  • 61. Application logs: exception counts, results, important events, hits
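
Turning log files into metrics (slides 59 to 61) mostly means counting. A rough sketch for HTTP access logs follows, assuming lines in the common/combined log format where the status code comes right after the quoted request. The metric prefix and function names are made up for illustration. Tools such as Logster do this kind of job more robustly.

```python
import re
from collections import Counter

# Status code directly after the closing quote of the request field.
STATUS_RE = re.compile(r'" (\d{3}) ')


def count_status_classes(lines):
    """Count responses per status class ('2xx', '3xx', '4xx', '5xx')."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[match.group(1)[0] + "xx"] += 1
    return counts


def to_metric_lines(counts, prefix, timestamp):
    """Render counts as plaintext-protocol lines, sorted by status class."""
    return [f"{prefix}.{cls} {n} {timestamp}\n" for cls, n in sorted(counts.items())]


# Invented sample lines in common log format.
log = [
    '10.0.0.1 - - [10/Oct/2013:13:55:36 -0700] "GET / HTTP/1.1" 200 2326',
    '10.0.0.2 - - [10/Oct/2013:13:55:37 -0700] "GET /x HTTP/1.1" 404 512',
    '10.0.0.3 - - [10/Oct/2013:13:55:38 -0700] "POST /y HTTP/1.1" 200 128',
]
counts = count_status_classes(log)
lines = to_metric_lines(counts, "web.access", 1400000000)
```

Run once a minute over the newly appended part of the log, the resulting lines can be sent straight to carbon.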
  • 62. Final Musings
  • 63. Treat graphite like ‘Big Data’
  • 64. You don’t know what metrics you need until you need them
  • 65. Get RAID 10 SSDs once you decide to scale
  • 66. More devopsy
  • 67. You can start graphing today!