10. About graphite
● Django web application consisting of 3 parts:
○ carbon (relays, caches, aggregates metrics)
○ whisper (graphite’s equivalent of RRD files)
○ Web UI (graph composer, simple dashboard)
11. Why graphite?
12. Why graphing?
Discover trends and patterns
What time of the day do we get the most users?
When x happened, what was the effect on y?
How many hits am I getting per hour?
How does this compare to last week? last month?
Predict future events
When will we need to add more servers? Databases?
Did the release into production fix problem x?
13. Cacti SUCKS
A few reasons:
formulas, no graph introspection, cannot push metrics, cannot feed out-of-sequence
metrics, ugly graphs, no API, system/OS metrics must be exposed on the host via SNMP, no graph
composer, no custom graphs, predefined metrics, predefined graphs, static polling interval,
unscalable, tons of work to create one graph, no 3rd party ecosystem, etc.
(Nagios integration, 3rd party custom dashboards)
20. Easy to feed data
21. Wide ecosystem of 3rd party
tools and dashboards
28. Graphite --
29. No poller
30. No all in one solution
31. No easy backups
32. It probably will become
33. How to graph
34. There are tons of ways to
feed graphite your data
timestamp=$(date +%s)
value=10
echo "dot.delimited.metric.name $value $timestamp" | nc -w 1 graphite.
import socket
def send_msg(message, HOST, PORT):
    with socket.create_connection((HOST, PORT)) as sock:
        sock.sendall(message.encode())  # message must end with a newline
Python using graphite-pymetrics
from metrics import timing
def heavy_task(x, y, z):
    # do heavy stuff here
    pass
require 'socket'
Host = 'somegraphitehost'
conn = TCPSocket.new Host, 2003
conn.puts 'Metrics value timestamp'
conn.close
Socket conn = new Socket("somegraphitehost", 2003);
DataOutputStream dos = new DataOutputStream(conn.getOutputStream());
dos.writeBytes("metrics value timestamp\n"); // each metric line ends with a newline
dos.close();
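All of the snippets above send the same thing: one line of text per metric, in the form "metric.path value timestamp", over TCP to port 2003. A minimal Python sketch of that plaintext format follows. The helper names format_metric and send_metrics are made up for illustration, and the host is whatever your carbon listener is.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one plaintext line: 'path value timestamp' plus a newline."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metrics(lines, host, port=2003, timeout=1.0):
    """Send already-formatted metric lines over a single TCP connection.

    Port 2003 is the one used in the snippets above; host is an assumption
    supplied by the caller.
    """
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
```

Usage would look like send_metrics([format_metric("dot.delimited.metric.name", 10)], "somegraphitehost"). Batching several lines into one connection avoids opening a socket per metric.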
37. How we use graphite
38. 700K+ metrics per minute
39. A Common Graphite Stack
Agent for system/hardware level metrics
Growing repository of plugins for a wide variety of metrics:
disk I/O, disk space, CPU, memory, MySQL,
JMX, Java, Redis, file sizes, load, etc.
Write your own custom plugin in Python
41. Nagios integration
You can write Nagios plugins that alert on
metric values
Nagios can also feed Graphite:
performance data, events (e.g. update a
counter each time an email is sent), etc.
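A Nagios plugin that alerts on a metric value could be sketched roughly as below. It assumes the plugin has already fetched the series from Graphite's render API with format=json, which returns a list of objects with "target" and "datapoints" ([value, timestamp] pairs). The function names and thresholds are hypothetical. The exit codes are the standard Nagios plugin ones.

```python
import json

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def latest_value(render_json):
    """Return the most recent non-null value of the first series, or None."""
    series = json.loads(render_json)
    if not series:
        return None
    for value, _timestamp in reversed(series[0]["datapoints"]):
        if value is not None:
            return value
    return None


def check(value, warn, crit):
    """Map a metric value to a Nagios (exit_code, message) pair."""
    if value is None:
        return UNKNOWN, "UNKNOWN - no data points"
    if value >= crit:
        return CRITICAL, f"CRITICAL - value {value} >= {crit}"
    if value >= warn:
        return WARNING, f"WARNING - value {value} >= {warn}"
    return OK, f"OK - value {value}"
```

A real plugin would print the message and exit with the code. Skipping trailing nulls matters because the newest bucket is often still empty when the check runs.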
42. What to collect?
43. Hardware/OS metrics
45. Disk space
46. Disk I/O
47. Network data
48. Application metrics
49. How often function x is called
50. Average value of function x
51. Average running time of
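One way to collect call counts and average running time for a function is a small decorator that accumulates stats in memory and flushes them periodically as plaintext metric lines. This is only a sketch: the decorator name, the metric names (app.heavy_task.calls, app.heavy_task.avg_ms), and the in-memory store are assumptions for illustration, not the API of graphite-pymetrics or any other library.

```python
import functools
import time
from collections import defaultdict

# In-memory stats keyed by metric prefix (hypothetical structure).
_stats = defaultdict(lambda: {"count": 0, "total_seconds": 0.0})


def instrumented(prefix):
    """Decorator that records call count and cumulative running time."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                entry = _stats[prefix]
                entry["count"] += 1
                entry["total_seconds"] += time.perf_counter() - start
        return wrapper
    return decorator


def flush(timestamp=None):
    """Turn collected stats into plaintext metric lines and reset them."""
    timestamp = int(time.time() if timestamp is None else timestamp)
    lines = []
    for prefix, entry in sorted(_stats.items()):
        avg_ms = 1000.0 * entry["total_seconds"] / entry["count"]
        lines.append(f"{prefix}.calls {entry['count']} {timestamp}\n")
        lines.append(f"{prefix}.avg_ms {avg_ms:.3f} {timestamp}\n")
    _stats.clear()
    return lines


@instrumented("app.heavy_task")
def heavy_task(x, y, z):
    # Stand-in for real work.
    return x + y + z
```

Calling flush() once per reporting interval and sending the returned lines to carbon gives one data point per interval for each metric.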
53. performance metrics
54. number of records with
value == ?
55. number of slow queries
58. send a 1, draw as infinite
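The "send a 1, draw as infinite" trick can be sketched as follows: record a value of 1 at the moment an event happens, then wrap the metric in Graphite's drawAsInfinite() function so each non-zero point renders as a vertical line. The metric name events.deploy and the base URL are assumptions for illustration.

```python
import time
from urllib.parse import urlencode


def event_line(metric, timestamp=None):
    """Plaintext metric line that marks an event with the value 1."""
    timestamp = int(time.time() if timestamp is None else timestamp)
    return f"{metric} 1 {timestamp}\n"


def event_render_url(base_url, metric, since="-24h"):
    """Render API URL that draws each event as a vertical line."""
    query = urlencode({"target": f"drawAsInfinite({metric})", "from": since})
    return f"{base_url}/render?{query}"
```

Adding the same drawAsInfinite target to an existing graph overlays releases or other events on top of the metric you are investigating.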
59. Log files
60. http access logs
(2xx, 3xx, 4xx, 5xx)
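Counting responses by status class from an access log might look like the sketch below. It assumes the common/combined log format, where the status code follows the quoted request line. The web.http prefix is a made-up metric namespace.

```python
import re
import time
from collections import Counter

# Status code right after the closing quote of the request line.
STATUS_RE = re.compile(r'" (\d{3}) ')


def count_status_classes(log_lines):
    """Count 2xx/3xx/4xx/5xx responses; other lines are ignored."""
    counts = Counter({"2xx": 0, "3xx": 0, "4xx": 0, "5xx": 0})
    for line in log_lines:
        match = STATUS_RE.search(line)
        if match:
            bucket = f"{match.group(1)[0]}xx"
            if bucket in counts:
                counts[bucket] += 1
    return counts


def to_metric_lines(counts, prefix="web.http", timestamp=None):
    """Format the counts as plaintext metric lines, one per status class."""
    timestamp = int(time.time() if timestamp is None else timestamp)
    return [
        f"{prefix}.{bucket} {n} {timestamp}\n"
        for bucket, n in sorted(counts.items())
    ]
```

Emitting zeros for classes with no hits keeps the series continuous, so a sudden burst of 5xx responses stands out against a flat line rather than against missing data.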
61. Application logs
Exception counts, results, important events, hits
62. Final Musings
63. Treat graphite like ‘Big Data’
64. You don’t know what metrics
you need until you need them