Metrics with Ganglia

A talk about using Ganglia and other tools to store all kinds of web application metrics, for both operations and business purposes. Presented at Cambridge Geek Night.


  1. Collecting Metrics With Ganglia and Friends. Cambridge Geek Night, 28th March 2011. gareth rushgrove | morethanseven.net. http://www.flickr.com/photos/memestate/45986749
  2. Gareth Rushgrove
  3. Work at FreeAgent: freeagentcentral.com
  4. Blog at morethanseven.net
  5. Curate devopsweekly.com
  6. Covering (Business Version)
      - Capacity planning metrics
      - Metrics for your application
      - Business analytics
      - Having everything in one place
  7. Covering (Tech Version)
      - Ganglia: store metrics and view graphs
      - Logster: get log files into Ganglia
      - Gmetric: get anything into Ganglia
      - Syslog: using Loggly to view individual log items
  8. Everyone Uses Something Like?
  9. Use Something Like This Too
  10. What is Ganglia? “Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids.” (ganglia.sourceforge.net)
  11. Example: vagrantbox.es
  12. Load Averages
  13. CPU
  14. Aggregate Graphs
  15. Across Entire Cluster
  16. Predicting When Your System Will Fail: “A strategy for anticipating future workloads of your computers, with the aim of creating a computing environment that can handle future workload.” (IBM)
  17. Disk Space
  18. Monitoring Your Application
  19. Web Server Logs
      86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"
      86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0"
      86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.1" 200 2081 "-" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_7; en-us) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27"
  20. Logster from Etsy
  21. Logster
      Tail a log file and filter each line to generate metrics that can be sent to common monitoring packages.
      Options:
        -p METRIC_PREFIX, --metric-prefix=METRIC_PREFIX
            Add prefix to all published metrics. This is for people that may multiple instances of same service on same host.
        --gmetric-options=GMETRIC_OPTIONS
            Options to pass to gmetric such as -d 180 -c /etc/ganglia/gmond.conf (default). These are passed directly to gmetric.
        --graphite-host=GRAPHITE_HOST
            Hostname and port for Graphite collector, e.g. graphite.example.com:2003
        -s STATE_DIR, --state-dir=STATE_DIR
            Where to store the logtail state file. Default location /var/run
        -d, --dry-run
            Parse the log file but send stats to standard output.
        -D, --debug
            Provide more verbose logging for debugging.
  22. Logster Command Line: logster SampleGangliaLogster /../access.log
  23. HTTP Responses with a 2xx Status Code
  24. Gmetric
      The Ganglia Metric Client (gmetric) announces a metric on the list of defined send channels defined in a configuration file
      Usage: gmetric [OPTIONS]...
        -V, --version        Print version and exit
        -c, --conf=STRING    The configuration file to use for finding send channels (default=/etc/ganglia/gmond.conf)
        -n, --name=STRING    Name of the metric
        -v, --value=STRING   Value of the metric
        -t, --type=STRING    Either string|int8|uint8|int16|uint16|int32|uint32|float|double
        -u, --units=STRING   Unit of measure for the value e.g. Kilobytes, Celcius (default=)
        -s, --slope=STRING   Either zero|positive|negative|both (default=both)
        -x, --tmax=INT       The maximum time in seconds between gmetric calls (default=60)
        -d, --dmax=INT       The lifetime in seconds of this metric (default=0)
        -S, --spoof=STRING   IP address and name of host/device (colon separated) we are spoofing (default=)
        -H, --heartbeat      spoof a heartbeat message (use with spoof option)
  25. Gmetric Scripts for Common Applications
  26. Gmetric Command Line: gmetric -n sales -v 200 -t float
  27. Our Custom Metric in Ganglia
  28. Gmetric HTTP Interface

          import subprocess
          from bottle import route, run, abort, default_app

          @route('/:name/:value')
          def index(name, value):
              try:
                  # name and value come straight from the URL and are
                  # interpolated into a shell command without escaping
                  cmd = 'gmetric -n %s -v %s -t float' % (name, value)
                  subprocess.check_call(
                      cmd, shell=True,
                      stdout=subprocess.PIPE, stderr=subprocess.PIPE)
                  return "Success: %s" % cmd
              except subprocess.CalledProcessError:
                  abort(500, "Error")

          app = default_app()
  29. Gmetric URL: http://../sales/200
  30. Gmetric TCP Interface

          import subprocess
          import SocketServer  # Python 2 module name (socketserver in Python 3)

          class GmetricTCPHandler(SocketServer.BaseRequestHandler):
              def handle(self):
                  # expects a single "name value" line, e.g. "sales 200"
                  self.data = self.request.recv(1024).strip()
                  items = self.data.split(' ')
                  try:
                      cmd = 'gmetric -n %s -v %s -t float' % (items[0], items[1])
                      subprocess.check_call(
                          cmd, shell=True,
                          stdout=subprocess.PIPE, stderr=subprocess.PIPE)
                      # handle()'s return value is ignored by SocketServer,
                      # so nothing is sent back to the client
                      return "Success: %s" % cmd
                  except Exception:
                      return "Error"

          if __name__ == "__main__":
              HOST, PORT = "0.0.0.0", 8001
              server = SocketServer.TCPServer((HOST, PORT), GmetricTCPHandler)
              server.serve_forever()
  31. Gmetric TCP: sales 200
  32. Syslog: “Syslog is a standard for logging program messages. It allows separation of the software that generates messages from the system that stores them and the software that reports and analyzes them.” (Wikipedia)
  33. Loggly - Logging as a Service
  34. View logs
  35. Logstash
  36. Graylog2
  37. Other Things You Could Monitor
      - Database table sizes
      - Cache hits
      - Time taken for test runs
      - Codebase size
      - Signups, sales, subscriptions
      - Twitter followers
  38. What Next?
      - Wikipedia: http://ganglia.wikimedia.org/
      - Install Ganglia: deb and rpm packages available
      - Add system metrics: web servers, databases
      - Add business metrics: users, sales, tweets
      - Try Loggly, or at least investigate syslog
  39. Reading
  40. CBGN112 months free on FreeAgent
  41. Questions? http://flickr.com/photos/psd/102332391/
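
Slides 19 to 23 go from raw web server logs to a graph of HTTP responses with a 2xx status code. The kind of per-line filtering involved can be sketched in plain Python as below. This is only an illustration of the idea, not Logster's parser API: the regular expression, the function name `count_status_classes` and the `SAMPLE` list are all assumptions made for this example.

```python
import re
from collections import Counter

# Assumed pattern for the request and status fields of an access-log
# line like the ones on slide 19; not taken from Logster itself.
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3}) '
)


def count_status_classes(lines):
    """Return a Counter keyed by status class ('2xx', '4xx', ...)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # skip lines that do not look like access-log entries
        counts[match.group("status")[0] + "xx"] += 1
    return counts


# Hypothetical sample input: two lines from slide 19 plus one junk line.
SAMPLE = [
    '86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5970 "-" "FunkLoad/1.14.0"',
    '86.26.7.33 - - [26/Mar/2011:20:39:53 +0000] "GET / HTTP/1.0" 200 5466 "-" "FunkLoad/1.14.0"',
    "not a log line",
]

if __name__ == "__main__":
    print(count_status_classes(SAMPLE)["2xx"])
```

A count like this, taken once per interval, is the sort of value that could then be published with the gmetric command line from slide 26.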

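
The HTTP and TCP interfaces on slides 28 and 30 build a shell command from whatever the client sends. A possible Python 3 variation, sketched below, parses the "name value" line from slide 31 and passes gmetric an argument list so that no shell is involved. The helper names and the metric-name pattern are assumptions made for this sketch (the pattern is an illustrative policy, not a Ganglia rule); only the `-n`, `-v` and `-t` flags come from the gmetric usage text on slide 24.

```python
import re
import subprocess

# Assumed naming policy for this sketch only; Ganglia itself may accept more.
METRIC_NAME = re.compile(r"[A-Za-z0-9_.-]+")


def parse_metric_line(line):
    """Parse a 'name value' line such as 'sales 200'."""
    name, value = line.split()
    return name, float(value)


def build_gmetric_args(name, value):
    """Build the argument list for: gmetric -n NAME -v VALUE -t float."""
    if not METRIC_NAME.fullmatch(name):
        raise ValueError("unsupported metric name: %r" % name)
    return ["gmetric", "-n", name, "-v", str(float(value)), "-t", "float"]


def send_metric(name, value):
    """Run gmetric without a shell; raises CalledProcessError on failure."""
    subprocess.run(build_gmetric_args(name, value), check=True)


if __name__ == "__main__":
    # Prints the command instead of running it, so gmetric need not be installed.
    print(build_gmetric_args(*parse_metric_line("sales 200")))
```

Because the arguments are passed as a list, a value such as `200; rm -rf /` is rejected by `float()` rather than reaching a shell.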