Metrics by coda hale : to know your app’ health

23,456 views

Published on

Nowadays when developers required to be aligned with operations it’s quite important to have common understanding of how application is performing in production. I believe quite small amount of developers are really care/think about operation of the app. In this talk I’m going to describe how it’s easy to provide performance information of application in production with Metrics by Coda Hale and to share practical use cases.

Published in: Software, Technology
0 Comments
35 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
23,456
On SlideShare
0
From Embeds
0
Number of Embeds
1,178
Actions
Shares
0
Downloads
177
Comments
0
Likes
35
Embeds 0
No embeds

No notes for slide

Metrics by coda hale : to know your app’ health

  1. 1. Metrics by Coda Haleto know your app’ health Izzet Mustafayev@EPAM Systems @webdizz webdizz izzetmustafaiev http://webdizz.name
  2. 2. this is me ● SA at EPAM Systems ● primary skill is Java ● JUG member/speaker ● hands-on-coding with Ruby, Groovy and some Scala ● passionate about agile, clean code practices and devops movement
  3. 3. agenda ● introduction ● why should i care ● features ● reporting ● recipes ● alternatives ● q&a
  4. 4. Finish or Start?
  5. 5. Familiar?
  6. 6. Possible?!.
  7. 7. Your app?!.
  8. 8. Dreams...
  9. 9. One thing...
  10. 10. ● does my app work?
  11. 11. ● does my app work? ● will my app keep working?
  12. 12. ● does my app work? ● will my app keep working? ● what happened that made my app stop working?
  13. 13. Statistics
  14. 14. Metrics
  15. 15. Features
  16. 16. Registry. ..- is the heart of metrics framework MetricRegistry registry = new MetricRegistry();​ // or @Bean public MetricRegistry metricRegistry(){ return new MetricRegistry(); }
  17. 17. Gauge... - is the simplest metric type that just returns a value registry.register( name(Persistence.class, "entities-cached"), new Gauge<Integer>() { public Integer getValue() { return cache.getItems(); } } );​
  18. 18. Counter... - is a simple incrementing and decrementing integer value Counter onSiteUsers = registry.counter( name(User.class, "users-logged-in")); // on login increase onSiteUsers.inc();​ // on logout/session expiration decrease onSiteUsers.dec();​
  19. 19. Histogram. ..- measures the distribution of values in a stream of data Histogram requestTimes = registry.histogram( name(Request.class, "request-processed") ); requestTimes.update(request.getTime()); ​
  20. 20. Meter... - measures the rate at which a set of events occur Meter timedOutRequests = registry.meter( name(Request.class, "request-timed-out") ); // in finally block on time-out timedOutRequests.mark();
  21. 21. Timer... - is basically a histogram of the duration of a type of event and a meter of the rate of its occurrence ​Timer requests = registry.timer( name(Request.class, "request-processed") ); Timer.Context time = requests.time(); // in finally block of request processing time.stop();​
  22. 22. Health Check.. - is a unified way of performing application health checks public class BackendHealthCheck extends HealthCheck { private Backend backend; protected Result check() { if (backend.ping()) { return Result.healthy(); } return Result.unhealthy("Can't ping backend"); } }
  23. 23. Reporting
  24. 24. JMX // somewhere in your application​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​​ JmxReporter reporter = JmxReporter.forRegistry(registry).build(); reporter.start();​
  25. 25. HTTP ● AdminServlet ○ MetricsServlet ○ ThreadDumpServlet ○ PingServlet ○ HealthCheckServlet
  26. 26. Console ConsoleReporter reporter = ConsoleReporter.forRegistry (registry) .convertRatesTo(TimeUnit.SECONDS) .convertDurationsTo(TimeUnit.MILLISECONDS) .build(); reporter.start(1, TimeUnit.MINUTES);​ Slf4jReporter reporter = Slf4jReporter.forRegistry(registry) .outputTo(LoggerFactory.getLogger("com.app.metrics")) .convertRatesTo(TimeUnit.SECONDS) .convertDurationsTo(TimeUnit.MILLISECONDS) .build(); reporter.start(1, TimeUnit.MINUTES);​
  27. 27. Ganglia & Graphite
  28. 28. Naming by Matt Aimonetti <namespace>.<instrumented section>.<target (noun)>. <action (past tense verb)> Example ● customers.registration.validation.failed Nesting ● customers.registration.validation.failure. similar_email_found ● customers.registration.validation.failure. email_verification_failed
  29. 29. AOP There is an easy way to introduce metrics using AOP. <aop:config proxy-target-class="false"> <aop:aspect id="controllerProfilerAspect" ref="executionMonitor"> <aop:pointcut id="controllerMethods" expression="execution(* com.app.controller.*Controller.*(..))" /> <aop:around pointcut-ref="controllerMethods" method="logExecutionTime" /> </aop:aspect> </aop:config> ​
  30. 30. public Object logExecutionTime(ProceedingJoinPoint joinPoint) throws Throwable { String namePfx = metricNamePrefix(joinPoint); Timer.Context time = registry.timer( name(namePfx, "timed")).time(); try { // execute joint point ... } catch(Throwable throwable) { registry.counter( name(namePfx, "exception-counted")).inc(); throw throwable; } finally { time.stop() } } ​
  31. 31. metrics-spring @Configuration @EnableMetrics public class ApplicationConfiguration { @Bean public MetricRegistry metricRegistry(){ return new MetricRegistry(); } // your other beans here ... } // annotations @Timed, @Metered, @ExceptionMetered, @Counted​, @Gauge, @CachedGauge and @Metric
  32. 32. instrumentation ● JVM ● Jetty ● Jersey ● InstrumentedFilter​ ● InstrumentedEhcache​ ● InstrumentedAppender (Log4j and Logback) ● InstrumentedHttpClient
  33. 33. Alternatives
  34. 34. https://github.com/twitter/zipkin Zipkin is a distributed tracing system that helps Twitter gather timing data for all the disparate services. Zipkin provides three services: ● to collect data: bin/collector ● to extract data: bin/query ● to display data: bin/web
  35. 35. http://www.moskito.org/ MoSKito is a ● Multi-purpose: collects any possible type of performance data, including business-related ● Non-invasive: does not require any code change ● Interval-based: works simultaneously with short & long time intervals, allowing instant comparison ● Data privacy: keeps collected data locally, with no 3rd party resources involved. ● Analysis tools: displays accumulated performance data in charts. Live profiling: records user actions as system calls.
  36. 36. https://code.google.com/p/javasimon/ Java Simon is a simple monitoring API that allows you to follow and better understand your application. Monitors (familiarly called Simons) are placed directly into your code and you can choose whether you want to count something or measure time/duration.
  37. 37. https://github.com/Netflix/servo Servo goal is to provide a simple interface for exposing and publishing application metrics in Java. ● JMX: JMX is the standard monitoring interface for Java ● Simple: It is trivial to expose and publish metrics ● Flexible publishing: Once metrics are exposed, it is easy to regularly poll the metrics and make them available for reporting systems, logs, and services like Amazon’s CloudWatch.
  38. 38. Summary
  39. 39. getting started ● http://metrics.codahale.com/ ● http://matt.aimonetti. net/posts/2013/06/26/practical-guide-to- graphite-monitoring/ ● http://www.ryantenney.com/metrics-spring/
  40. 40. q&a
  41. 41. ThanksIzzet Mustafayev@EPAM Systems @webdizz webdizz izzetmustafaiev http://webdizz.name

×