Devoxx 2014 monitoring


Slidedeck of the talk given at Devoxx in November 2014


Devoxx 2014 monitoring

  1. Monitoring. Claude Falguière, Valtech Paris. #DV14 #Monitoring @cfalguiere
  2. Content • DevOps is more than tooling (individuals and interactions over processes and tools) • Making you love data • Motivations for providing and collecting data • Monitoring user stories and practices • Getting started and open source tooling
  3. Claude Falguiere • DevOps Coach • Java, Performance • Devoxx4Kids • Paris JUG, Devoxx France, Duchess • http://cfalguiere.wordpress.com
  4. Monitoring: What would you do if you knew • that the database is broken • that the number of hits doubles every 2 months • that users struggle to find the order form • why the app is slow • what users want to buy
  5. Model (diagram labels: Questions, Model, Hypothesis, Facts, Data). Examples: 138 ms; 742 orders; 42 users; sales increased by 14%; estimated orders next month: 934; the average number of requests is 5 times the number of users.
  6. Galaxy Rotation Problem: Spiral galaxies spin too fast. The expected mass should be ten times the observed mass (calculated from the visible objects) to prevent galaxies from flying apart.
  7. Discovery of Dark Matter • 1932 - 1933, Jan Oort and Fritz Zwicky: hypothesis of a missing mass; the readings are assumed to be wrong • 1960 - 1970, Vera Rubin: if the readings are true, is the model wrong? Mass calculated from gravitational effects, and evidence of dark matter • 2010 - 2013, Planck satellite: dark matter estimated at 84.5% of the total matter in the universe
  8. Measure everything: Lean Startup, DevOps, Big Data. Make decisions based on facts!
  12. What would you do if you knew • that the database is broken • that the number of hits doubles every 2 months • that users struggle to find the order form • why the app is slow • what users want to buy
  13. Motivations and user stories: SLA observance, Alerting, Diagnosis / Post-Mortem, Capacity Planning, Improvement. Supported by Storage and Visualization.
  15. Architecture (diagram labels): App, Log, Network, System; Probe, Log Parser, Collector; Storage, Aggregation; Alerting; Visualization; Dev, Support, DBA
  18. Collector (diagram labels): System, App, Log; Collector (filters, rules); MQ; Storage; Alerting
  20. Topology (diagram labels): App Platform, Monitoring Platform; App, Log, Log Parser, Collector; Storage, Aggregation; Alerting; Visualization
  21. Resilience (diagram labels): App Platform, Monitoring Platform; App, Log, Log Parser; Collector (x2), MQ (x2); Storage, Aggregation; Alerting; Visualization
  22. What would you do if you knew that the database is broken
  23. Error detection and alerting • Log filtering • Event firing • Context: is it critical? which feature does it impact? how deep is the impact?
  24. Is this a log?
       Exception in thread "main" com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException:
       Access denied for user 'shopapp'@'shprdb1' to database 'shop'
         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
         at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
         at java.lang.reflect.Constructor.newInstance(Unknown Source)
         at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
         at com.mysql.jdbc.Util.getInstance(Util.java:386)
         at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1054)
         at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4237)
         at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4169)
         at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:928)
         at com.mysql.jdbc.MysqlIO.proceedHandshakeWithPluggableAuthentication(MysqlIO.java:1750)
         at com.mysql.jdbc.MysqlIO.doHandshake(MysqlIO.java:1290)
         at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2493)
         at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2526)
         at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2311)
         at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:834)
         at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)
         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
         at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
         at java.lang.reflect.Constructor.newInstance(Unknown Source)
         at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
         at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:416)
         at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:347)
         at java.sql.DriverManager.getConnection(Unknown Source)
  25. Log example:
       2013-12-17 05:53:16,208 ERROR [Order Creation Service](456713) [shpras2](web1234) Could not create order id=456713 - Cause: Can't connect to database 'shop' - MySqlMessage: Access denied for user 'shopapp'@'shprdb1' to database 'shop'
       The same line, field by field:
       2013-12-17 05:53:16,208
       ERROR
       [Order Creation Service]
       (456713)
       [shpras2]
       (web1234)
       Could not create order id=456713
       Cause: Can't connect to database 'shop'
       MySqlMessage: Access denied for user 'shopapp'@'shprdb1' to database 'shop'
  26. Anatomy of the log line:
       Timestamp: 2013-12-17 05:53:16,208
       Severity: ERROR
       Context (technical and business): [Order Creation Service] (456713) [shpras2] (web1234)
       Meaningful information: Could not create order id=456713; Cause: Can't connect to database 'shop'; MySqlMessage: Access denied for user 'shopapp'@'shprdb1' to database 'shop'
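The structure shown on the log slides (timestamp, severity, context, message) can be extracted with a regular expression. The following is a minimal sketch, assuming the one-line layout of the log example slide; the field names `module`, `module_id`, `instance`, and `thread` are borrowed from the JSON event slide, and `parse_log_line` is a hypothetical helper, not part of any tool named in the talk.

```python
import re
from datetime import datetime

# Assumed layout, taken from the log example slide:
# <timestamp> <SEVERITY> [<module>](<module id>) [<instance>](<thread>) <message>
LOG_PATTERN = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<severity>[A-Z]+) "
    r"\[(?P<module>[^\]]+)\]\((?P<module_id>[^)]+)\) "
    r"\[(?P<instance>[^\]]+)\]\((?P<thread>[^)]+)\) "
    r"(?P<message>.*)"
)


def parse_log_line(line):
    """Return a dict of fields, or None if the line does not match the layout."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    event = match.groupdict()
    # The slide's timestamp uses a comma before the milliseconds.
    event["ts"] = datetime.strptime(event["ts"], "%Y-%m-%d %H:%M:%S,%f")
    return event


SAMPLE = (
    "2013-12-17 05:53:16,208 ERROR [Order Creation Service](456713) "
    "[shpras2](web1234) Could not create order id=456713 - "
    "Cause: Can't connect to database 'shop'"
)

if __name__ == "__main__":
    print(parse_log_line(SAMPLE))
```

A stack trace such as the one on the "Is this a log?" slide does not match this layout, so `parse_log_line` returns `None` for it; that is the practical difference between the two slides.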
  27. Log Collectors (diagram labels): System, Log; Collector; storage; Alerting. Tools: Collectd, Logstash, Flume, Splunk (commercial)
  28. Logstash
       input {
         file {
           path => "/app/logs/apache/*.log"
           type => "apachelog"
         }
       }

       filter {
         if [type] == "apachelog" {
           grok {
             pattern => "%{COMBINEDAPACHELOG}"
           }
         }
       }

       output {
         elasticsearch { host => localhost }
         stdout { }
       }
  29. Logstash
       input {
         file {
           path => "/app/logs/appserver/monitor*.log"
           type => "applog"
         }
       }

       filter {
         if [type] == "applog" {
           grok {
             pattern => "%{TIMESTAMP_ISO8601:ts} %{WORD:severity} …"
           }
         }
       }

       output {
         elasticsearch { host => localhost }
         stdout { }
       }
  30. Rate check • Frequency of an error increases • Activity falls (e.g. frequency of orders) • Alerting based on threshold
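A threshold-based rate check can be sketched as a sliding window over event timestamps. This is an illustrative sketch only: `RateCheck`, its parameters, and the status strings are hypothetical names, and timestamps are assumed to be plain seconds recorded in non-decreasing order.

```python
from collections import deque


class RateCheck:
    """Count events in a sliding time window and compare with thresholds."""

    def __init__(self, window_seconds, max_events=None, min_events=None):
        self.window_seconds = window_seconds
        self.max_events = max_events  # alert when an error becomes too frequent
        self.min_events = min_events  # alert when activity (e.g. orders) falls
        self.events = deque()

    def record(self, timestamp):
        # Assumption: timestamps arrive in non-decreasing order.
        self.events.append(timestamp)

    def count(self, now):
        # Drop events that have left the window, then count the rest.
        while self.events and self.events[0] <= now - self.window_seconds:
            self.events.popleft()
        return len(self.events)

    def check(self, now):
        """Return 'too_many', 'too_few', or None when within thresholds."""
        current = self.count(now)
        if self.max_events is not None and current > self.max_events:
            return "too_many"
        if self.min_events is not None and current < self.min_events:
            return "too_few"
        return None


if __name__ == "__main__":
    errors = RateCheck(window_seconds=60, max_events=3)
    for t in (0, 10, 20, 30):
        errors.record(t)
    print(errors.check(now=30))
```

The same class covers both bullets: `max_events` for an error whose frequency increases, `min_events` for activity that falls.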
  31. Baselining (two time-series charts, 10:00 to 11:10; points A and B marked on the first, C and D on the second)
  32. What would you do if you knew that the number of hits doubles every 2 months (chart: hits per month, Jan to Aug)
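If hits double every 2 months, the load after t months is h0 * 2^(t/2), which gives a quick capacity-planning estimate. The sketch below assumes that exponential model holds; the function names and the example figures (100 hits, a capacity of 800) are made up for illustration and do not come from the talk.

```python
import math


def projected_hits(current_hits, months_ahead, doubling_months=2.0):
    """Project load assuming it doubles every `doubling_months` months."""
    return current_hits * 2 ** (months_ahead / doubling_months)


def months_until(current_hits, capacity, doubling_months=2.0):
    """Months before the projected load reaches `capacity` (0 if already there)."""
    if current_hits >= capacity:
        return 0.0
    return doubling_months * math.log2(capacity / current_hits)


if __name__ == "__main__":
    # Hypothetical numbers: 100 hits today, platform sized for 800.
    print(projected_hits(100, months_ahead=6))
    print(months_until(100, capacity=800))
```

Doubling every 2 months is roughly 41% growth per month, so a platform with eightfold headroom is saturated in about six months under this model.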
  33. Graphers • Foresight (chart by month, Jan to Dec) • Cycles (chart by day of week) • Correlation (chart by day of week) • Distribution (chart, April to July)
  34. Storage / Visualization: Collectors (Collectd / StatsD / Logstash / Flume); Plain / REST; Graphite. docker: lopter/collectd-graphite
  35. Collect and Share: collect once and share with • Support • Ops, Dev • Business. Up to date, flexible.
  36. Storage / Visualization: Collectors (Collectd / StatsD / Logstash / Flume); Plain / REST; Graphite; InfluxDB, Grafana; Logstash, ElasticSearch, Kibana (docker: gsogol/docker-elk)
  37. JMX (image source: Wikipedia) • MBeans • Registration • Servo • RMI and firewalls: -Dcom.sun.management.jmxremote.rmi.port=p and -Djava.rmi.server.hostname=n.n.n.n • Jolokia • jmxtrans
  38. JMX Collectors (diagram labels): JMX-enabled app, JMX beans; Collector (logstash, collectd); storage; VisualVM, JConsole; App Performance Monitoring tools
  39. JSON Event over REST:
       curl -X POST "…" -d '{
         "ts": "2013-12-17 05:53:16,208",
         "type": "metric",
         "module": "Order Creation Service",
         "module-id": "456713",
         "instance": "shpras2",
         "thread": "web1234",
         "name": "order-creation",
         "duration": "12",
         "unit": "ms"
       }'
       Annotations: "ts" is the timestamp; "module", "module-id", "instance", and "thread" are the context (technical and business); "name", "duration", and "unit" are the metric.
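An application can emit the same kind of JSON event itself instead of going through curl. The sketch below uses only the Python standard library and mirrors the field names of the slide; the endpoint URL is elided on the slide, so `http://localhost:8080/events` is a placeholder assumption, and `build_event`, `build_request`, and `send_event` are hypothetical helpers.

```python
import json
import urllib.request

# Placeholder only: the slide elides the real endpoint.
EVENT_URL = "http://localhost:8080/events"


def build_event(ts, module, module_id, instance, thread, name, duration, unit="ms"):
    """Assemble a metric event with the field names used on the slide."""
    return {
        "ts": ts,
        "type": "metric",
        # Context (technical and business)
        "module": module,
        "module-id": module_id,
        "instance": instance,
        "thread": thread,
        # Metric
        "name": name,
        "duration": duration,
        "unit": unit,
    }


def build_request(event, url=EVENT_URL):
    """Wrap the event in an HTTP POST request with a JSON body."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_event(event, url=EVENT_URL, timeout=2.0):
    """Send the event; returns the HTTP status code. Needs a live collector."""
    with urllib.request.urlopen(build_request(event, url), timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    event = build_event(
        "2013-12-17 05:53:16,208", "Order Creation Service", "456713",
        "shpras2", "web1234", "order-creation", "12",
    )
    print(build_request(event).data.decode("utf-8"))
```

Building the request separately from sending it keeps the payload easy to inspect and test without a collector running.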
  40. What would you do if you knew why the app is slow
  41. Tuning • Collectd/StatsD plugins • Metrics • Commercial: Plumbr, AppDynamics, New Relic. Where does it spend time? Why? Cross-check metrics from various sub-systems (Front-End, Back-End, DB, System).
  42. What would you do if you knew that users struggle to find the order form
  43. Web Analytics / User tracking • Web analytics • Page counters • Tagging • Log parser • Google Analytics • Piwik (docker: cfalguiere/docker-piwik) • Reporting APIs
  44. What would you do if you knew what users want to buy
  45. Model vs Big Data. Model: • Expected information • Explicit model • List of metrics. Big Data: • Classification • Machine learning • Pattern detection. Highlights valuable metrics and relationships.
  46. Getting started: list user stories and metrics, set up monitoring, get facts, get a hypothesis, add metrics, validate the hypothesis, get facts again
  47. What should I monitor? Alerting & Post-Mortem: • Presence check • Activity (how many users, requests, orders …) • Resources that are limited in size: physical (CPU, memory, free disk space, network bandwidth ...) and logical (pools, queues, caches, …) • Errors • Others
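Checks on resources that are limited in size usually reduce to comparing a used amount with a limit. The following is a minimal sketch with assumed thresholds (warn at 80%, critical at 100%); `check_limited_resource` and `free_disk_ratio` are hypothetical helpers, and only `shutil.disk_usage` is a real standard-library call.

```python
import shutil


def check_limited_resource(name, used, limit, warn_ratio=0.8):
    """Classify the usage of a bounded resource (pool, queue, cache, disk)."""
    ratio = used / limit
    if ratio >= 1.0:
        status = "critical"  # the resource is exhausted
    elif ratio >= warn_ratio:
        status = "warning"   # assumed threshold, tune per resource
    else:
        status = "ok"
    return {"name": name, "used": used, "limit": limit,
            "ratio": ratio, "status": status}


def free_disk_ratio(path="."):
    """Fraction of free space on the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


if __name__ == "__main__":
    # Hypothetical logical resource: a connection pool of 50 with 45 in use.
    print(check_limited_resource("db-pool", used=45, limit=50))
    print("free disk ratio:", free_disk_ratio())
```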
  48. What should I monitor? Plan & Improve: • Any information that is useful to understand the process • Time spent in each major step • Things that are done often or require large datasets • User navigation • Context. Listen to users and ops.
  49. Learn from data: Continuous Improvement, Design for Failure
  50. Thank You
  • henri.gomez

    Nov. 18, 2014
