Resin | Application Server Health System | Java Monitoring

Published on

http://www.caucho.com/resin-application-server/server-monitoring-watchdog-health-system/

The Resin application server has a health system that does Java monitoring and server monitoring.

Published in: Technology

  1. Resin Health System: Beyond Java Monitoring and Server Monitoring
     Health Checks, Watchdog and Snapshot Report
     Caucho Home | Contact Us | Caucho Blog | Wiki | Application Server / Web Server
     Copyright (c) 1998-2012 Caucho Technology, Inc. All rights reserved. caucho®, resin® and quercus® are registered trademarks of Caucho Technology, Inc.
  2. Gartner names Caucho in "Cool Vendors in Platform and Integration Middleware" (badge on slide: Java EE Certified)
  3. Resin Health System (RHS) Overview
     • Resin Health System (RHS)
     • Goes beyond just monitoring the server and JVM
     • Can respond to conditions with actions
     • Actions can remediate problems
     • If the server is about to go down
         • due to a bug, denial of service, or spike
         • RHS triggers diagnostics, then restarts
         • Resin Application Server keeps running
  4. RHS: Reliability and System Transparency
  5. RHS born from need
     • Idea for RHS came from doing Resin support
     • Thread lock? Can you do a thread dump when you see the problem?
     • Running out of memory? Can you do a heap dump?
     • How is your machine configured? What version?
         • What OS?
  6. RHS: By Engineers for Engineers
  7. Major features of Resin Health System (RHS)
     • Ability to respond to problems
     • Detect JVM and OS issues
     • Avoid zombie processes
     • Restarts Resin if there are major problems
     • Internal monitoring
         • Resin Internal WatchDog Thread
         • Watches internal meters for problems
         • Periodic Thread
     • External monitoring
         • Resin WatchDog Process
         • Uses process control, socket connection and periodic ping to determine uptime status
     • Advanced Reporting PDF
         • Post-mortem analysis
         • Thread Dump/Log Dump
         • Meters and Graphs
         • Heap Dump
  8. RHS Tracks Metrics
     • Metrics are things like Available Memory, Number of Requests Per Minute, Garbage Collection Time, CPU Load, etc.
     • Metrics can be graphed
     • Tracks historical data for trends
     • Can determine anomalies
     • Can determine trends
     • Can compare current data with baseline data
  9. Visualization
     • You can view the data that Health System collects
         • Resin Web Admin
         • Watchdog Report
             • Post-mortem PDF report
         • Snapshot Report
             • PDF report you can generate anytime
             • Trigger: CLI, REST, through Web Admin
  10. RHS and Web Admin
  11. RHS: Health Checks
      • RHS is highly configurable
      • Similar to Resin's "URL Rewrite" rules
      • Rules are configurable
          • checks,
          • conditions,
          • actions
      • Internal Watchdog periodic checks
  12. Watchdog process
      • Lightweight process: used to stop and start Resin instances
      • Can restart an instance if there is a Java Monitoring / Server Monitoring / Health issue
      • Parent process of Resin Server
      • Opens socket to Resin Server
      • Sends an "are you alive?" ping
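The ping-and-restart behaviour on slide 12 can be sketched as a small loop. This is a minimal illustration and not Resin's watchdog code: the class `WatchdogSketch`, its `tick` and `tcpPing` methods, and the `maxMissedPings` threshold are all hypothetical names introduced here, and the slides do not say how many missed pings trigger a restart.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of a watchdog ping loop; not Resin's implementation. */
final class WatchdogSketch {
    private final BooleanSupplier ping;  // true when the server answered the ping
    private final Runnable restart;      // how to restart the server instance
    private final int maxMissedPings;    // assumed threshold, not from the slides
    private int missed = 0;

    WatchdogSketch(BooleanSupplier ping, Runnable restart, int maxMissedPings) {
        this.ping = ping;
        this.restart = restart;
        this.maxMissedPings = maxMissedPings;
    }

    /** Runs one ping cycle; returns true if this cycle triggered a restart. */
    boolean tick() {
        if (ping.getAsBoolean()) {
            missed = 0;              // a healthy answer resets the miss counter
            return false;
        }
        missed++;
        if (missed < maxMissedPings) {
            return false;            // tolerate a few missed pings
        }
        missed = 0;
        restart.run();               // too many misses in a row: restart
        return true;
    }

    /** Simplest possible "ping": can a TCP connection be opened at all? */
    static boolean tcpPing(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

Passing the ping as a `BooleanSupplier` keeps the restart policy separate from the transport, so the same loop could be driven by `tcpPing` or by a richer application-level check.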
  13. Watchdog Non Stop Mode
  14. Watchdog Non Stop Mode
      • Resin is resilient
      • If a denial of service, unexpected spike, or bug knocks down the JVM, Resin restarts
      • Beyond that, Resin can detect critical problems and run critical diagnostics so DevOps and developers can get to the root of the problem
      • Resin has long been a product of choice for embedded devices, network appliances and large deployments
      • Non Stop mode makes Resin perfect for cloud deployments
  15. Resin Watch-Dog (diagram label: Watchdog Process)
  16. Resin Watch-Dog (diagram label: Watchdog Process)
  17. Resin Watch-Dog: Starting Resin (diagram labels: Process Ownership, Resin, Watchdog Process, TCP Link)
  18. Resin Watch-Dog: Non-Stop Up State (diagram labels: Process Ownership, Resin, Watchdog Process, TCP Link)
  19. Resin Watch-Dog: Non-Stop Up State (diagram labels: Process Ownership, Resin, Watchdog Process, TCP Link)
  20. Resin Watch-Dog: Non-Stop Up State (diagram label: Watchdog Process)
  21. Internal Watchdog Thread Inside of Resin (diagram labels: Watchdog Process, Resin Health System Watchdog Thread, Resin Process)
  22. Internal Watchdog Thread Inside of Resin (diagram labels: Watchdog Process, Resin Health System Watchdog Thread, Resin Process)
  23. Internal Watchdog Health Thread
      • Runs inside of Resin Server
      • Runs periodically
          • Collects data
          • Collects baseline data
      • Executes a series of checks
      • Rechecks failed conditions
      • Performs actions when conditions are critical or fatal
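One cycle of the health thread on slide 23 (run checks, recheck failures, act on critical or fatal) might look roughly like the sketch below. It is an assumption-laden illustration, not the RHS API: `HealthCycleSketch`, `runCycle` and the `Status` enum are hypothetical, the slide names only the critical and fatal levels (`OK` and `WARNING` are assumed), and a real recheck would wait for a recheck period instead of running immediately.

```java
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical sketch of one health-check cycle; not the RHS API. */
final class HealthCycleSketch {
    /** CRITICAL and FATAL come from the slides; OK and WARNING are assumed. */
    enum Status { OK, WARNING, CRITICAL, FATAL }

    /**
     * Runs every check once, reruns any check that did not report OK,
     * and fires the action if the worst confirmed status is CRITICAL or FATAL.
     */
    static Status runCycle(List<Supplier<Status>> checks, Runnable action) {
        Status worst = Status.OK;
        for (Supplier<Status> check : checks) {
            Status status = check.get();
            if (status != Status.OK) {
                status = check.get();    // recheck before trusting a failure
            }
            if (status.compareTo(worst) > 0) {
                worst = status;          // enum order doubles as severity order
            }
        }
        if (worst.compareTo(Status.CRITICAL) >= 0) {
            action.run();
        }
        return worst;
    }
}
```

The immediate recheck filters out one-off blips: a check that fails once and then passes does not trigger the action.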
  24. Resin Java CDI / CanDI and Resin Conf based
      • RHS configuration extends the Resin configuration file resin.xml
      • RHS uses CanDI (Resin's Java CDI)
          • to create and update Java objects,
          • each XML tag exactly matches either a Java class or a Java property
      • CanDI means the classes and config are documented in the JavaDocs
          • Use the HealthSystem JavaDoc
          • Use the JavaDoc of the various checks, actions,
  25. Java Doc / XML conf of RHS
      • Startup delay: wait for baseline data before recording
      • Period: how often to check metrics
      • Recheck period: if some level has been crossed, how often RHS should recheck to see if things are better
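The three timing settings on slide 25 interact in a simple way, which the following sketch tries to make concrete. `HealthScheduleSketch` and `nextDelay` are hypothetical names, not Resin configuration properties or classes, and the durations in the usage note are illustrative values only.

```java
import java.time.Duration;

/** Hypothetical sketch of how the three timing settings could interact. */
final class HealthScheduleSketch {
    private final Duration startupDelay;   // wait before the first check
    private final Duration period;         // normal gap between checks
    private final Duration recheckPeriod;  // gap used after a level was crossed

    HealthScheduleSketch(Duration startupDelay, Duration period, Duration recheckPeriod) {
        this.startupDelay = startupDelay;
        this.period = period;
        this.recheckPeriod = recheckPeriod;
    }

    /** Returns how long to wait before the next health check. */
    Duration nextDelay(boolean firstRun, boolean lastCheckFailed) {
        if (firstRun) {
            return startupDelay;
        }
        return lastCheckFailed ? recheckPeriod : period;
    }
}
```

For example, `new HealthScheduleSketch(Duration.ofMinutes(15), Duration.ofMinutes(5), Duration.ofMinutes(1))` would wait 15 minutes after startup, then check every 5 minutes, dropping to 1 minute while a check is failing.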
  26. Types of Health Checks
  27. Health Checks produce Status
  28. Resin Checks and Responds
  29. Health System Actions
  30. Actions based on condition
      • Actions can be grouped
      • If in critical state for two minutes, perform a group of actions
      • Dump JMX values, Dump Threads, Dump Heap, CPU Profile, Restart
      • If the actions take longer than 10 minutes, restart
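The rule on slide 30 (run a group of actions once the server has been critical for two minutes) can be sketched as a small state machine. This is an illustration under stated assumptions: `CriticalActionGroupSketch` and `observe` are hypothetical names, timestamps are passed in explicitly so the logic is easy to test, and the 10-minute timeout on the actions themselves is left out.

```java
import java.time.Duration;
import java.util.List;

/** Hypothetical sketch: run a group of actions after a sustained critical state. */
final class CriticalActionGroupSketch {
    private final Duration threshold;       // e.g. two minutes, as on the slide
    private final List<Runnable> actions;   // executed in order as one group
    private Long criticalSinceMillis = null;
    private boolean fired = false;

    CriticalActionGroupSketch(Duration threshold, List<Runnable> actions) {
        this.threshold = threshold;
        this.actions = actions;
    }

    /** Feeds one observation; returns true if the action group ran on this call. */
    boolean observe(boolean critical, long nowMillis) {
        if (!critical) {
            criticalSinceMillis = null;     // recovery resets the timer
            fired = false;
            return false;
        }
        if (criticalSinceMillis == null) {
            criticalSinceMillis = nowMillis;
        }
        boolean sustained = nowMillis - criticalSinceMillis >= threshold.toMillis();
        if (fired || !sustained) {
            return false;
        }
        fired = true;                       // run the group once per episode
        for (Runnable action : actions) {
            action.run();
        }
        return true;
    }
}
```

Ordering the list as dump JMX values, dump threads, dump heap, CPU profile, restart mirrors the slide: diagnostics are captured first because the restart destroys the evidence.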
  31. Collect data needed to diagnose
      • When something goes wrong
          • Denial of Service Attack
          • Application Bug
          • Unexpected Spike
      • RHS collects the metrics you need to diagnose the problem
      • Without collection, you are flying blind
      (diagram labels: Bug, Denial of Service, Spike)
  32. Actions better than just watching
  33. Watchdog report (PDF)
      • Post Mortem Report
      • Environment Info
      • Server Metrics
      • JVM Metrics
      • Thread Dump
      • Heap Dump
      • Metrics Graph
  34. Watchdog report (PDF)
      • Post Mortem Report
      • Environment Info
      • Server Metrics
      • JVM Metrics
      • Thread Dump
      • Heap Dump
      • Metrics Graph
  35. Environment Data
      • Collect critical information about the environment
      • When,
      • What OS,
      • What version of Resin
      • How Resin was started
      • And much more
  36. Health Status
      • Status of Health Checks in Report
  37. Recent Errors and Warnings
      • Recent Errors and Warnings
  38. Anomalies
      • Health Checking stores a baseline
      • Anomalies are configurable triggers based on large changes from the expected baseline
      • Anomaly detection is configurable and can trigger actions
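Slide 38 does not say how RHS decides that a change from baseline is "large". One common approach, shown here purely as an assumed stand-in, is to keep a running mean and standard deviation and flag samples that fall more than a chosen number of standard deviations away. `AnomalySketch`, `record`, `isAnomaly`, `sigmas` and `minSamples` are hypothetical names, not RHS settings.

```java
/** Hypothetical baseline/anomaly sketch; the slides do not specify RHS's method. */
final class AnomalySketch {
    private final double sigmas;   // how many standard deviations count as "large"
    private final int minSamples;  // baseline size required before flagging
    private long count = 0;
    private double mean = 0.0;
    private double m2 = 0.0;       // running sum of squared deviations (Welford)

    AnomalySketch(double sigmas, int minSamples) {
        if (minSamples < 2) {
            throw new IllegalArgumentException("minSamples must be at least 2");
        }
        this.sigmas = sigmas;
        this.minSamples = minSamples;
    }

    /** Adds a sample to the baseline using Welford's online update. */
    void record(double value) {
        count++;
        double delta = value - mean;
        mean += delta / count;
        m2 += delta * (value - mean);
    }

    /** True if the value is far from the baseline; false until a baseline exists. */
    boolean isAnomaly(double value) {
        if (count < minSamples) {
            return false;
        }
        double stdDev = Math.sqrt(m2 / (count - 1));  // sample standard deviation
        return Math.abs(value - mean) > sigmas * stdDev;
    }
}
```

Requiring `minSamples` before flagging anything plays the same role as the startup delay described for RHS: no alarms until there is a baseline to compare against.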
  39. Understanding Anomaly Detection
  40. Understanding Anomaly Detection
  41. Types of Metric Graphs in Report
      • Cluster Status
      • Request Count
      • Request Time
      • HTTP Request Errors
      • Log Warnings
      • Threads
      • CPU Usage
      • Database Connection Active
      • Database Query Time
      • NetStat
      • JVM Memory
      • Heap Used
      • Tenured Used
      • PermGen Used
      • GC Time
  42. Sample Graphs: Memory and GC Time
  43. Sample Metric Graphs: Request
  44. GC and Memory Metrics
  45. Heap Dump
      • Heap dump is critical for tracking down memory leaks
      • Also generates an hprof file which can be analyzed by many third-party tools
  46. CPU Profile / Thread Dumps
      • Critical for debugging thread deadlock issues
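For readers who have not taken a thread dump programmatically, the standard JDK `java.lang.management` API shows the kind of data such a dump contains. This sketch uses only JDK calls and makes no claim about how Resin produces its own dumps; `ThreadDumpSketch` is a hypothetical class name.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

/** Sketch using the standard JDK management API; not Resin's dump code. */
final class ThreadDumpSketch {
    /** Returns a plain-text dump: one header line per thread, then its stack. */
    static String dumpThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    /** Number of threads the JVM currently reports as deadlocked. */
    static int deadlockedThreadCount() {
        long[] ids = ManagementFactory.getThreadMXBean().findDeadlockedThreads();
        return ids == null ? 0 : ids.length;  // null means no deadlock was found
    }
}
```

Calling `deadlockedThreadCount()` from a periodic check and `dumpThreads()` when it is non-zero reproduces, in miniature, the check-then-diagnose pattern the slides describe.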
  47. Snapshot report
      • Reports the same type of data as the watchdog report
      • The watchdog report is a post-mortem analysis
      • Snapshots can be taken whenever you like
          • e.g., during a stress test
          • trigger via REST, CLI and Web Admin
  48. Conclusion
  49. More Background Info About Health System
      • Resin Health System: Java Monitoring and Server Monitoring built into Resin Application Server
      • Resin Health System: Current and Into the Future
      • Resin Application Server Fulfills Vision of Cloud Computing
      • Resin Health System Enhancements
  50. More Info
      • Caucho Technology | Home Page
      • Resin | Application Server
      • Resin | Java EE Web Profile Application Server
      • Resin - Cloud Support | 3G - Java Clustering
      • Resin | Java CDI | Dependency Injection / IoC
      • Resin - Health System | Java Monitoring and Server Monitoring
      • Download Resin | Application Server
      • Watch Resin | Application Server Featured Video
  51. Resin Java Application Server
