Hatohol technical-brief-20130830-hbstudy

Project Hatohol
https://github.com/project-hatohol/hatohol
Kazuhiro Yamato
Teruo Oshida

Published in: Technology


  1. Hatohol Technical Brief for hbstudy, 2013/08/30. Project Hatohol, https://github.com/project-hatohol/hatohol. Kazuhiro Yamato, Teruo Oshida. Copyright © 2013 Project Hatohol, All Rights Reserved.
  2. Chapter 2: Hatohol Server architecture
  3. Hatohol Server
  4. An image of Hatohol Server [diagram: two ZABBIX Servers and two Nagios instances feed Hatohol Server, which provides the merged data]
  5. Input and output interfaces of Hatohol Server
      - Input from ZABBIX Server: ZABBIX API (JSON-RPC)
      - Input from Nagios: MySQL DB written by NDOUtils, read over the MySQL protocol
      - Output (merged data): REST + JSON (HTTP)
      - Other clients can be connected, e.g. Android, iOS, and native apps.
  6. Internal data flow (rough sketch) [diagram inside Hatohol Server: ArmZabbixAPI and ArmNagiosNDOUtils, each with a cache; DBClientZabbix; UnifiedDataStore; FaceRest]
  7. In terms of databases
      - Zabbix Server and Nagios NDOUtils: their own databases (MySQL, PostgreSQL, ...)
      - Hatohol Server cache: SQLite3
        - Easy to use, light, and fast when used from a single application.
        - No need to make backups.
      - Hatohol Server config. and log: MySQL
        - Important (we cannot lose them).
        - An HA configuration is possible.
  8. In terms of threads [diagram: ArmZabbixAPI, ArmNagiosNDOUtils, FaceRest, UnifiedDataStore, DBClientZabbix; main() with the GLIB event loop]
  9. Data from ZABBIX [diagram: ZABBIX Server (zabbix_server, apache, PHP, DB) to ArmZabbixAPI in Hatohol Server, which uses libsoup and json-glib]
      - Traffic is small (only updated data is transferred).
      - Hatohol Server can handle many ZABBIX servers.
  10. Data from ZABBIX
      ● Call the ZABBIX API at periodic intervals (polling)
        ○ Triggers: current status (over or under the threshold?)
        ○ Events: history of trigger status changes
        ○ Items (all items of all hosts)
          ■ An on-demand fetching version has been developed (turned off by default)
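
The polling described above comes down to sending JSON-RPC 2.0 requests to the ZABBIX API. The following is a minimal Python sketch of building one polling round, not Hatohol's actual code (the server is written in C++ with libsoup and json-glib). The helper names `build_request` and `build_poll_requests`, the token, and the choice of the `lastChangeSince` and `eventid_from` filters are assumptions made for illustration.

```python
import json
from itertools import count

# JSON-RPC needs an id so that a response can be paired with its request.
_request_ids = count(1)


def build_request(method, params, auth_token):
    """Return the JSON body of one ZABBIX API (JSON-RPC 2.0) call."""
    return json.dumps({
        "jsonrpc": "2.0",
        "method": method,
        "params": params,
        "auth": auth_token,  # session token obtained at login (assumed here)
        "id": next(_request_ids),
    })


def build_poll_requests(auth_token, last_change_time, last_event_id):
    """Build one polling round that asks only for data newer than the last poll."""
    return [
        build_request(
            "trigger.get",
            {"output": "extend", "lastChangeSince": last_change_time},
            auth_token,
        ),
        build_request(
            "event.get",
            {"output": "extend", "eventid_from": last_event_id + 1},
            auth_token,
        ),
    ]
```

Asking only for triggers and events newer than the previous poll is one way to keep the traffic small, as the preceding slide notes.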
  11. Data from Nagios (NDOUtils) [diagram: nagios with ndomod (broker module), ndo2db (another process), MySQL, then ArmNDOUtils in Hatohol Server via mysqlclient; figure label: "doesn't have save feature"]
  12. Data from Nagios (NDOUtils)
      ● Issue SQL statements at periodic intervals (polling)
        ○ Triggers (servicestatus)
        ○ Events (statehistory)
        ○ Items (servicestatus)
          ■ An on-demand fetching version has not been developed
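
SQL polling of the state history can be sketched as an incremental query that remembers the last row it has seen. This Python sketch uses an in-memory SQLite database as a stand-in for the MySQL database written by ndo2db. The table layout is a simplified assumption (the real NDOUtils schema has more columns and different types), and `fetch_new_events` and `poll_once` are hypothetical names.

```python
import sqlite3


def fetch_new_events(conn, last_id):
    """Return state-history rows with an id greater than last_id, oldest first."""
    cur = conn.execute(
        "SELECT statehistory_id, state_time, object_id, state, output "
        "FROM nagios_statehistory WHERE statehistory_id > ? "
        "ORDER BY statehistory_id",
        (last_id,),
    )
    return cur.fetchall()


def poll_once(conn, last_id):
    """Run one polling round and advance the high-water mark."""
    rows = fetch_new_events(conn, last_id)
    if rows:
        last_id = rows[-1][0]
    return rows, last_id


# Stand-in database with a simplified, assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nagios_statehistory ("
    "statehistory_id INTEGER PRIMARY KEY, state_time INTEGER, "
    "object_id INTEGER, state INTEGER, output TEXT)"
)
conn.executemany(
    "INSERT INTO nagios_statehistory VALUES (?, ?, ?, ?, ?)",
    [(1, 1372979757, 10, 0, "PING OK"), (2, 1372979797, 11, 2, "HTTP CRITICAL")],
)
```

Calling `poll_once(conn, 0)` returns both rows, and a second call with the returned id returns nothing until new rows arrive.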
  13. FaceRest (built on libsoup and json-glib)
      ● Example
        ○ http://localhost:33194/triggers.json
      {
        "apiVersion" : 1,
        "result" : true,
        "numberOfTriggers" : 11,
        "triggers" : [
          { "status" : 0, "severity" : 3, "lastChangeTime" : 1372979757, "serverId" : 2, "hostId" : 1, "brief" : "OK - load average: 0.03, 0.08, 0.12" },
          { "status" : 0, "severity" : 3, "lastChangeTime" : 1372979797, "serverId" : 2, "hostId" : 1, "brief" : "USERS OK - 0 users currently logged in" },
          { "status" : 0, "severity" : 3, "lastChangeTime" : 1372979877, "serverId" : 2, "hostId" : 1, "brief" : "HTTP OK: HTTP/1.1 200 OK - 289 bytes in 0.001 second response time" },
          { "status" : 0, "severity" : 3, "lastChangeTime" : 1372979877, "serverId" : 2, "hostId" : 1, "brief" : "PING OK - Packet loss = 0%, RTA = 0.04 ms" },
          { "status" : 0, "severity" : 3, "lastChangeTime" : 1372979907, "serverId" : 2, "hostId" : 1, "brief" : "DISK OK - free space: / 46194 MB (96% inode=98%):" },
          ….
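
A client of FaceRest only needs an HTTP GET and a JSON parser. The Python sketch below parses a reply shaped like the example above and counts triggers per server and severity. `SAMPLE` is a shortened copy of the slide's reply (with `numberOfTriggers` adjusted to match), and `summarize_triggers` is a hypothetical helper, not part of Hatohol.

```python
import json
from collections import Counter

# Shortened copy of the reply shown on the slide.
SAMPLE = """
{
  "apiVersion": 1,
  "result": true,
  "numberOfTriggers": 2,
  "triggers": [
    {"status": 0, "severity": 3, "lastChangeTime": 1372979757,
     "serverId": 2, "hostId": 1,
     "brief": "OK - load average: 0.03, 0.08, 0.12"},
    {"status": 0, "severity": 3, "lastChangeTime": 1372979797,
     "serverId": 2, "hostId": 1,
     "brief": "USERS OK - 0 users currently logged in"}
  ]
}
"""


def summarize_triggers(body):
    """Count triggers per (serverId, severity) in a triggers.json reply."""
    reply = json.loads(body)
    if not reply.get("result"):
        raise ValueError("FaceRest reported a failure")
    return Counter((t["serverId"], t["severity"]) for t in reply["triggers"])


summary = summarize_triggers(SAMPLE)
```

Against a running server, the body could instead be read with `urllib.request.urlopen("http://localhost:33194/triggers.json")`.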
  14. Development language
      ● C++
        ○ I like it, and I like low-level programming.
        ○ Fast
        ○ Flexibility
          ■ Can manage memory, threads, etc.
          ■ Fine control
            ● use of system calls
            ● libraries (GLIB, libsoup, ...)
            ● system-specific features
  15. Chapter 3: Hatohol Client description
  16. Menu (outline)
      - PRIMARY COMPONENTS
      - FILES around a PAGE
      - DATASTREAM around BROWSER
      - PRIMARY ELEMENTS in USERCODE
      - TEST at SERVER SIDE
  17. Hatohol Client
  18. PRIMARY COMPONENTS
      - Django (https://www.djangoproject.com/)
        - web application framework for Python
        - WSGI compliant
      - bootstrap v2 (http://getbootstrap.com/2.3.2/)
        - Twitter's CSS framework (toolkit)
      - jQuery (http://jquery.com/)
  19. (no text on this slide)
  20. FILES around a PAGE
      - Django view template, base
      - Django view template, individual html
      - library js
      - usercode js (written in the individual html)
      Example for the dashboard page:
        $PRJ_HOME/viewer/base_ajax.html
        $PRJ_HOME/viewer/dashboard_ajax.html
        $PRJ_HOME/static/js/library.js
  21. BLOCKS DEFINED in TEMPLATE BASE (base_ajax.html)
      Blocks: 1. title 2. navbar 3. main 4. option 5. logic
      <head>
        <meta name="viewport" content="width=device-width, initial-scale=1.0"></meta>
        <link href="{{ STATIC_URL }}css/bootstrap.css" rel="stylesheet" media="screen"></link>
        <link href="{{ STATIC_URL }}css/zabbix.css" rel="stylesheet" media="screen"></link>
        <title> {% block title %} {% endblock %} </title>
      </head>
      <body style="padding-top: 60px">
        <div class="navbar navbar-inverse navbar-fixed-top">
          <div class="navbar-inner">
            <a class="brand">&nbsp;<i>Hatohol</i>&nbsp;</a>
            <ul class="nav"> {% block navbar %} {% endblock %} </ul>
            <div class="pull-right btn-group" id ="sts">
              <button class="btn dropdown-toggle btn-info" data-toggle="dropdown">
                <span>PREPARE</span> <span class="caret"></span>
              </button>
              <ul class="dropdown-menu"> <li>Preparing</li> </ul>
            </div>
          </div>
        </div>
        {% block main %} {% endblock %}
        <script src="{{ STATIC_URL }}js/jquery.js"></script>
        <script src="{{ STATIC_URL }}js/bootstrap.js"></script>
        <script src="{{ STATIC_URL }}js/library.js"></script>
        {% block option %} {% endblock %}
        {% block logic %} {% endblock %}
        <div style="text-align: center;">Copyright &copy; 2013 Project Hatohol</div>
      </body>
  22. DATA STREAM around BROWSER
      - ① HTTP: Django serves html, css, js, and images to Hatohol Client.
      - ② REST API on HTTP: json from Server, through R.Proxy (implemented with Django).
      - Hatohol Client: browse the page (①), execute the js usercode, call $.getJSON() (②), draw the data.
  23. DATA STREAM around BROWSER / minimize
      - ① HTTP: apache serves html, css, js, and images, which can be static and pre-compiled.
      - ② REST API on HTTP: json from Server, through R.Proxy.
      - Hatohol Client: browse the page (①), execute the js usercode, call $.getJSON() (②), draw the data.
  24. PRIMARY ELEMENTS in USERCODE
      - html objects: table
      - variables: raw data (JSON, just as returned by Server), parsed data
      - functions: parseData(), drawData()
      Flow: raw data, parseData(), parsed data, drawData(), <table> : </table>
      drawData() uses $("#table").empty() and $("#table").append(...);
  25. SAMPLE of RAW DATA (events_ajax.html)
      {
        apiVersion : 1,
        result : true,
        numberOfEvents : 2229,
        events : [
          { serverId: 1, hostId: 10084, triggerId: 13520, time: 1368496195, type: 0, status: 0, severity: 2, brief: "Free disk space is less than 20%" },
          { serverId: 1, hostId: 10084, triggerId: 13498, time: 1368496761, type: 1, status: 0, severity: 2, brief: "Disk I/O is overloaded" },
          { serverId: 2, hostId: 10061, triggerId: 13681, time: 1359087529, type: 2, status: 0, severity: 3, brief: "FTP server is down" },
          { serverId: 2, hostId: 10061, triggerId: 13682, time: 1359087530, type: 2, status: 1, severity: 3, brief: "WEB (HTTP) server is down" },
          { serverId: 2, hostId: 10061, triggerId: 13683, time: 1359087531, type: 2, status: 1, severity: 3, brief: "IMAP server is down" },
          { serverId: 2, hostId: 10061, triggerId: 13684, time: 1359087532, type: 2, status: 1, severity: 3, brief: "News (NNTP) server is down" },
          { serverId: 2, hostId: 10061, triggerId: 13685, time: 1359087533, type: 2, status: 1, severity: 3, brief: "POP3 server is down" },
          { serverId: 2, hostId: 10061, triggerId: 13686, time: 1359087534, type: 2, status: 0, severity: 3, brief: "Email (SMTP) server is down" },
          { serverId: 2, hostId: 10061, triggerId: 13700, time: 1359087530, type: 2, status: 0, severity: 4, brief: "Lack of free swap space" },
        ],
      }
  26. (no text on this slide)
  27. SAMPLE of PARSED DATA (events_ajax.html)
      {
        diffs: {
          1: # serverId
          {
            13001: # triggerId
            {
              1359087525: 5, # time and delta of time
              1359087530: 10,
              1359087540: 99999,
            },
            13002: {
              1359080912: 99999,
            },
          },
          2: {
            13001: {
              1359087600: 100,
              1359087700: 100,
              1359087800: 99999,
            },
          }
        },
      }
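
The step from raw data to parsed data can be sketched as follows. This is a Python illustration of the transformation, not the project's JavaScript `parseData()`. It assumes, from the sample alone, that each delta is the time until the next event of the same trigger and that 99999 marks the newest event.

```python
from collections import defaultdict

# Sentinel that the sample uses for the newest event of a trigger (assumed meaning).
LAST_EVENT_DELTA = 99999


def parse_data(raw):
    """Group event times by server and trigger, then map each time to its delta."""
    times = defaultdict(list)
    for event in raw["events"]:
        times[(event["serverId"], event["triggerId"])].append(event["time"])

    diffs = {}
    for (server_id, trigger_id), stamps in times.items():
        stamps = sorted(set(stamps))
        deltas = {}
        # Pair every timestamp with the one that follows it.
        for current, following in zip(stamps, stamps[1:]):
            deltas[current] = following - current
        deltas[stamps[-1]] = LAST_EVENT_DELTA
        diffs.setdefault(server_id, {})[trigger_id] = deltas
    return {"diffs": diffs}
```

Keeping this step free of any DOM access is what makes its input and output easy to test in isolation.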
  28. TEST POINT in USERCODE
      - input and output of parseData()
      - input and output of drawData()
      Flow: raw data, parseData(), parsed data, drawData(), <table> : </table>
  29. TEST at SERVER SIDE (EXPERIMENTAL: on the commonjs branch)
      - runnable on both the server side and the client side (in the browser)
      - runs with teajs (formerly called v8cgi), http://code.google.com/p/teajs/
      - CommonJS compliant, http://www.commonjs.org/
        - Modules/1.x
        - Unit Testing/1.0
      - no loader is used
        - alternative: the Django view template directive 'include'
  30. Chapter 4: Future Plan
  31. Milestone (https://github.com/project-hatohol/hatohol/issues/milestones)
      Releases: 2013/09 v.0.1, 2013/12 v.0.2, 2014/03 v.0.3
      Major features:
      - Action framework
      - HA configuration (experimental)
      - Useful action templates & examples
      - Graph
      - Sophisticated UI in WebClient for making actions
  32. Action
      When ZABBIX or Nagios detects trouble (the condition), Hatohol runs an action.
      Different actions can be selected by a combination of:
      ● Server ID (ID of the Zabbix server or Nagios)
      ● Host ID
      ● Host group ID
      ● Trigger ID
      ● Trigger status (OK or Problem)
      ● Trigger severity
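
Selecting an action by a combination of these fields can be pictured as a condition in which unset fields match anything. The Python sketch below is only an illustration of that idea. `ActionCondition`, its field names, and the exact-match rule (including for severity) are assumptions, not Hatohol's design.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ActionCondition:
    """One action condition; a field left as None matches any value."""
    server_id: Optional[int] = None
    host_id: Optional[int] = None
    host_group_id: Optional[int] = None
    trigger_id: Optional[int] = None
    trigger_status: Optional[int] = None
    trigger_severity: Optional[int] = None

    def matches(self, event):
        """Return True if every field that is set equals the event's value."""
        for name, wanted in vars(self).items():
            if wanted is not None and event.get(name) != wanted:
                return False
        return True


# Fire only for severity-3 triggers coming from server 2.
condition = ActionCondition(server_id=2, trigger_severity=3)
```

An event is passed as a plain dictionary here, for example `condition.matches({"server_id": 2, "host_id": 1, "trigger_severity": 3})`.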
  33. Hatohol's Action
      ● Provides two kinds of actions
        ○ Command (executed when an event occurs)
          ■ binary, shell script, or LL (lightweight language) script
        ○ Resident (stays running as a process)
          ■ executes a function in the module when an event occurs
          ■ the module can be developed in
            ● C/C++ (shared library)
            ● Python
  34. Motivation of resident
      ● To handle complicated conditions
      ● For example:
        ○ Send an e-mail when trouble is detected
        ○ Disable sending e-mail for 10 min.
          ■ The times of the events should be recorded
        ○ After 10 min., send the number of events
        ○ The next event should be e-mailed
        ○ Send the number of events by e-mail every 60 min.
      This is better handled by a program, but the state must also be saved, which is difficult for a command.
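
The need for saved state can be seen in a small sketch of the e-mail example. This Python class is a simplified illustration (it covers the 10-minute quiet period but omits the 60-minute summary), and `ThrottledMailer`, `on_event`, and `flush` are hypothetical names rather than the resident action API.

```python
QUIET_PERIOD = 10 * 60  # seconds during which further e-mails are suppressed


class ThrottledMailer:
    """Send the first event at once, then count events during a quiet period."""

    def __init__(self, send):
        self.send = send          # callable that delivers one message
        self.quiet_until = None   # end of the current quiet period, if any
        self.suppressed = []      # times of the events that were not e-mailed

    def flush(self, now):
        """End an expired quiet period and report how many events it hid."""
        if self.quiet_until is not None and now >= self.quiet_until:
            if self.suppressed:
                self.send("%d more events were suppressed" % len(self.suppressed))
            self.suppressed = []
            self.quiet_until = None

    def on_event(self, now, brief):
        """Handle one event; state survives between calls."""
        self.flush(now)
        if self.quiet_until is None:
            self.send("Trouble detected: " + brief)
            self.quiet_until = now + QUIET_PERIOD
        else:
            self.suppressed.append(now)
```

A command that is started fresh for every event would have to persist `quiet_until` and `suppressed` somewhere else, which is the difficulty the slide points out.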
  35. Picture of the resident action mechanism [diagram: Hatohol Server connected by a pipe to hatohol-resident-yard, which loads usermodule.so containing "int num_events;" and "event_handler() { … }"]
      - Different processes
        - No damage if the user module crashes
        - Relaunchable
      - The state is saved as variables
      - The user-written handler is called when an event occurs
  36. HA configuration (under study) [diagram: two Zabbix Servers and Nagios (NDOUtils) monitored by two Hatohol Servers in an HA configuration; config. and log are kept in a MySQL Server]
      Note: this is just one idea. Traffic doubles, but it is not so heavy.
  37. Graph (no concrete ideas yet)
      ● Basically similar to ZABBIX's graphs?
        ○ Users can specify the duration
        ○ Shows maximum, minimum, and average
      ● Are there other useful features?
  38. Hatohol is open source software.
      ● Please tell us your requirements
      ● Of course, bug reports and bug fixes are very welcome
      ● Project site (GitHub)
        ○ https://github.com/project-hatohol/hatohol
