OSMC 2013 | The future of Nagios by Andreas Ericsson
1. The future of Nagios
Andreas Ericsson
ageric79@gmail.com
2. Agenda
About me
About op5
Timeline
What happened?
A brief lookback
The future
3. About me
34 years old next week
Programming since I was seven
Work as core architect at op5
Nagios core developer 2009-2013
Performance fanatic
Author of Merlin and Nagios 4
4. About op5
Founded 2003
+900 customers
97% renewal rate
Focus on large installations
http://www.op5.com
5. Timeline
1999: Nagios project started
2007: Development stops, Ethan claims Nagios is “complete”
2009: First fork, new core devs
2012: Nagios 4 ready for release
2013: Nagios 4 released, mailing lists shut down, core devs removed, development stops, “star-hunt” starts
2013: Naemon rises from the ashes
6. About Naemon
Networks, Applications and Event Monitor
Pronounced like “daemon”
Created last week
So far, it's mostly me
98% of the Nagios core dev effort
Focus: Performance, modularity, extendability, maintainability, openness
7. Why fork?
Open source projects should be open
Political BS has no place in technical projects
I disagree with marketing departments dictating roadmaps and release dates
8. Why I got kicked
Official reasons:
One bad commit out of 973
op5 can't use spent dev time in marketing
Real reason (most likely):
Me being known as “the Nagios developer” is bad for Nagios Enterprises
10. Query handler
Input handler + API
query: “<address><SP><query>\0”
A few built-in handlers
echo
help
core
wproc
nerd
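A minimal sketch of how a client might frame and send one query in the format above. It assumes the address is the handler name prefixed with "@" (as in the example queries on the NERD slide), that the trailing terminator is a NUL byte, and that the query handler listens on a Unix stream socket. The socket path and the function names are illustrative assumptions, not taken from the slides.

```python
import socket

# Assumed socket location; the real path depends on the installation.
QH_SOCKET_PATH = "/var/lib/naemon/naemon.qh"


def build_query(handler: str, query: str) -> bytes:
    """Frame a query as '@<handler> <query>' followed by a NUL byte."""
    return f"@{handler} {query}".encode("ascii") + b"\0"


def send_query(handler: str, query: str, path: str = QH_SOCKET_PATH) -> bytes:
    """Send one query over a Unix stream socket and return the first reply chunk."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        sock.sendall(build_query(handler, query))
        return sock.recv(4096)
```

With a running core, `send_query("echo", "hello")` would be the simplest way to check the connection, since echo is one of the built-in handlers listed above.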
11. NERD
Nagios Event Radio Dispatcher
Provides real-time data to addons outside Nagios' core
Can reduce I/O load of current addons
Queried as 'nerd' via query-handler
Example queries:
@nerd subscribe hostchecks
@nerd subscribe servicechecks
demo time :)
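A rough sketch of what an addon subscribing to a NERD channel could look like. The subscribe query is the one shown above. The NUL terminator and the newline-delimited event stream are assumptions, and the function names are made up for illustration. Event lines are passed through unparsed because the slides do not specify their layout.

```python
import socket


def subscribe(sock: socket.socket, channel: str) -> None:
    """Send '@nerd subscribe <channel>' followed by a NUL byte."""
    sock.sendall(f"@nerd subscribe {channel}".encode("ascii") + b"\0")


def iter_events(sock: socket.socket):
    """Yield newline-delimited events as text until the peer closes the socket."""
    buf = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        buf += chunk
        # Hand out every complete line; keep any partial line in the buffer.
        while b"\n" in buf:
            line, buf = buf.split(b"\n", 1)
            if line:
                yield line.decode("utf-8", errors="replace")
```

An addon would connect a Unix stream socket to the query handler, call `subscribe(sock, "servicechecks")`, then loop over `iter_events(sock)` instead of polling status files, which is where the I/O saving comes from.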
12. Naemon Roadmap
External commands via query handler
Dropdir support
Runtime-modifiable main-config
Scheduler-controlled helper daemons
Check result transformer
Object extensions
Dynamic object creation at runtime
Livestatus
13. External commands via query handler
How?
Commands sent via query handler as key/value vector
Why?
Proper error reporting
Easier to maintain, test and document
Support certain vars for all commands
username
trigger_time
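To make the key/value idea concrete, here is a hypothetical encoding. The handler name "command", the "key=value" pairs, and the ";" separator are all assumptions for illustration, since the slides only say that commands travel as a key/value vector. `username` and `trigger_time` are the common variables named above.

```python
def encode_command(name: str, **fields: str) -> bytes:
    """Encode a command as '@command <name> k=v;k=v' plus a NUL byte.

    The wire format is hypothetical; only the key/value idea is from the talk.
    """
    for key, value in fields.items():
        # Reject characters that would make the vector ambiguous to parse.
        if any(ch in key for ch in ";= \0") or any(ch in value for ch in ";\0"):
            raise ValueError(f"unsafe character in field {key!r}")
    pairs = ";".join(f"{key}={value}" for key, value in fields.items())
    return f"@command {name} {pairs}".encode("utf-8") + b"\0"
```

Named fields are what make error reporting feasible: a receiver can reply "missing key host_name" rather than silently misreading a positional argument.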
14. Dropdir support
How?
Simple library
Minor rewrite of config parsing routines
Why?
Addon authors can distribute config easily
No need to modify nagios.cfg
Mass management becomes a lot easier
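A small sketch of dropdir-style loading, under stated assumptions: the main config uses "key=value" lines, fragments end in ".cfg", and fragments are applied in sorted filename order so later files override earlier ones. None of these details are from the slides, and a real parser would also need to handle keys that may legitimately repeat.

```python
from pathlib import Path


def parse_cfg(text: str) -> dict:
    """Parse 'key=value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def load_config(main_cfg: Path, dropdir: Path) -> dict:
    """Load the main config, then overlay every *.cfg fragment in dropdir."""
    settings = parse_cfg(main_cfg.read_text())
    if dropdir.is_dir():
        # Sorted order makes the override sequence predictable (10-a, 20-b, ...).
        for fragment in sorted(dropdir.glob("*.cfg")):
            settings.update(parse_cfg(fragment.read_text()))
    return settings
```

An addon package then only has to install one file into the drop directory, and the main config file is never edited.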
15. Runtime-modifiable main-config
How?
Minor rewrite of config parsing routines
Config queries can be sent to query handler
Not all variables can be modified
Why?
It comes “for free” with dropdir support
More generic than adding external commands
16. Scheduler-controlled helpers
How?
Helpers are launched at start
Helpers report to QH
Why?
Single-system start/stop
Assure no events are missed by helpers
Refuse to start if crucial helpers fail
Built-in keepalive
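The bookkeeping behind this could look roughly like the sketch below. It only models the decisions (which helpers have reported, whether startup may proceed), not process launching. The class names, the timeout parameter, and the helper names in the usage note are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Helper:
    name: str
    crucial: bool = False
    last_seen: Optional[float] = None  # time of last report, None if never


class HelperRegistry:
    def __init__(self, keepalive_timeout: float) -> None:
        self.keepalive_timeout = keepalive_timeout
        self.helpers: Dict[str, Helper] = {}

    def register(self, name: str, crucial: bool = False) -> None:
        self.helpers[name] = Helper(name, crucial)

    def report(self, name: str, now: float) -> None:
        """Record that a helper reported in (for example via the query handler)."""
        self.helpers[name].last_seen = now

    def missing(self, now: float) -> List[str]:
        """Helpers that never reported or whose last report is too old."""
        return [
            h.name
            for h in self.helpers.values()
            if h.last_seen is None or now - h.last_seen > self.keepalive_timeout
        ]

    def may_start(self, now: float) -> bool:
        """Refuse to start while any crucial helper is missing."""
        absent = set(self.missing(now))
        return not any(h.crucial and h.name in absent for h in self.helpers.values())
```

For example, with a crucial "notifier" and an optional "grapher" registered, `may_start` stays false until "notifier" has reported, while a silent "grapher" only shows up in `missing`.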
17. Check result transformer
How?
External helper connects to NERD
Events zip to the helper
Helper can alter state/perfdata/whatever
Helper zaps result back to core via QH
Why?
Bischeck, Anders Haal
Adaptive thresholds
Self-learning monitoring
Event correlation
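As an illustration of the "adaptive thresholds" use case, here is a sketch of the transformation step alone. The helper keeps a rolling window of recent values and raises the state to WARNING when a new value deviates strongly from that history. The dict-shaped result, the field names, and the three-sigma rule are assumptions; this is not how Bischeck or any existing helper is implemented.

```python
from collections import deque
from statistics import mean, pstdev

OK, WARNING = 0, 1  # conventional plugin state codes


class AdaptiveThreshold:
    def __init__(self, window: int = 20, min_samples: int = 5, tolerance: float = 3.0):
        self.history = deque(maxlen=window)  # most recent values only
        self.min_samples = min_samples
        self.tolerance = tolerance

    def transform(self, result: dict) -> dict:
        """Return a copy of the result, escalated if the value is an outlier."""
        value = result["value"]
        out = dict(result)
        if len(self.history) >= self.min_samples:
            mu = mean(self.history)
            sigma = max(pstdev(self.history), 1e-9)  # avoid a zero-width band
            if abs(value - mu) > self.tolerance * sigma:
                # Only escalate; never downgrade a worse state from the check.
                out["state"] = max(out["state"], WARNING)
        self.history.append(value)
        return out
```

In the flow described above, the helper would call `transform` on each event received from NERD and send the altered result back to the core through the query handler.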
18. Object extensions
How?
Loaded from special config files
Modules can request extensions to be read
Extension objects sent to loaded modules
Why?
Avoids linear lookup time of custom variables
Provides addon-manageable object-config (such as NagVis map-coordinates)
19. Dynamic object creation
How?
Creatable via query handler requests
Clone existing host/service or make new
Existing objects can be disabled
Housekeeping event every X minutes/seconds
Why?
New stuff “calls in” and monitoring starts
Monitoring shouldn't stop
20. Livestatus
How?
Build trickery
Why?
Other ways of getting status data suck
Common ground benefits entire community
Thanks!
Mathias Kettner
21. Questions?
Features are on the instalment plan, but all should be completed by next year
http://www.github.com/naemon
http://git.op5.org
mailto:ageric79@gmail.com
ae@op5.com