SlideShare a Scribd company logo
1 of 16
Download to read offline
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
An Introduction to Monitoring with Nagios
Laurent Andrey R´emi Badonnel
LORIA - INRIA Grand Est
ISSNSM’2008, Zurich
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 1/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Introduction
Nagios at a glance
Key concepts
Functional architecture
Services and service states
Nagios configuration
Object definitions
Other elements
Example scenario
Checks and their execution
Local checks
Remote checks
Advanced configurations
Conclusions
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 2/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
References
Motivations
References
www.nagios.org: official distribution (core, plugins and
documentation)
www.nagiosexchange.org: lots of complementary plugins
W. Barth.
Nagios, System and Network Monitoring.
Open Source Press GmbH, first edition, 2006.
ISBN: 1-59327-070-4.
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 3/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
References
Motivations
What is Nagios? What is Nagios useful for?
A widely-used monitoring tool for trouble-shooting
Simple and open source
Network, system, service levels
With a sophisticated (?) notification system to inform
administrators when something goes wrong
Nagios provides support to administrator(s) for detecting
problems before users (including the boss!)
Mail server failure
Hard drive overload
Network outage
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 4/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Key concepts
Functional architecture
Services and service states
Key concepts
Colored area concept
Green/Yellow/Red (Ok/Warning/Critical)
No performance analysis or display (a priori)
Checks using external commands (plugins)
Various possibilities for remote checks
Possibility for passive checks (from managed resources)
Web interface + notifications
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 5/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Key concepts
Functional architecture
Services and service states
Functional architecture
Notification System
Filters, Contact Delivery
Data Reaping
Resource State Calculation
Web Interfaceconfig
log
NAGIOS
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 6/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Key concepts
Functional architecture
Services and service states
Architecture at run-time
Data reaping + notification system= nagios processes
Can be run as a service (rcX.d, soft runlevel)
Web interface
External web server (Apache)
Bunch of cgi scripts (part of Nagios)
Configuration
Simple text files
Or a postgres database
Logs (local files)
Named pipe (unix domain socket) to enable nagios to receive
commands (from cgi, passive asynchronous events)
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 7/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Key concepts
Functional architecture
Services and service states
Service, service check
Service
Service delivered by a software
Percentage of free space on a partition
Bandwith usage on a network interface ...
Service check
Provides state information on a service
Returns a value: OK, WARNING, CRITICAL (exit status 0, 1,
2), UNKNOWN (exit status 3, due to time out or plugin
runtime trouble) to reflect the Nagios view about this service
Can be local (OS calls) or remote (ICMP, NRPE, SNMP ...)
Is implemented by a plugin (external command/script)
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 8/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Key concepts
Functional architecture
Services and service states
Service states
Service states are the mirror of what nagios observes
States: OK, WARNING, CRITICAL, UNKNOWN
Transitions from one state to another one based on results
provided by checks
Critical and warning states are shadowed by related soft states
A service goes first to a soft state
Attempt count mecanism to reach a definitive hard state
User notification can only occur when hard states are reached
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 9/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Key concepts
Functional architecture
Services and service states
Service state diagram
Soft Warning
warning
ac ==mca
↑ problem
GG
warning
ac + +
ac < mca
××
ok ; ac ← 1
mmmmmmm
vvmmmmmmmmm
critical
ac + +
ac <mca

critical
ac ==mca
↑ problem
AA
Hard Warning
critical

warning

ok ; ac ← 1 ; ↑ recovery

OK
warning
ac + +
ac  mca GG
ok
ac ← 1 RR
critical
ac + +
ac mca
GG Soft Critical critical
ac + +
ac ==mca
↑ problem
GG
critical
ac + +
ac  mca
ˆˆ
ok ; ac ← 1RRRRRRR
hhRRRRRRRR
warning
ac + +
ac mca
‚‚
warning
ac ==mca
↑ problem
XXuuuuuuuuuuuuuuuuuuuuuuuuuuu
Hard Critical
warning
vv
critical
‚‚
ok ; ac ← 1 ; ↑ recovery
yy
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 10/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Key concepts
Functional architecture
Services and service states
Service state diagram legend
HS hard state, using normal check interval between 2
checks
S soft state, using retry check interval between 2 checks
RS GG transition triggered by a check with a return status of
RS∈ { ok, warning, critical }
ac++GG Attempt Count is incremented when transition is
triggered
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 11/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Key concepts
Functional architecture
Services and service states
Service state diagram legend (ctd)
acmcaGG transition is triggered if Attempt Count is smaller
than the service configuration attribute:
max check attempts
ac==mcaGG transition is triggered if Attempt Count is equal to
the service configuration attribute: max check attempts
ac←1 GG Attempt Count is set to 1 then the transition is
triggered
↑notification
GG a user notification ∈ { problem, recovery } is
generated when this transition is triggered
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 12/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Object definitions
Other elements
Example scenario
Nagios configuration
Object-oriented representation
A nagios object describes a specific unit: a service, an host, a
contact, a contactgroup ... with attributes and values
kind of inheritance mechanisms, dependencies amongst objets
Set of configuration files
Main file: nagios.cfg (ref. to other cfg. files)
1 file per object type: services.cfg, hosts.cfg, contacts.cfg,
checkcommands.cfg, misccommands.cfg, timeperiods.cfg ...
Requires an a priori knowledge
Configuration can also stand into a database
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 13/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Object definitions
Other elements
Example scenario
Host
A service have to be linked to an host
Only UP and DOWN states
User notification (problem, recovery)
Same external checks than services
UP = (WARNING or OK), DOWN = CRITICAL
Typically: ICMP-based checks
No active checks if related services are OK
Host group (cosmetic ...)
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 14/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Object definitions
Other elements
Example scenario
Other Nagios elements
contact, contactgroup: to specify who to notify, how to notify
and when to notify
Used in host and service definitions
command: to execute check plugins or to send user
notifications
Links Nagios attributes found in definitions (ex: host name) to
plugin parameters
Already provided for most common plugins (see
checkcommands.cfg file)
Basic wrapping to send emails (misccommands.cfg file)
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 15/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Object definitions
Other elements
Example scenario
Example scenario
Internet
webloria
152.81.144.22
http://www.loria.fr
dnsext
152.81.144.14
cat-481
dns1, preny
152.81.1.25
dns2, hugo
152.81.1.128
nagios
152.81.9.211
152.81.0.0/20
152.81.144.0/20
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 16/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Object definitions
Other elements
Example scenario
Host definition (example)
hosts.cfg
d e f i n e host {
host name w e b l o r i a
a l i a s w e b l or i a l i n u x machine
address 152.81.144.22
check command check−host−a l i v e
max check attempts 3
check period 24x7
n o t i f i c a t i o n i n t e r v a l 180 # 3 hours
n o t i f i c a t i o n p e r i o d 24x7
n o t i f i c a t i o n o p t i o n s d , r , f , u
# down , recovery , flapping , unreachable
contact groups a d m i n i s t r a t o r s
}
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 17/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Object definitions
Other elements
Example scenario
Service definition (example 1)
services.cfg
d e f i n e s e r v i c e {
host name w e b l o r i a
s e r v i c e d e s c r i p t i o n http s e r v i c e
check command check http
max check attempts 3
n o r m a l c h e c k i n t e r v a l 5
r e t r y c h e c k i n t e r v a l 1
check period 24x7
n o t i f i c a t i o n i n t e r v a l 180
n o t i f i c a t i o n p e r i o d 24x7
n o t i f i c a t i o n o p t i o n s w, c , r , f , u
# warning , c r i t i c a l , recovery , flapping , unreachable
contact groups a d m i n i s t r a t o r s
}
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 18/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Object definitions
Other elements
Example scenario
Command definitions (example 1)
checkcommands.cfg
d e f i n e command{
command name check−host−a l i v e
command line $USER1$/ check icmp −H $HOSTADDRESS$
}
d e f i n e command{
command name check http
command line $USER1$/ check http −H $HOSTADDRESS$
}
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 19/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Object definitions
Other elements
Example scenario
Service definition (example 2)
services.cfg
d e f i n e s e r v i c e {
host name dnsext
s e r v i c e d e s c r i p t i o n dns s e r v i c e
check command check name for given dns
!www. l o r i a . f r ! 152.81.144.22
max check attempts 3
n o r m a l c h e c k i n t e r v a l 5
r e t r y c h e c k i n t e r v a l 1
check period 24x7
n o t i f i c a t i o n i n t e r v a l 180
n o t i f i c a t i o n p e r i o d 24x7
n o t i f i c a t i o n o p t i o n s w, c , r , f , u
contact groups a d m i n i s t r a t o r s
}
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 20/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Object definitions
Other elements
Example scenario
Command definition (example 2)
checkcommands.cfg
d e f i n e command{
command name check name for given dns
command line $USER1$/ check dns −H $ARG1$ −a $ARG2$
−s $HOSTADDRESS$
# $ARG1$ : f u l l y q u a l i f i e d name
# $ARG2$ : IP address
}
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 21/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Object definitions
Other elements
Example scenario
Contact definition (example)
contacts.cfg
d e f i n e contact {
contact name andrey
a l i a s l a u r e n t andrey
s e r v i c e n o t i f i c a t i o n p e r i o d 24x7
h o s t n o t i f i c a t i o n p e r i o d 24x7
s e r v i c e n o t i f i c a t i o n o p t i o n s w, u , c , r
h o s t n o t i f i c a t i o n o p t i o n s d , r
s e r v i c e n o t i f i c a t i o n c o m m a n d s n o t i f y −by−email
host notification commands host−n o t i f y −by−email
email a n d r e y @ l o r i a . f r
}
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 22/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Object definitions
Other elements
Example scenario
Contactgroup definition (example)
contacts.cfg
d e f i n e contactgroup {
contactgroup name a d m i n i s t r a t o r s
a l i a s group of a d m i n i s t r a t o r s
members andrey
}
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 23/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Local checks
Remote checks
Local checks
Getting information about your local system
Plugins based on system commands (such as ps, df, uptime)
check_disk -w 30% -c 15% -p /var
check_load -w 2.0,1.0,0.5 -c 4.0,2.0,1.0
check_procs -w 150 -c 250 --metric=PROCS
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 24/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Local checks
Remote checks
Direct network checks
Checking network services of remote hosts
Plugins based on network protocols
Directly executed from the Nagios machine
check_icmp -H 1.14.1.2 -w 100.0,20% -c 200.0,40%
check_tcp -H boston.loria.fr -p 7000
check_ftp -H ftp.inria.fr -p 21 -e 220
check_http -H http://www.esial.uhp-nancy.fr
check_smtp, check_imap, check_pop
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 25/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Local checks
Remote checks
Checks using SSH (Secure Shell)
check_by_ssh (plugin)
sshd
(daemon)
check_xyz
(plugin)
NAGIOS CLIENT
priv.key pub.key
NETWORK
Nagios executes the check by ssh plugin
To run a plugin deployed on the remote machine
Based on asymetric keys to log without typing a password
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 26/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Local checks
Remote checks
Checks using NRPE (Nagios Remote Plugin Executor)
check_nrpe (plugin)
ref
nrpe (daemon)
check_xyz
(plugin)
NAGIOS CLIENT
tcp connection
NETWORK
ref1 check_xyz arg1 arg2
ref2 check_abc arg3 arg4
... ...
Nagios executes the check nrpe plugin
To interact with a dedicated daemon called nrpe
Based on pre-configured plugin invocations
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 27/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Local checks
Remote checks
Other remote checks
Checks using SNMP
Collecting management information from SNMP agents
check_snmp plugin ⇔ net-snmp snmpget
SNMP reply + warning and critical limits ⇒ service state
Checks using NSCA (Nagios Service Check Acceptor)
Passive method where checks are initiated by the resources
themselves (close to SNMP traps)
A NSCA daemon waits for incoming check results (on the
Nagios machine), while the send_nsca program (on the
remote machines) sends messages containing check results
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 28/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Making configuration more simple
Specifying Dependencies
Making configuration more simple
Monitoring the same service on several hosts
setting the host name attribute of the service as a comma
separated list of host names
or setting an hostgroup name attribute for the service
Defining template-based objects
Notion of inheritance
Factorizing many low-interest attributes
register attribute to define a template
use attribute to inherite from a template
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 29/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Making configuration more simple
Specifying Dependencies
Service and hostgroup (example)
d e f i n e hostgroup{
hostgroup name d n s h o s t s
a l i a s h o s t s s u p p o r t i n g DNS
members dns1 , dns2 , dnsext
}
d e f i n e s e r v i c e {
hostgroup name d n s h o s t s
s e r v i c e d e s c r i p t i o n dns s e r v i c e
check command c h e c k n a m e f o r g i v e n d n s
!www. l o r i a . f r ! 1 5 2 . 8 1 . 1 4 4 . 2 2
max check attempts 3
n o r m a l c h e c k i n t e r v a l 5
r e t r y c h e c k i n t e r v a l 1
c h e c k p e r i o d 24 x7
n o t i f i c a t i o n i n t e r v a l 180
n o t i f i c a t i o n p e r i o d 24 x7
n o t i f i c a t i o n o p t i o n s w, c , r , f , u
c o n t a c t g r o u p s a d m i n i s t r a t o r s
}
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 30/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Making configuration more simple
Specifying Dependencies
Specifying dependencies
Remainder: service critical state ⇒ host check
How does Nagios make a difference between:
case 1: webloria is really down
case 2: webloria is unreachable due to the network?
Solution: defining dependency relationships amongst objects
(parents attribute)
If an host is detected as down, the parent is checked
If the parent is OK, the initial host is really declared down
If not, the host is declared unreachable
Ex: cat-481 router defined as a parent of webloria
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 31/32
Introduction
Nagios at a glance
Nagios configuration
Checks and their execution
Advanced configurations
Making configuration more simple
Specifying Dependencies
Conclusions
Open source monitoring solution
Based on simple concepts: checks, states, notifications
Easily extensible and integrable
No discovery mechanism
Experience it during lab exercises!
Monitoring hosts and their services
Developing and testing our own plugin
Experimenting state transitions and notifications
Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 32/32

More Related Content

Similar to Nagios

OSMC 2012 | Shinken by Jean Gabès
OSMC 2012 | Shinken by Jean GabèsOSMC 2012 | Shinken by Jean Gabès
OSMC 2012 | Shinken by Jean GabèsNETWAYS
 
Nagios Conference 2012 - Ethan Galstad - Keynote
Nagios Conference 2012 - Ethan Galstad - KeynoteNagios Conference 2012 - Ethan Galstad - Keynote
Nagios Conference 2012 - Ethan Galstad - KeynoteNagios
 
Monitoring federation open stack infrastructure
Monitoring federation open stack infrastructureMonitoring federation open stack infrastructure
Monitoring federation open stack infrastructureFernando Lopez Aguilar
 
ClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and howClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and howAltinity Ltd
 
Troubleshooting tips from docker support engineers
Troubleshooting tips from docker support engineersTroubleshooting tips from docker support engineers
Troubleshooting tips from docker support engineersDocker, Inc.
 
Introduction to kubernetes
Introduction to kubernetesIntroduction to kubernetes
Introduction to kubernetesRishabh Indoria
 
JBoss Drools and Drools Fusion (CEP): Making Business Rules react to RTE
JBoss Drools and Drools Fusion (CEP): Making Business Rules react to RTEJBoss Drools and Drools Fusion (CEP): Making Business Rules react to RTE
JBoss Drools and Drools Fusion (CEP): Making Business Rules react to RTEtsurdilovic
 
手把手教你如何串接 Log 到各種網路服務
手把手教你如何串接 Log 到各種網路服務手把手教你如何串接 Log 到各種網路服務
手把手教你如何串接 Log 到各種網路服務Mu Chun Wang
 
2017 Microservices Practitioner Virtual Summit: Microservices at Squarespace ...
2017 Microservices Practitioner Virtual Summit: Microservices at Squarespace ...2017 Microservices Practitioner Virtual Summit: Microservices at Squarespace ...
2017 Microservices Practitioner Virtual Summit: Microservices at Squarespace ...Ambassador Labs
 
Free GitOps Workshop (with Intro to Kubernetes & GitOps)
Free GitOps Workshop (with Intro to Kubernetes & GitOps)Free GitOps Workshop (with Intro to Kubernetes & GitOps)
Free GitOps Workshop (with Intro to Kubernetes & GitOps)Weaveworks
 
Zou Layered VO PDCAT2008 V0.5 Concise
Zou Layered VO PDCAT2008 V0.5 ConciseZou Layered VO PDCAT2008 V0.5 Concise
Zou Layered VO PDCAT2008 V0.5 Conciseyongqiangzou
 
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)Brian Brazil
 
Autonomic Application Delivery with Tonomi
Autonomic Application Delivery with TonomiAutonomic Application Delivery with Tonomi
Autonomic Application Delivery with TonomiVictoria Livschitz
 
My past-3 yeas-developer-journey-at-linkedin-by-iantsai
My past-3 yeas-developer-journey-at-linkedin-by-iantsaiMy past-3 yeas-developer-journey-at-linkedin-by-iantsai
My past-3 yeas-developer-journey-at-linkedin-by-iantsaiKim Kao
 
Behind the Code 'September 2022 // by Exness
Behind the Code 'September 2022 // by ExnessBehind the Code 'September 2022 // by Exness
Behind the Code 'September 2022 // by ExnessMaxim Gaponov
 
Monitoring as an entry point for collaboration
Monitoring as an entry point for collaborationMonitoring as an entry point for collaboration
Monitoring as an entry point for collaborationJulien Pivotto
 

Similar to Nagios (20)

Nagios
NagiosNagios
Nagios
 
OSMC 2012 | Shinken by Jean Gabès
OSMC 2012 | Shinken by Jean GabèsOSMC 2012 | Shinken by Jean Gabès
OSMC 2012 | Shinken by Jean Gabès
 
Nagios Conference 2012 - Ethan Galstad - Keynote
Nagios Conference 2012 - Ethan Galstad - KeynoteNagios Conference 2012 - Ethan Galstad - Keynote
Nagios Conference 2012 - Ethan Galstad - Keynote
 
Monitoring federation open stack infrastructure
Monitoring federation open stack infrastructureMonitoring federation open stack infrastructure
Monitoring federation open stack infrastructure
 
ClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and howClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and how
 
Troubleshooting tips from docker support engineers
Troubleshooting tips from docker support engineersTroubleshooting tips from docker support engineers
Troubleshooting tips from docker support engineers
 
Introduction to kubernetes
Introduction to kubernetesIntroduction to kubernetes
Introduction to kubernetes
 
JBoss Drools and Drools Fusion (CEP): Making Business Rules react to RTE
JBoss Drools and Drools Fusion (CEP): Making Business Rules react to RTEJBoss Drools and Drools Fusion (CEP): Making Business Rules react to RTE
JBoss Drools and Drools Fusion (CEP): Making Business Rules react to RTE
 
手把手教你如何串接 Log 到各種網路服務
手把手教你如何串接 Log 到各種網路服務手把手教你如何串接 Log 到各種網路服務
手把手教你如何串接 Log 到各種網路服務
 
2017 Microservices Practitioner Virtual Summit: Microservices at Squarespace ...
2017 Microservices Practitioner Virtual Summit: Microservices at Squarespace ...2017 Microservices Practitioner Virtual Summit: Microservices at Squarespace ...
2017 Microservices Practitioner Virtual Summit: Microservices at Squarespace ...
 
Free GitOps Workshop (with Intro to Kubernetes & GitOps)
Free GitOps Workshop (with Intro to Kubernetes & GitOps)Free GitOps Workshop (with Intro to Kubernetes & GitOps)
Free GitOps Workshop (with Intro to Kubernetes & GitOps)
 
Rudder 3.0 and beyond
Rudder 3.0 and beyondRudder 3.0 and beyond
Rudder 3.0 and beyond
 
Zou Layered VO PDCAT2008 V0.5 Concise
Zou Layered VO PDCAT2008 V0.5 ConciseZou Layered VO PDCAT2008 V0.5 Concise
Zou Layered VO PDCAT2008 V0.5 Concise
 
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
Monitoring Kubernetes with Prometheus (Kubernetes Ireland, 2016)
 
Autonomic Application Delivery with Tonomi
Autonomic Application Delivery with TonomiAutonomic Application Delivery with Tonomi
Autonomic Application Delivery with Tonomi
 
My past-3 yeas-developer-journey-at-linkedin-by-iantsai
My past-3 yeas-developer-journey-at-linkedin-by-iantsaiMy past-3 yeas-developer-journey-at-linkedin-by-iantsai
My past-3 yeas-developer-journey-at-linkedin-by-iantsai
 
Neighborly nagios
Neighborly nagiosNeighborly nagios
Neighborly nagios
 
Behind the Code 'September 2022 // by Exness
Behind the Code 'September 2022 // by ExnessBehind the Code 'September 2022 // by Exness
Behind the Code 'September 2022 // by Exness
 
Binding android piece by piece
Binding android piece by pieceBinding android piece by piece
Binding android piece by piece
 
Monitoring as an entry point for collaboration
Monitoring as an entry point for collaborationMonitoring as an entry point for collaboration
Monitoring as an entry point for collaboration
 

More from ♛Kumar Aneesh♛ (19)

Introduction about Kubernates Cluster
Introduction about Kubernates ClusterIntroduction about Kubernates Cluster
Introduction about Kubernates Cluster
 
Live issues resolution on Kubernates Cluster
Live issues resolution on Kubernates ClusterLive issues resolution on Kubernates Cluster
Live issues resolution on Kubernates Cluster
 
My onboarding
My onboardingMy onboarding
My onboarding
 
Monitor it accesspending
Monitor it accesspendingMonitor it accesspending
Monitor it accesspending
 
Monitor it access
Monitor it accessMonitor it access
Monitor it access
 
Missing on both stacks
Missing on both stacksMissing on both stacks
Missing on both stacks
 
load errorcmd
 load errorcmd load errorcmd
load errorcmd
 
Kibana
KibanaKibana
Kibana
 
tp smarts_onboarding
 tp smarts_onboarding tp smarts_onboarding
tp smarts_onboarding
 
derby onboarding (1)
derby onboarding (1)derby onboarding (1)
derby onboarding (1)
 
Issues chmk (1)
Issues chmk (1)Issues chmk (1)
Issues chmk (1)
 
Invoke
InvokeInvoke
Invoke
 
Inventory after onboarding
Inventory after onboardingInventory after onboarding
Inventory after onboarding
 
Implementation c100229805
Implementation c100229805Implementation c100229805
Implementation c100229805
 
Icmp s outh
Icmp s outhIcmp s outh
Icmp s outh
 
Hws&amp;reigate south fitzoy
Hws&amp;reigate south fitzoyHws&amp;reigate south fitzoy
Hws&amp;reigate south fitzoy
 
Hempoint
HempointHempoint
Hempoint
 
Dmctl cmd
Dmctl cmdDmctl cmd
Dmctl cmd
 
Asg dashboard usage_guide_v1
Asg dashboard usage_guide_v1Asg dashboard usage_guide_v1
Asg dashboard usage_guide_v1
 

Recently uploaded

08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 

Recently uploaded (20)

08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Nagios

  • 1. Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations An Introduction to Monitoring with Nagios Laurent Andrey R´emi Badonnel LORIA - INRIA Grand Est ISSNSM’2008, Zurich Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 1/32 Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Introduction Nagios at a glance Key concepts Functional architecture Services and service states Nagios configuration Object definitions Other elements Example scenario Checks and their execution Local checks Remote checks Advanced configurations Conclusions Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 2/32
  • 2. Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations References Motivations References www.nagios.org: official distribution (core, plugins and documentation) www.nagiosexchange.org: lots of complementary plugins W. Barth. Nagios, System and Network Monitoring. Open Source Press GmbH, first edition, 2006. ISBN: 1-59327-070-4. Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 3/32 Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations References Motivations What is Nagios? What is Nagios useful for? A widely-used monitoring tool for trouble-shooting Simple and open source Network, system, service levels With a sophisticated (?) notification system to inform administrators when something goes wrong Nagios provides support to administrator(s) for detecting problems before users (including the boss!) Mail server failure Hard drive overload Network outage Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 4/32
  • 3. Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Key concepts Functional architecture Services and service states Key concepts Colored area concept Green/Yellow/Red (Ok/Warning/Critical) No performance analysis or display (a priori) Checks using external commands (plugins) Various possibilities for remote checks Possibility for passive checks (from managed resources) Web interface + notifications Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 5/32 Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Key concepts Functional architecture Services and service states Functional architecture Notification System Filters, Contact Delivery Data Reaping Resource State Calculation Web Interfaceconfig log NAGIOS Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 6/32
  • 4. Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Key concepts Functional architecture Services and service states Architecture at run-time Data reaping + notification system= nagios processes Can be run as a service (rcX.d, soft runlevel) Web interface External web server (Apache) Bunch of cgi scripts (part of Nagios) Configuration Simple text files Or a postgres database Logs (local files) Named pipe (unix domain socket) to enable nagios to receive commands (from cgi, passive asynchronous events) Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 7/32 Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Key concepts Functional architecture Services and service states Service, service check Service Service delivered by a software Percentage of free space on a partition Bandwith usage on a network interface ... Service check Provides state information on a service Returns a value: OK, WARNING, CRITICAL (exit status 0, 1, 2), UNKNOWN (exit status 3, due to time out or plugin runtime trouble) to reflect the Nagios view about this service Can be local (OS calls) or remote (ICMP, NRPE, SNMP ...) Is implemented by a plugin (external command/script) Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 8/32
  • 5. Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Key concepts Functional architecture Services and service states Service states Service states are the mirror of what nagios observes States: OK, WARNING, CRITICAL, UNKNOWN Transitions from one state to another one based on results provided by checks Critical and warning states are shadowed by related soft states A service goes first to a soft state Attempt count mecanism to reach a definitive hard state User notification can only occur when hard states are reached Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 9/32 Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Key concepts Functional architecture Services and service states Service state diagram Soft Warning warning ac ==mca ↑ problem GG warning ac + + ac < mca ×× ok ; ac ← 1 mmmmmmm vvmmmmmmmmm critical ac + + ac <mca critical ac ==mca ↑ problem AA Hard Warning critical warning ok ; ac ← 1 ; ↑ recovery OK warning ac + + ac mca GG ok ac ← 1 RR critical ac + + ac mca GG Soft Critical critical ac + + ac ==mca ↑ problem GG critical ac + + ac mca ˆˆ ok ; ac ← 1RRRRRRR hhRRRRRRRR warning ac + + ac mca ‚‚ warning ac ==mca ↑ problem XXuuuuuuuuuuuuuuuuuuuuuuuuuuu Hard Critical warning vv critical ‚‚ ok ; ac ← 1 ; ↑ recovery yy Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 10/32
  • 6. Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Key concepts Functional architecture Services and service states Service state diagram legend HS hard state, using normal check interval between 2 checks S soft state, using retry check interval between 2 checks RS GG transition triggered by a check with a return status of RS∈ { ok, warning, critical } ac++GG Attempt Count is incremented when transition is triggered Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 11/32 Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Key concepts Functional architecture Services and service states Service state diagram legend (ctd) acmcaGG transition is triggered if Attempt Count is smaller than the service configuration attribute: max check attempts ac==mcaGG transition is triggered if Attempt Count is equal to the service configuration attribute: max check attempts ac←1 GG Attempt Count is set to 1 then the transition is triggered ↑notification GG a user notification ∈ { problem, recovery } is generated when this transition is triggered Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 12/32
  • 7. Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Object definitions Other elements Example scenario Nagios configuration Object-oriented representation A nagios object describes a specific unit: a service, an host, a contact, a contactgroup ... with attributes and values kind of inheritance mechanisms, dependencies amongst objets Set of configuration files Main file: nagios.cfg (ref. to other cfg. files) 1 file per object type: services.cfg, hosts.cfg, contacts.cfg, checkcommands.cfg, misccommands.cfg, timeperiods.cfg ... Requires an a priori knowledge Configuration can also stand into a database Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 13/32 Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Object definitions Other elements Example scenario Host A service have to be linked to an host Only UP and DOWN states User notification (problem, recovery) Same external checks than services UP = (WARNING or OK), DOWN = CRITICAL Typically: ICMP-based checks No active checks if related services are OK Host group (cosmetic ...) Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 14/32
  • 8. Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Object definitions Other elements Example scenario Other Nagios elements contact, contactgroup: to specify who to notify, how to notify and when to notify Used in host and service definitions command: to execute check plugins or to send user notifications Links Nagios attributes found in definitions (ex: host name) to plugin parameters Already provided for most common plugins (see checkcommands.cfg file) Basic wrapping to send emails (misccommands.cfg file) Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 15/32 Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Object definitions Other elements Example scenario Example scenario Internet webloria 152.81.144.22 http://www.loria.fr dnsext 152.81.144.14 cat-481 dns1, preny 152.81.1.25 dns2, hugo 152.81.1.128 nagios 152.81.9.211 152.81.0.0/20 152.81.144.0/20 Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 16/32
  • 9. Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Object definitions Other elements Example scenario Host definition (example) hosts.cfg d e f i n e host { host name w e b l o r i a a l i a s w e b l or i a l i n u x machine address 152.81.144.22 check command check−host−a l i v e max check attempts 3 check period 24x7 n o t i f i c a t i o n i n t e r v a l 180 # 3 hours n o t i f i c a t i o n p e r i o d 24x7 n o t i f i c a t i o n o p t i o n s d , r , f , u # down , recovery , flapping , unreachable contact groups a d m i n i s t r a t o r s } Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 17/32 Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Object definitions Other elements Example scenario Service definition (example 1) services.cfg d e f i n e s e r v i c e { host name w e b l o r i a s e r v i c e d e s c r i p t i o n http s e r v i c e check command check http max check attempts 3 n o r m a l c h e c k i n t e r v a l 5 r e t r y c h e c k i n t e r v a l 1 check period 24x7 n o t i f i c a t i o n i n t e r v a l 180 n o t i f i c a t i o n p e r i o d 24x7 n o t i f i c a t i o n o p t i o n s w, c , r , f , u # warning , c r i t i c a l , recovery , flapping , unreachable contact groups a d m i n i s t r a t o r s } Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 18/32
  • 10. Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Object definitions Other elements Example scenario Command definitions (example 1) checkcommands.cfg d e f i n e command{ command name check−host−a l i v e command line $USER1$/ check icmp −H $HOSTADDRESS$ } d e f i n e command{ command name check http command line $USER1$/ check http −H $HOSTADDRESS$ } Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 19/32 Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Object definitions Other elements Example scenario Service definition (example 2) services.cfg d e f i n e s e r v i c e { host name dnsext s e r v i c e d e s c r i p t i o n dns s e r v i c e check command check name for given dns !www. l o r i a . f r ! 152.81.144.22 max check attempts 3 n o r m a l c h e c k i n t e r v a l 5 r e t r y c h e c k i n t e r v a l 1 check period 24x7 n o t i f i c a t i o n i n t e r v a l 180 n o t i f i c a t i o n p e r i o d 24x7 n o t i f i c a t i o n o p t i o n s w, c , r , f , u contact groups a d m i n i s t r a t o r s } Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 20/32
  • 11. Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Object definitions Other elements Example scenario Command definition (example 2) checkcommands.cfg d e f i n e command{ command name check name for given dns command line $USER1$/ check dns −H $ARG1$ −a $ARG2$ −s $HOSTADDRESS$ # $ARG1$ : f u l l y q u a l i f i e d name # $ARG2$ : IP address } Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 21/32 Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Object definitions Other elements Example scenario Contact definition (example) contacts.cfg d e f i n e contact { contact name andrey a l i a s l a u r e n t andrey s e r v i c e n o t i f i c a t i o n p e r i o d 24x7 h o s t n o t i f i c a t i o n p e r i o d 24x7 s e r v i c e n o t i f i c a t i o n o p t i o n s w, u , c , r h o s t n o t i f i c a t i o n o p t i o n s d , r s e r v i c e n o t i f i c a t i o n c o m m a n d s n o t i f y −by−email host notification commands host−n o t i f y −by−email email a n d r e y @ l o r i a . f r } Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 22/32
  • 12. Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Object definitions Other elements Example scenario Contactgroup definition (example) contacts.cfg d e f i n e contactgroup { contactgroup name a d m i n i s t r a t o r s a l i a s group of a d m i n i s t r a t o r s members andrey } Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 23/32 Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Local checks Remote checks Local checks Getting information about your local system Plugins based on system commands (such as ps, df, uptime) check_disk -w 30% -c 15% -p /var check_load -w 2.0,1.0,0.5 -c 4.0,2.0,1.0 check_procs -w 150 -c 250 --metric=PROCS Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 24/32
  • 13. Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Local checks Remote checks Direct network checks Checking network services of remote hosts Plugins based on network protocols Directly executed from the Nagios machine check_icmp -H 1.14.1.2 -w 100.0,20% -c 200.0,40% check_tcp -H boston.loria.fr -p 7000 check_ftp -H ftp.inria.fr -p 21 -e 220 check_http -H http://www.esial.uhp-nancy.fr check_smtp, check_imap, check_pop Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 25/32 Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Local checks Remote checks Checks using SSH (Secure Shell) check_by_ssh (plugin) sshd (daemon) check_xyz (plugin) NAGIOS CLIENT priv.key pub.key NETWORK Nagios executes the check by ssh plugin To run a plugin deployed on the remote machine Based on asymetric keys to log without typing a password Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 26/32
  • 14. Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Local checks Remote checks Checks using NRPE (Nagios Remote Plugin Executor) check_nrpe (plugin) ref nrpe (daemon) check_xyz (plugin) NAGIOS CLIENT tcp connection NETWORK ref1 check_xyz arg1 arg2 ref2 check_abc arg3 arg4 ... ... Nagios executes the check nrpe plugin To interact with a dedicated daemon called nrpe Based on pre-configured plugin invocations Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 27/32 Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Local checks Remote checks Other remote checks Checks using SNMP Collecting management information from SNMP agents check_snmp plugin ⇔ net-snmp snmpget SNMP reply + warning and critical limits ⇒ service state Checks using NSCA (Nagios Service Check Acceptor) Passive method where checks are initiated by the resources themselves (close to SNMP traps) A NSCA daemon waits for incoming check results (on the Nagios machine), while the send_nsca program (on the remote machines) sends messages containing check results Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 28/32
  • 15. Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Making configuration more simple Specifying Dependencies Making configuration more simple Monitoring the same service on several hosts setting the host name attribute of the service as a comma separated list of host names or setting an hostgroup name attribute for the service Defining template-based objects Notion of inheritance Factorizing many low-interest attributes register attribute to define a template use attribute to inherite from a template Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 29/32 Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Making configuration more simple Specifying Dependencies Service and hostgroup (example) d e f i n e hostgroup{ hostgroup name d n s h o s t s a l i a s h o s t s s u p p o r t i n g DNS members dns1 , dns2 , dnsext } d e f i n e s e r v i c e { hostgroup name d n s h o s t s s e r v i c e d e s c r i p t i o n dns s e r v i c e check command c h e c k n a m e f o r g i v e n d n s !www. l o r i a . f r ! 1 5 2 . 8 1 . 1 4 4 . 2 2 max check attempts 3 n o r m a l c h e c k i n t e r v a l 5 r e t r y c h e c k i n t e r v a l 1 c h e c k p e r i o d 24 x7 n o t i f i c a t i o n i n t e r v a l 180 n o t i f i c a t i o n p e r i o d 24 x7 n o t i f i c a t i o n o p t i o n s w, c , r , f , u c o n t a c t g r o u p s a d m i n i s t r a t o r s } Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 30/32
  • 16. Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Making configuration more simple Specifying Dependencies Specifying dependencies Remainder: service critical state ⇒ host check How does Nagios make a difference between: case 1: webloria is really down case 2: webloria is unreachable due to the network? Solution: defining dependency relationships amongst objects (parents attribute) If an host is detected as down, the parent is checked If the parent is OK, the initial host is really declared down If not, the host is declared unreachable Ex: cat-481 router defined as a parent of webloria Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 31/32 Introduction Nagios at a glance Nagios configuration Checks and their execution Advanced configurations Making configuration more simple Specifying Dependencies Conclusions Open source monitoring solution Based on simple concepts: checks, states, notifications Easily extensible and integrable No discovery mechanism Experience it during lab exercises! Monitoring hosts and their services Developing and testing our own plugin Experimenting state transitions and notifications Laurent Andrey, R´emi Badonnel An Introduction to Monitoring with Nagios 32/32