SlideShare a Scribd company logo
1 of 30
Monitoring at/with
SUSE
How SUSE R&D checks network and system
resources
Lars Vogdt – Team Lead SUSE IT
Lars.Vogdt@suse.com
2
Agenda
• Official history
• What are you using at SUSE?
• Where are the Nagios Plugins?
• SUSE R&D internal usage
• Tips & Tricks
• High available, load balanced monitoring
• Demo (if possible)
3
The short history of
Monitoring in SUSE, Part 1
• Saturday, October 23, 2001
SuSE Linux 7.3
* The first monitoring tool NetSaint version 0.0.7b6
• Monday, September 30, 2002
SuSE Linux 8.1
* Welcome Nagios in SuSE (version 1.0b4)
• Saturday, April 16, 2005
SuSE Linux 9.3
* Nagios (in SuSE v 1.2 ) was project of month on SourceForge.net.
4
The short history of
Monitoring in SUSE, Part 2
• Monday, June 18 2007
SUSE Linux Enterprise Server 10 SP1
* Nagios version 2.6 was with us until 2013
• Tuesday, March 24, 2009
SUSE Linux Enterprise Server 11
* Nagios version 3.0.6 as monitoring tool is stronger then never...
• Wednesday , April 6, 2009
Icinga forked Nagios
Icinga will be part of the next SUSE Manager release
=> Migrating from Nagios to Icinga 1.x is easy
5
What are you using at SUSE?
• SUSE Linux Enterprise Server with High
Availability Extension
• Additional packages from obs://server:monitoring,
obs://devel:languages:perl and
obs://network:telephony
• Internal packages (for no-src/legal-problematic
packages – mostly license problems…)
We release all “internal tools” in obs://server:monitoring as soon as
possible and if there are no legal reasons (which only
affects a handful of packages/scripts).
6
Where are the Nagios Plugins?
• https://bugzilla.suse.com/show_bug.cgi?id=859105
and especially
https://bugzilla.redhat.com/show_bug.cgi?id=1054340
have all details
• 2014-07-15: nagios-plugins* got renamed to
monitoring-plugins*. Since that day, we have
‒ unmaintained nagios-core-plugins* and
‒ maintained monitoring-plugins*
packages in server:monitoring
SUSE R&D internal
8
SUSE R&D “specials”
• Crazy customers (developers)
• “No need for monitoring or statistics” – until
something breaks or gets relevant for
business ???
• Multiple dual-stacked networks (IPv4 + IPv6)
separated via Firewalls
• Production vs. Development => high amount of
moving targets
• Multiple hardware vendors, even NDA hardware
without any further details or manuals
• Luckily mostly unique Operating Systems :-) but
many different services
9
The Past
Services
43 2000
15 65
12 120
70 2185
Maximal
Location Core System Addons Hosts
Nuremberg Nagios
Prague Nagios
Provo Nagios
Summary
Latency in Nuremberg Average
Host Check Latency ~7 seconds ~1.5 seconds
Service Check Latency ~8 seconds ~1 second
10
Current Situation
Services
430 4700
96 1150
140 1700
30 170
696 7720
Maximal
Location Core System Hosts
Nuremberg 1 Icinga
Nuremberg 2 Nagios
Prague Icinga
Provo Nagios
Summary
Latency in Nuremberg Average
Host Check Latency ~5 seconds ~1.5 seconds
Service Check Latency ~3 seconds ~1 second
Nuremberg Prague Provo
0
2000
4000
6000
8000
Services monitored
Past
Current
Some small tips
12
Why you always should define dependencies
13
What should be monitored?
Administrator View Business View
Hardware health Service health
Service availability – host based Service availability – business based
Overview about the services and
incidents of single hosts
Overview about the final business
impact, not the service components
Only important for Administrators Important for Managers and
Customers
14
What can be checked?
Nearly everything is possible!
Minimal requirements listed below:
Your script returns one of the following Exit-Codes:
3 : UnknownUnknown – something outside the normal control
range (of your script?) happened
2 : Something criticalcritical happend! Help needed!
1 : well, it works currently – but be warnedwarned
0 : everything okok
Some (human readable) output on STDOUT would be nice, but is not
necessary for Nagios or Icinga itself.
Print performance data on STDOUT, separated from normal output via '|'
https://nagios-plugins.org/doc/guidelines.html.
15
Example check: check_file_exists
16
Eventhandlers
If a service or host is in a defined, unwanted state, trigger external
scripts to “solve” the problem automatically.
(Restart apache if it crashes, send SMS if nobody acknowledges a problem, shutdown
all OBS workers if Lars is not available, …)
17
Monitoring SANBoxes with MRTG
For Qlogic, run the following command on your MRTG machine:
/usr/bin/cfgmaker --global "WorkDir: /srv/www/htdocs/mrtg"
--global "Options[_]: growright, bits, unknaszero"
--ifdesc=alias,name --ifref=name --noreversedns --no-down
--show-op-down --subdirs=sanbox-1 –output=/etc/mrtg/sanbox-
1.conf --snmp-options=:::::2 192.168.0.1
...or for Cisco MD:
/usr/bin/cfgmaker --global "WorkDir: /srv/www/htdocs/mrtg"
--global "Options[_]: growright, bits, unknaszero"
--ifdesc=alias --noreversedns --no-down --show-op-down –
subdirs=sanbox-2 –output=sanbox-2.conf --snmp-options=:::::2
192.168.0.2
18
Monitoring IO on your machines
On the machine your want to monitor:
• Install monitoring-plugins-sar-perf
• Prepare a command like (NRPE example):
command[check_iostat_home]=/usr/lib/nagios/plugins/check_iost
at -d root-fs_home -w 120000,120000,120000 -c
150000,150000,150000 -W 30 -C 50
• Maybe also enable sysstat (chkconfig boot.sysstat on), to have
the data available on the host directly
19
MRTG graphs for network interfaces
of virtual machines
On the Server running the virtual machines, edit /etc/snmp/snmpd.conf :
[...]
rocommunity public 10.0.0.0/16
[...]
On your MRTG machine, run:
/usr/bin/cfgmaker --global "WorkDir:
/srv/www/htdocs/mrtg" --global "Options[_]:
growright, bits, unknaszero" --ifdesc=alias,name
--ifref=name --noreversedns --no-down --show-op-down
--subdirs=vmserv1 --output=vmserv1.conf --snmp-
options=:::::2 10.0.0.101
...and edit the xml definition of your virtual machine:
<interface type='bridge'>
[...]
<target dev='vm1'/>
[...]
</interface>
Now (re-)start snmpd and your virtual machine.
20
Monitoring of MySQL servers
We are currently using two different checks:
• check_mysql (monitoring-plugins-mysql package)
• check_mysql_health (monitoring-plugins-mysql_health package)
You need a database user with "SELECT" access for both options. Usually, this
means that you create a user named "nagios" in MySQL:
mysql> GRANT SELECT on nagios.* TO 'nagios'@'localhost' IDENTIFIED
BY 'nag1os';
mysql> flush privileges;
mysql> quit
Afterward you should be able to check the database via:
/usr/lib/nagios/plugins/check_mysql -H $HOST -u §USER -p $PASS
or:
/usr/lib/nagios/plugins/check_mysql_health --units MB -–mode 
threads-connected --username $USER --password $PASS 
--warning 40 --critical 50
21
Monitoring of PostgreSQL
• check the file pg_hba.conf on the database server to contain
the correct IP addresses of the monitoring cluster
• create the monitor user via the createuser command as user
postgres:
postgres@pg1:~> createuser --pwprompt --interactive monitor
Enter password for new role:
Enter it again:
Shall the new role be a superuser? (y/n) y
Shall the new role be allowed to create databases? (y/n) n
Shall the new role be allowed to create more new roles?
(y/n) n
• Note: the SUPERUSER privilege is needed for some special
checks like "archive_ready".
• restart the database
• Try on the monitoring cluster:
~> ./check_postgres.pl --dbpass=$PASSWORD –dbuser=$USERNAME 
--action=archive_ready -H pg1
POSTGRES_ARCHIVE_READY OK: DB "postgres" (host:pg1) WAL
".ready" files found: 0 | time=0.02s files=0;10;15
22
...and there is more...
• More and more monitoring-plugins* packages come with enabled Apparmor
profiles: check /var/log/audit/audit.log if something seems to be crazy
• Re-enable notifications automatically via cron – to not forget it:
#!/bin/bash
CFG=/etc/icinga/icinga.cfg
commandfile=$(grep ^command_file "$CFG" | awk -F'=' '{ print $2 }')
if [ -p "$commandfile" ]; then
now=`date +%s`
printf "[%lu] ENABLE_NOTIFICATIONSn" $now > "$commandfile"
fi
• Monitor your NSCA daemon via monitoring-plugins-nsca and a dummy test (see
README)
• Create performance data for your monitoring:
#!/bin/bash
if /etc/init.d/icinga status >/dev/null 2>/dev/null ; then
if [ -p /var/run/icinga/icinga.cmd ]; then
su – icinga -c "/usr/lib/nagios/plugins/check_nagiostats
--EXEC /usr/sbin/icingastats --passive $HOST 
icingastats >> /var/run/icinga/icinga.cmd"
fi
fi
• Monitor your monitoring setup!
High available, load balanced
monitoring
24
Basic overview
• Corosync Pacemaker Cluster (two main machines
+ one VM just for Quorum) – using IPMI for
STONITH
• DRBD to provide storage (PNP, Logs) on both
main machines
• Services like MySQL (cluster), snmptrapd or
NSCA run “unmanaged” on all nodes
• mod_gearman for Load-Balancing/Failover of
normal checks
• check_mk for automatic checks and Load-
Reducing
• MRTG for statistics from Network and SAN (for
historical reasons)
25
Load-Balanced / HA Monitoring in
project pictures
Livestatus
snmptt snmptt
Demo time
Questions?
Thank you.
Join the conversation,
contribute & have a lot of fun!
www.opensuse.org
29
Have a Lot of Fun, and Join Us
At:
www.opensuse.org
General Disclaimer
This document is not to be construed as a promise by any participating organisation to
develop, deliver, or market a product. It is not a commitment to deliver any material, code,
or functionality, and should not be relied upon in making purchasing decisions. openSUSE
makes no representations or warranties with respect to the contents of this document, and
specifically disclaims any express or implied warranties of merchantability or fitness for
any particular purpose. The development, release, and timing of features or functionality
described for openSUSE products remains at the sole discretion of openSUSE. Further,
openSUSE reserves the right to revise this document and to make changes to its content,
at any time, without obligation to notify any person or entity of such revisions or changes.
All openSUSE marks referenced in this presentation are trademarks or registered
trademarks of SUSE LLC, in the United States and other countries. All third-party
trademarks are the property of their respective owners.
License
This slide deck is licensed under the Creative Commons Attribution-ShareAlike 4.0
International license. It can be shared and adapted for any purpose (even
commercially) as long as Attribution is given and any derivative work is distributed
under the same license.
Details can be found at https://creativecommons.org/licenses/by-sa/4.0/
Credits
Template
Richard Brown
rbrown@opensuse.org
Design & Inspiration
openSUSE Design Team
http://opensuse.github.io/branding-
guidelines/

More Related Content

What's hot

Containers with systemd-nspawn
Containers with systemd-nspawnContainers with systemd-nspawn
Containers with systemd-nspawnGábor Nyers
 
Your first dive into systemd!
Your first dive into systemd!Your first dive into systemd!
Your first dive into systemd!Etsuji Nakai
 
Install elasticsearch, logstash and kibana
Install elasticsearch, logstash and kibana Install elasticsearch, logstash and kibana
Install elasticsearch, logstash and kibana Chanaka Lasantha
 
Full system roll-back and systemd in SUSE Linux Enterprise 12
Full system roll-back and systemd in SUSE Linux Enterprise 12Full system roll-back and systemd in SUSE Linux Enterprise 12
Full system roll-back and systemd in SUSE Linux Enterprise 12Gábor Nyers
 
LISA17 Container Performance Analysis
LISA17 Container Performance AnalysisLISA17 Container Performance Analysis
LISA17 Container Performance AnalysisBrendan Gregg
 
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016Zabbix
 
OSMC 2011 | Distributed monitoring using NSClient++ by Michael Medin
OSMC 2011 | Distributed monitoring using NSClient++ by Michael MedinOSMC 2011 | Distributed monitoring using NSClient++ by Michael Medin
OSMC 2011 | Distributed monitoring using NSClient++ by Michael MedinNETWAYS
 
Linux con europe_2014_f
Linux con europe_2014_fLinux con europe_2014_f
Linux con europe_2014_fsprdd
 
CLUG 2010 09 - systemd - the new init system
CLUG 2010 09 - systemd - the new init systemCLUG 2010 09 - systemd - the new init system
CLUG 2010 09 - systemd - the new init systemPaulWay
 
OSMC 2009 | Windows monitoring - Going where no man has gone before... by Mic...
OSMC 2009 | Windows monitoring - Going where no man has gone before... by Mic...OSMC 2009 | Windows monitoring - Going where no man has gone before... by Mic...
OSMC 2009 | Windows monitoring - Going where no man has gone before... by Mic...NETWAYS
 
High performance json- postgre sql vs. mongodb
High performance json- postgre sql vs. mongodbHigh performance json- postgre sql vs. mongodb
High performance json- postgre sql vs. mongodbWei Shan Ang
 
Percona XtraDB 集群内部
Percona XtraDB 集群内部Percona XtraDB 集群内部
Percona XtraDB 集群内部YUCHENG HU
 
Rear automated testing with Bareos
Rear automated testing with BareosRear automated testing with Bareos
Rear automated testing with BareosGratien D'haese
 
pgDay Asia 2016 - Swapping Pacemaker-Corosync for repmgr (1)
pgDay Asia 2016 - Swapping Pacemaker-Corosync for repmgr (1)pgDay Asia 2016 - Swapping Pacemaker-Corosync for repmgr (1)
pgDay Asia 2016 - Swapping Pacemaker-Corosync for repmgr (1)Wei Shan Ang
 
Spider Setup with AWS/sandbox
Spider Setup with AWS/sandboxSpider Setup with AWS/sandbox
Spider Setup with AWS/sandboxI Goo Lee
 
OpenStack networking-sfc flow 분석
OpenStack networking-sfc flow 분석OpenStack networking-sfc flow 분석
OpenStack networking-sfc flow 분석Yongyoon Shin
 
libreCMC : The Libre Embedded GNU/Linux Distro
libreCMC : The Libre Embedded GNU/Linux DistrolibreCMC : The Libre Embedded GNU/Linux Distro
libreCMC : The Libre Embedded GNU/Linux DistroAll Things Open
 
MySQL Cluster 7.3 Performance Tuning - Severalnines Slides
MySQL Cluster 7.3 Performance Tuning - Severalnines SlidesMySQL Cluster 7.3 Performance Tuning - Severalnines Slides
MySQL Cluster 7.3 Performance Tuning - Severalnines SlidesSeveralnines
 

What's hot (20)

Containers with systemd-nspawn
Containers with systemd-nspawnContainers with systemd-nspawn
Containers with systemd-nspawn
 
Basic onos-tutorial
Basic onos-tutorialBasic onos-tutorial
Basic onos-tutorial
 
Your first dive into systemd!
Your first dive into systemd!Your first dive into systemd!
Your first dive into systemd!
 
Install elasticsearch, logstash and kibana
Install elasticsearch, logstash and kibana Install elasticsearch, logstash and kibana
Install elasticsearch, logstash and kibana
 
Full system roll-back and systemd in SUSE Linux Enterprise 12
Full system roll-back and systemd in SUSE Linux Enterprise 12Full system roll-back and systemd in SUSE Linux Enterprise 12
Full system roll-back and systemd in SUSE Linux Enterprise 12
 
LISA17 Container Performance Analysis
LISA17 Container Performance AnalysisLISA17 Container Performance Analysis
LISA17 Container Performance Analysis
 
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
 
Kernel crashdump
Kernel crashdumpKernel crashdump
Kernel crashdump
 
OSMC 2011 | Distributed monitoring using NSClient++ by Michael Medin
OSMC 2011 | Distributed monitoring using NSClient++ by Michael MedinOSMC 2011 | Distributed monitoring using NSClient++ by Michael Medin
OSMC 2011 | Distributed monitoring using NSClient++ by Michael Medin
 
Linux con europe_2014_f
Linux con europe_2014_fLinux con europe_2014_f
Linux con europe_2014_f
 
CLUG 2010 09 - systemd - the new init system
CLUG 2010 09 - systemd - the new init systemCLUG 2010 09 - systemd - the new init system
CLUG 2010 09 - systemd - the new init system
 
OSMC 2009 | Windows monitoring - Going where no man has gone before... by Mic...
OSMC 2009 | Windows monitoring - Going where no man has gone before... by Mic...OSMC 2009 | Windows monitoring - Going where no man has gone before... by Mic...
OSMC 2009 | Windows monitoring - Going where no man has gone before... by Mic...
 
High performance json- postgre sql vs. mongodb
High performance json- postgre sql vs. mongodbHigh performance json- postgre sql vs. mongodb
High performance json- postgre sql vs. mongodb
 
Percona XtraDB 集群内部
Percona XtraDB 集群内部Percona XtraDB 集群内部
Percona XtraDB 集群内部
 
Rear automated testing with Bareos
Rear automated testing with BareosRear automated testing with Bareos
Rear automated testing with Bareos
 
pgDay Asia 2016 - Swapping Pacemaker-Corosync for repmgr (1)
pgDay Asia 2016 - Swapping Pacemaker-Corosync for repmgr (1)pgDay Asia 2016 - Swapping Pacemaker-Corosync for repmgr (1)
pgDay Asia 2016 - Swapping Pacemaker-Corosync for repmgr (1)
 
Spider Setup with AWS/sandbox
Spider Setup with AWS/sandboxSpider Setup with AWS/sandbox
Spider Setup with AWS/sandbox
 
OpenStack networking-sfc flow 분석
OpenStack networking-sfc flow 분석OpenStack networking-sfc flow 분석
OpenStack networking-sfc flow 분석
 
libreCMC : The Libre Embedded GNU/Linux Distro
libreCMC : The Libre Embedded GNU/Linux DistrolibreCMC : The Libre Embedded GNU/Linux Distro
libreCMC : The Libre Embedded GNU/Linux Distro
 
MySQL Cluster 7.3 Performance Tuning - Severalnines Slides
MySQL Cluster 7.3 Performance Tuning - Severalnines SlidesMySQL Cluster 7.3 Performance Tuning - Severalnines Slides
MySQL Cluster 7.3 Performance Tuning - Severalnines Slides
 

Viewers also liked

4 red hat_minsk_june_25_2015
4 red hat_minsk_june_25_20154 red hat_minsk_june_25_2015
4 red hat_minsk_june_25_2015trenders
 
openSUSE Infrastructure 2015
openSUSE Infrastructure 2015openSUSE Infrastructure 2015
openSUSE Infrastructure 2015Lars Vogdt
 
6 qualys minsk_june_25_2015
6 qualys minsk_june_25_20156 qualys minsk_june_25_2015
6 qualys minsk_june_25_2015trenders
 
3 hp minsk_june_25_2015
3 hp minsk_june_25_20153 hp minsk_june_25_2015
3 hp minsk_june_25_2015trenders
 
Hpc Server 2008 Ecosystem
Hpc Server 2008 EcosystemHpc Server 2008 Ecosystem
Hpc Server 2008 EcosystemOleg Nazarevych
 
Bridging openSUSE and SLE gap: the GNOME example
Bridging openSUSE and SLE gap: the GNOME exampleBridging openSUSE and SLE gap: the GNOME example
Bridging openSUSE and SLE gap: the GNOME exampleFrederic Crozat
 
Advantages of SUSE Linux Over Windows
Advantages of SUSE Linux Over WindowsAdvantages of SUSE Linux Over Windows
Advantages of SUSE Linux Over WindowsJeff Reser
 
Building IT with Precision - SUSE Linux Enterprise Real Time
Building IT with Precision - SUSE Linux Enterprise Real TimeBuilding IT with Precision - SUSE Linux Enterprise Real Time
Building IT with Precision - SUSE Linux Enterprise Real TimeJeff Reser
 
Opti x rtn 910950980 hardware description wind
Opti x rtn 910950980 hardware description windOpti x rtn 910950980 hardware description wind
Opti x rtn 910950980 hardware description windnctgayaranga
 
02 opti x rtn 900 v100r002 system hardware-20100223-a
02 opti x rtn 900 v100r002 system hardware-20100223-a02 opti x rtn 900 v100r002 system hardware-20100223-a
02 opti x rtn 900 v100r002 system hardware-20100223-aWaheed Ali
 
02 opti x rtn 900 v100r002 configuration guide-20100119-a
02 opti x rtn 900 v100r002 configuration guide-20100119-a02 opti x rtn 900 v100r002 configuration guide-20100119-a
02 opti x rtn 900 v100r002 configuration guide-20100119-aWaheed Ali
 
Mw training slide
Mw training slideMw training slide
Mw training slidededoyin
 
The Huawei Node B Evolution
The Huawei Node B EvolutionThe Huawei Node B Evolution
The Huawei Node B EvolutionAtif Mahmood
 
Guide to open suse 13.2 by mustafa rasheed abass & abdullah t. tua'ama..super...
Guide to open suse 13.2 by mustafa rasheed abass & abdullah t. tua'ama..super...Guide to open suse 13.2 by mustafa rasheed abass & abdullah t. tua'ama..super...
Guide to open suse 13.2 by mustafa rasheed abass & abdullah t. tua'ama..super...Mustafa AL-Timemmie
 
Linux command ppt
Linux command pptLinux command ppt
Linux command pptkalyanineve
 
Digital microwave communication principles
Digital microwave communication principles Digital microwave communication principles
Digital microwave communication principles Mohamed Sewailam
 
Creating HTML Pages
Creating HTML PagesCreating HTML Pages
Creating HTML PagesMike Crabb
 

Viewers also liked (20)

4 red hat_minsk_june_25_2015
4 red hat_minsk_june_25_20154 red hat_minsk_june_25_2015
4 red hat_minsk_june_25_2015
 
openSUSE Infrastructure 2015
openSUSE Infrastructure 2015openSUSE Infrastructure 2015
openSUSE Infrastructure 2015
 
Siae datasheet
Siae datasheetSiae datasheet
Siae datasheet
 
6 qualys minsk_june_25_2015
6 qualys minsk_june_25_20156 qualys minsk_june_25_2015
6 qualys minsk_june_25_2015
 
3 hp minsk_june_25_2015
3 hp minsk_june_25_20153 hp minsk_june_25_2015
3 hp minsk_june_25_2015
 
Hpc Server 2008 Ecosystem
Hpc Server 2008 EcosystemHpc Server 2008 Ecosystem
Hpc Server 2008 Ecosystem
 
Bridging openSUSE and SLE gap: the GNOME example
Bridging openSUSE and SLE gap: the GNOME exampleBridging openSUSE and SLE gap: the GNOME example
Bridging openSUSE and SLE gap: the GNOME example
 
Advantages of SUSE Linux Over Windows
Advantages of SUSE Linux Over WindowsAdvantages of SUSE Linux Over Windows
Advantages of SUSE Linux Over Windows
 
Building IT with Precision - SUSE Linux Enterprise Real Time
Building IT with Precision - SUSE Linux Enterprise Real TimeBuilding IT with Precision - SUSE Linux Enterprise Real Time
Building IT with Precision - SUSE Linux Enterprise Real Time
 
Opti x rtn 910950980 hardware description wind
Opti x rtn 910950980 hardware description windOpti x rtn 910950980 hardware description wind
Opti x rtn 910950980 hardware description wind
 
02 opti x rtn 900 v100r002 system hardware-20100223-a
02 opti x rtn 900 v100r002 system hardware-20100223-a02 opti x rtn 900 v100r002 system hardware-20100223-a
02 opti x rtn 900 v100r002 system hardware-20100223-a
 
02 opti x rtn 900 v100r002 configuration guide-20100119-a
02 opti x rtn 900 v100r002 configuration guide-20100119-a02 opti x rtn 900 v100r002 configuration guide-20100119-a
02 opti x rtn 900 v100r002 configuration guide-20100119-a
 
Linux commands
Linux commandsLinux commands
Linux commands
 
Mw training slide
Mw training slideMw training slide
Mw training slide
 
The Huawei Node B Evolution
The Huawei Node B EvolutionThe Huawei Node B Evolution
The Huawei Node B Evolution
 
Guide to open suse 13.2 by mustafa rasheed abass & abdullah t. tua'ama..super...
Guide to open suse 13.2 by mustafa rasheed abass & abdullah t. tua'ama..super...Guide to open suse 13.2 by mustafa rasheed abass & abdullah t. tua'ama..super...
Guide to open suse 13.2 by mustafa rasheed abass & abdullah t. tua'ama..super...
 
Linux command ppt
Linux command pptLinux command ppt
Linux command ppt
 
Digital microwave communication principles
Digital microwave communication principles Digital microwave communication principles
Digital microwave communication principles
 
GSM fundamentals (Huawei)
GSM fundamentals (Huawei)GSM fundamentals (Huawei)
GSM fundamentals (Huawei)
 
Creating HTML Pages
Creating HTML PagesCreating HTML Pages
Creating HTML Pages
 

Similar to Monitoring at/with SUSE 2015

PaaSTA: Autoscaling at Yelp
PaaSTA: Autoscaling at YelpPaaSTA: Autoscaling at Yelp
PaaSTA: Autoscaling at YelpNathan Handler
 
Why favour Icinga over Nagios - Rootconf 2015
Why favour Icinga over Nagios - Rootconf 2015Why favour Icinga over Nagios - Rootconf 2015
Why favour Icinga over Nagios - Rootconf 2015Icinga
 
Why favour Icinga over Nagios @ FrOSCon 2015
Why favour Icinga over Nagios @ FrOSCon 2015Why favour Icinga over Nagios @ FrOSCon 2015
Why favour Icinga over Nagios @ FrOSCon 2015Icinga
 
Nagios Conference 2014 - Rob Hassing - How To Maintain Over 20 Monitoring App...
Nagios Conference 2014 - Rob Hassing - How To Maintain Over 20 Monitoring App...Nagios Conference 2014 - Rob Hassing - How To Maintain Over 20 Monitoring App...
Nagios Conference 2014 - Rob Hassing - How To Maintain Over 20 Monitoring App...Nagios
 
OpenNebula 5.4 Hands-on Tutorial
OpenNebula 5.4 Hands-on TutorialOpenNebula 5.4 Hands-on Tutorial
OpenNebula 5.4 Hands-on TutorialOpenNebula Project
 
Sensu and Sensibility - Puppetconf 2014
Sensu and Sensibility - Puppetconf 2014Sensu and Sensibility - Puppetconf 2014
Sensu and Sensibility - Puppetconf 2014Tomas Doran
 
OSDC 2015: Bernd Erk | Why favour Icinga over Nagios
OSDC 2015: Bernd Erk | Why favour Icinga over NagiosOSDC 2015: Bernd Erk | Why favour Icinga over Nagios
OSDC 2015: Bernd Erk | Why favour Icinga over NagiosNETWAYS
 
Why favor Icinga over Nagios @ DebConf15
Why favor Icinga over Nagios @ DebConf15Why favor Icinga over Nagios @ DebConf15
Why favor Icinga over Nagios @ DebConf15Icinga
 
Automating complex infrastructures with Puppet
Automating complex infrastructures with PuppetAutomating complex infrastructures with Puppet
Automating complex infrastructures with PuppetKris Buytaert
 
Automating Complex Setups with Puppet
Automating Complex Setups with PuppetAutomating Complex Setups with Puppet
Automating Complex Setups with PuppetKris Buytaert
 
Secure lustre on openstack
Secure lustre on openstackSecure lustre on openstack
Secure lustre on openstackJames Beal
 
Salting new ground one man ops from scratch
Salting new ground   one man ops from scratchSalting new ground   one man ops from scratch
Salting new ground one man ops from scratchJay Harrison
 
Linux Server Deep Dives (DrupalCon Amsterdam)
Linux Server Deep Dives (DrupalCon Amsterdam)Linux Server Deep Dives (DrupalCon Amsterdam)
Linux Server Deep Dives (DrupalCon Amsterdam)Amin Astaneh
 
Infrastructure review - Shining a light on the Black Box
Infrastructure review - Shining a light on the Black BoxInfrastructure review - Shining a light on the Black Box
Infrastructure review - Shining a light on the Black BoxMiklos Szel
 
Dave Williams - Nagios Log Server - Practical Experience
Dave Williams - Nagios Log Server - Practical ExperienceDave Williams - Nagios Log Server - Practical Experience
Dave Williams - Nagios Log Server - Practical ExperienceNagios
 
Spectre meltdown performance_tests - v0.3
Spectre meltdown performance_tests - v0.3Spectre meltdown performance_tests - v0.3
Spectre meltdown performance_tests - v0.3David Pasek
 

Similar to Monitoring at/with SUSE 2015 (20)

PaaSTA: Autoscaling at Yelp
PaaSTA: Autoscaling at YelpPaaSTA: Autoscaling at Yelp
PaaSTA: Autoscaling at Yelp
 
Why favour Icinga over Nagios - Rootconf 2015
Why favour Icinga over Nagios - Rootconf 2015Why favour Icinga over Nagios - Rootconf 2015
Why favour Icinga over Nagios - Rootconf 2015
 
Why favour Icinga over Nagios @ FrOSCon 2015
Why favour Icinga over Nagios @ FrOSCon 2015Why favour Icinga over Nagios @ FrOSCon 2015
Why favour Icinga over Nagios @ FrOSCon 2015
 
Nagios Conference 2014 - Rob Hassing - How To Maintain Over 20 Monitoring App...
Nagios Conference 2014 - Rob Hassing - How To Maintain Over 20 Monitoring App...Nagios Conference 2014 - Rob Hassing - How To Maintain Over 20 Monitoring App...
Nagios Conference 2014 - Rob Hassing - How To Maintain Over 20 Monitoring App...
 
OpenNebula 5.4 Hands-on Tutorial
OpenNebula 5.4 Hands-on TutorialOpenNebula 5.4 Hands-on Tutorial
OpenNebula 5.4 Hands-on Tutorial
 
Sensu and Sensibility - Puppetconf 2014
Sensu and Sensibility - Puppetconf 2014Sensu and Sensibility - Puppetconf 2014
Sensu and Sensibility - Puppetconf 2014
 
OSDC 2015: Bernd Erk | Why favour Icinga over Nagios
OSDC 2015: Bernd Erk | Why favour Icinga over NagiosOSDC 2015: Bernd Erk | Why favour Icinga over Nagios
OSDC 2015: Bernd Erk | Why favour Icinga over Nagios
 
Why favor Icinga over Nagios @ DebConf15
Why favor Icinga over Nagios @ DebConf15Why favor Icinga over Nagios @ DebConf15
Why favor Icinga over Nagios @ DebConf15
 
Automating complex infrastructures with Puppet
Automating complex infrastructures with PuppetAutomating complex infrastructures with Puppet
Automating complex infrastructures with Puppet
 
The Accidental DBA
The Accidental DBAThe Accidental DBA
The Accidental DBA
 
Automating Complex Setups with Puppet
Automating Complex Setups with PuppetAutomating Complex Setups with Puppet
Automating Complex Setups with Puppet
 
Beyond Puppet
Beyond PuppetBeyond Puppet
Beyond Puppet
 
Secure lustre on openstack
Secure lustre on openstackSecure lustre on openstack
Secure lustre on openstack
 
Nagios intro
Nagios intro Nagios intro
Nagios intro
 
Salting new ground one man ops from scratch
Salting new ground   one man ops from scratchSalting new ground   one man ops from scratch
Salting new ground one man ops from scratch
 
Linux Hardening - nullhyd
Linux Hardening - nullhydLinux Hardening - nullhyd
Linux Hardening - nullhyd
 
Linux Server Deep Dives (DrupalCon Amsterdam)
Linux Server Deep Dives (DrupalCon Amsterdam)Linux Server Deep Dives (DrupalCon Amsterdam)
Linux Server Deep Dives (DrupalCon Amsterdam)
 
Infrastructure review - Shining a light on the Black Box
Infrastructure review - Shining a light on the Black BoxInfrastructure review - Shining a light on the Black Box
Infrastructure review - Shining a light on the Black Box
 
Dave Williams - Nagios Log Server - Practical Experience
Dave Williams - Nagios Log Server - Practical ExperienceDave Williams - Nagios Log Server - Practical Experience
Dave Williams - Nagios Log Server - Practical Experience
 
Spectre meltdown performance_tests - v0.3
Spectre meltdown performance_tests - v0.3Spectre meltdown performance_tests - v0.3
Spectre meltdown performance_tests - v0.3
 

Recently uploaded

Anne Frank A Beacon of Hope amidst darkness ppt.pptx
Anne Frank A Beacon of Hope amidst darkness ppt.pptxAnne Frank A Beacon of Hope amidst darkness ppt.pptx
Anne Frank A Beacon of Hope amidst darkness ppt.pptxnoorehahmad
 
PHYSICS PROJECT BY MSC - NANOTECHNOLOGY
PHYSICS PROJECT BY MSC  - NANOTECHNOLOGYPHYSICS PROJECT BY MSC  - NANOTECHNOLOGY
PHYSICS PROJECT BY MSC - NANOTECHNOLOGYpruthirajnayak525
 
Open Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdf
Open Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdfOpen Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdf
Open Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdfhenrik385807
 
Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...
Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...
Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...NETWAYS
 
Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Work Remotely with Confluence ACE 2.pptx
Work Remotely with Confluence ACE 2.pptxWork Remotely with Confluence ACE 2.pptx
Work Remotely with Confluence ACE 2.pptxmavinoikein
 
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...NETWAYS
 
Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...
Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...
Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...NETWAYS
 
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...NETWAYS
 
Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...
Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...
Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...Krijn Poppe
 
Genshin Impact PPT Template by EaTemp.pptx
Genshin Impact PPT Template by EaTemp.pptxGenshin Impact PPT Template by EaTemp.pptx
Genshin Impact PPT Template by EaTemp.pptxJohnree4
 
Call Girls In Aerocity 🤳 Call Us +919599264170
Call Girls In Aerocity 🤳 Call Us +919599264170Call Girls In Aerocity 🤳 Call Us +919599264170
Call Girls In Aerocity 🤳 Call Us +919599264170Escort Service
 
Gaps, Issues and Challenges in the Implementation of Mother Tongue Based-Mult...
Gaps, Issues and Challenges in the Implementation of Mother Tongue Based-Mult...Gaps, Issues and Challenges in the Implementation of Mother Tongue Based-Mult...
Gaps, Issues and Challenges in the Implementation of Mother Tongue Based-Mult...marjmae69
 
The 3rd Intl. Workshop on NL-based Software Engineering
The 3rd Intl. Workshop on NL-based Software EngineeringThe 3rd Intl. Workshop on NL-based Software Engineering
The 3rd Intl. Workshop on NL-based Software EngineeringSebastiano Panichella
 
Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...
Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...
Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...Salam Al-Karadaghi
 
James Joyce, Dubliners and Ulysses.ppt !
James Joyce, Dubliners and Ulysses.ppt !James Joyce, Dubliners and Ulysses.ppt !
James Joyce, Dubliners and Ulysses.ppt !risocarla2016
 
Simulation-based Testing of Unmanned Aerial Vehicles with Aerialist
Simulation-based Testing of Unmanned Aerial Vehicles with AerialistSimulation-based Testing of Unmanned Aerial Vehicles with Aerialist
Simulation-based Testing of Unmanned Aerial Vehicles with AerialistSebastiano Panichella
 
OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...
OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...
OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...NETWAYS
 
Dutch Power - 26 maart 2024 - Henk Kras - Circular Plastics
Dutch Power - 26 maart 2024 - Henk Kras - Circular PlasticsDutch Power - 26 maart 2024 - Henk Kras - Circular Plastics
Dutch Power - 26 maart 2024 - Henk Kras - Circular PlasticsDutch Power
 
Event 4 Introduction to Open Source.pptx
Event 4 Introduction to Open Source.pptxEvent 4 Introduction to Open Source.pptx
Event 4 Introduction to Open Source.pptxaryanv1753
 

Recently uploaded (20)

Anne Frank A Beacon of Hope amidst darkness ppt.pptx
Anne Frank A Beacon of Hope amidst darkness ppt.pptxAnne Frank A Beacon of Hope amidst darkness ppt.pptx
Anne Frank A Beacon of Hope amidst darkness ppt.pptx
 
PHYSICS PROJECT BY MSC - NANOTECHNOLOGY
PHYSICS PROJECT BY MSC  - NANOTECHNOLOGYPHYSICS PROJECT BY MSC  - NANOTECHNOLOGY
PHYSICS PROJECT BY MSC - NANOTECHNOLOGY
 
Open Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdf
Open Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdfOpen Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdf
Open Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdf
 
Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...
Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...
Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...
 
Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝
 
Work Remotely with Confluence ACE 2.pptx
Work Remotely with Confluence ACE 2.pptxWork Remotely with Confluence ACE 2.pptx
Work Remotely with Confluence ACE 2.pptx
 
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...
 
Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...
Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...
Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...
 
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...
 
Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...
Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...
Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...
 
Genshin Impact PPT Template by EaTemp.pptx
Genshin Impact PPT Template by EaTemp.pptxGenshin Impact PPT Template by EaTemp.pptx
Genshin Impact PPT Template by EaTemp.pptx
 
Call Girls In Aerocity 🤳 Call Us +919599264170
Call Girls In Aerocity 🤳 Call Us +919599264170Call Girls In Aerocity 🤳 Call Us +919599264170
Call Girls In Aerocity 🤳 Call Us +919599264170
 
Gaps, Issues and Challenges in the Implementation of Mother Tongue Based-Mult...
Gaps, Issues and Challenges in the Implementation of Mother Tongue Based-Mult...Gaps, Issues and Challenges in the Implementation of Mother Tongue Based-Mult...
Gaps, Issues and Challenges in the Implementation of Mother Tongue Based-Mult...
 
The 3rd Intl. Workshop on NL-based Software Engineering
The 3rd Intl. Workshop on NL-based Software EngineeringThe 3rd Intl. Workshop on NL-based Software Engineering
The 3rd Intl. Workshop on NL-based Software Engineering
 
Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...
Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...
Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...
 
James Joyce, Dubliners and Ulysses.ppt !
James Joyce, Dubliners and Ulysses.ppt !James Joyce, Dubliners and Ulysses.ppt !
James Joyce, Dubliners and Ulysses.ppt !
 
Simulation-based Testing of Unmanned Aerial Vehicles with Aerialist
Simulation-based Testing of Unmanned Aerial Vehicles with AerialistSimulation-based Testing of Unmanned Aerial Vehicles with Aerialist
Simulation-based Testing of Unmanned Aerial Vehicles with Aerialist
 
OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...
OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...
OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...
 
Dutch Power - 26 maart 2024 - Henk Kras - Circular Plastics
Dutch Power - 26 maart 2024 - Henk Kras - Circular PlasticsDutch Power - 26 maart 2024 - Henk Kras - Circular Plastics
Dutch Power - 26 maart 2024 - Henk Kras - Circular Plastics
 
Event 4 Introduction to Open Source.pptx
Event 4 Introduction to Open Source.pptxEvent 4 Introduction to Open Source.pptx
Event 4 Introduction to Open Source.pptx
 

Monitoring at/with SUSE 2015

  • 1. Monitoring at/with SUSE How SUSE R&D checks network and system resources Lars Vogdt – Team Lead SUSE IT Lars.Vogdt@suse.com
  • 2. 2 Agenda • Official history • What are you using at SUSE? • Where are the Nagios Plugins? • SUSE R&D internal usage • Tips & Tricks • High available, load balanced monitoring • Demo (if possible)
  • 3. 3 The short history of Monitoring in SUSE, Part 1 • Saturday, October 23, 2001 SuSE Linux 7.3 * The first monitoring tool NetSaint version 0.0.7b6 • Monday, September 30, 2002 SuSE Linux 8.1 * Welcome Nagios in SuSE (version 1.0b4) • Saturday, April 16, 2005 SuSE Linux 9.3 * Nagios (in SuSE v 1.2 ) was project of month on SourceForge.net.
  • 4. 4 The short history of Monitoring in SUSE, Part 2 • Monday, June 18 2007 SUSE Linux Enterprise Server 10 SP1 * Nagios version 2.6 was with us until 2013 • Tuesday, March 24, 2009 SUSE Linux Enterprise Server 11 * Nagios version 3.0.6 as monitoring tool is stronger then never... • Wednesday , April 6, 2009 Icinga forked Nagios Icinga will be part of the next SUSE Manager release => Migrating from Nagios to Icinga 1.x is easy
  • 5. 5 What are you using at SUSE? • SUSE Linux Enterprise Server with High Availability Extension • Additional packages from obs://server:monitoring, obs://devel:languages:perl and obs://network:telephony • Internal packages (for no-src/legal-problematic packages – mostly license problems…) We release all “internal tools” in obs://server:monitoring as soon as possible and if there are no legal reasons (which only affects a handful of packages/scripts).
  • 6. 6 Where are the Nagios Plugins? • https://bugzilla.suse.com/show_bug.cgi?id=859105 and especially https://bugzilla.redhat.com/show_bug.cgi?id=1054340 have all details • 2014-07-15: nagios-plugins* got renamed to monitoring-plugins*. Since that day, we have ‒ unmaintained nagios-core-plugins* and ‒ maintained monitoring-plugins* packages in server:monitoring
  • 8. 8 SUSE R&D “specials” • Crazy customers (developers) • “No need for monitoring or statistics” – until something breaks or gets relevant for business ??? • Multiple dual-stacked networks (IPv4 + IPv6) separated via Firewalls • Production vs. Development => high amount of moving targets • Multiple hardware vendors, even NDA hardware without any further details or manuals • Luckily mostly unique Operating Systems :-) but many different services
  • 9. 9 The Past Services 43 2000 15 65 12 120 70 2185 Maximal Location Core System Addons Hosts Nuremberg Nagios Prague Nagios Provo Nagios Summary Latency in Nuremberg Average Host Check Latency ~7 seconds ~1.5 seconds Service Check Latency ~8 seconds ~1 second
  • 10. 10 Current Situation Services 430 4700 96 1150 140 1700 30 170 696 7720 Maximal Location Core System Hosts Nuremberg 1 Icinga Nuremberg 2 Nagios Prague Icinga Provo Nagios Summary Latency in Nuremberg Average Host Check Latency ~5 seconds ~1.5 seconds Service Check Latency ~3 seconds ~1 second Nuremberg Prague Provo 0 2000 4000 6000 8000 Services monitored Past Current
  • 12. 12 Why you always should define dependencies
  • 13. 13 What should be monitored? Administrator View Business View Hardware health Service health Service availability – host based Service availability – business based Overview about the services and incidents of single hosts Overview about the final business impact, not the service components Only important for Administrators Important for Managers and Customers
  • 14. 14 What can be checked? Nearly everything is possible! Minimal requirements listed below: Your script returns one of the following Exit-Codes: 3 : UnknownUnknown – something outside the normal control range (of your script?) happened 2 : Something criticalcritical happend! Help needed! 1 : well, it works currently – but be warnedwarned 0 : everything okok Some (human readable) output on STDOUT would be nice, but is not necessary for Nagios or Icinga itself. Print performance data on STDOUT, separated from normal output via '|' https://nagios-plugins.org/doc/guidelines.html.
  • 16. 16 Eventhandlers If a service or host is in a defined, unwanted state, trigger external scripts to “solve” the problem automatically. (Restart apache if it crashes, send SMS if nobody acknowledges a problem, shutdown all OBS workers if Lars is not available, …)
  • 17. 17 Monitoring SANBoxes with MRTG For Qlogic, run the following command on your MRTG machine: /usr/bin/cfgmaker --global "WorkDir: /srv/www/htdocs/mrtg" --global "Options[_]: growright, bits, unknaszero" --ifdesc=alias,name --ifref=name --noreversedns --no-down --show-op-down --subdirs=sanbox-1 –output=/etc/mrtg/sanbox- 1.conf --snmp-options=:::::2 192.168.0.1 ...or for Cisco MD: /usr/bin/cfgmaker --global "WorkDir: /srv/www/htdocs/mrtg" --global "Options[_]: growright, bits, unknaszero" --ifdesc=alias --noreversedns --no-down --show-op-down – subdirs=sanbox-2 –output=sanbox-2.conf --snmp-options=:::::2 192.168.0.2
  • 18. 18 Monitoring IO on your machines On the machine your want to monitor: • Install monitoring-plugins-sar-perf • Prepare a command like (NRPE example): command[check_iostat_home]=/usr/lib/nagios/plugins/check_iost at -d root-fs_home -w 120000,120000,120000 -c 150000,150000,150000 -W 30 -C 50 • Maybe also enable sysstat (chkconfig boot.sysstat on), to have the data available on the host directly
  • 19. 19 MRTG graphs for network interfaces of virtual machines On the Server running the virtual machines, edit /etc/snmp/snmpd.conf : [...] rocommunity public 10.0.0.0/16 [...] On your MRTG machine, run: /usr/bin/cfgmaker --global "WorkDir: /srv/www/htdocs/mrtg" --global "Options[_]: growright, bits, unknaszero" --ifdesc=alias,name --ifref=name --noreversedns --no-down --show-op-down --subdirs=vmserv1 --output=vmserv1.conf --snmp- options=:::::2 10.0.0.101 ...and edit the xml definition of your virtual machine: <interface type='bridge'> [...] <target dev='vm1'/> [...] </interface> Now (re-)start snmpd and your virtual machine.
  • 20. 20 Monitoring of MySQL servers We are currently using two different checks: • check_mysql (monitoring-plugins-mysql package) • check_mysql_health (monitoring-plugins-mysql_health package) You need a database user with "SELECT" access for both options. Usually, this means that you create a user named "nagios" in MySQL: mysql> GRANT SELECT on nagios.* TO 'nagios'@'localhost' IDENTIFIED BY 'nag1os'; mysql> flush privileges; mysql> quit Afterward you should be able to check the database via: /usr/lib/nagios/plugins/check_mysql -H $HOST -u §USER -p $PASS or: /usr/lib/nagios/plugins/check_mysql_health --units MB -–mode threads-connected --username $USER --password $PASS --warning 40 --critical 50
  • 21. 21 Monitoring of PostgreSQL • check the file pg_hba.conf on the database server to contain the correct IP addresses of the monitoring cluster • create the monitor user via the createuser command as user postgres: postgres@pg1:~> createuser --pwprompt --interactive monitor Enter password for new role: Enter it again: Shall the new role be a superuser? (y/n) y Shall the new role be allowed to create databases? (y/n) n Shall the new role be allowed to create more new roles? (y/n) n • Note: the SUPERUSER privilege is needed for some special checks like "archive_ready". • restart the database • Try on the monitoring cluster: ~> ./check_postgres.pl --dbpass=$PASSWORD –dbuser=$USERNAME --action=archive_ready -H pg1 POSTGRES_ARCHIVE_READY OK: DB "postgres" (host:pg1) WAL ".ready" files found: 0 | time=0.02s files=0;10;15
  • 22. 22 ...and there is more... • More and more monitoring-plugins* packages come with enabled Apparmor profiles: check /var/log/audit/audit.log if something seems to be crazy • Re-enable notifications automatically via cron – to not forget it: #!/bin/bash CFG=/etc/icinga/icinga.cfg commandfile=$(grep ^command_file "$CFG" | awk -F'=' '{ print $2 }') if [ -p "$commandfile" ]; then now=`date +%s` printf "[%lu] ENABLE_NOTIFICATIONSn" $now > "$commandfile" fi • Monitor your NSCA daemon via monitoring-plugins-nsca and a dummy test (see README) • Create performance data for your monitoring: #!/bin/bash if /etc/init.d/icinga status >/dev/null 2>/dev/null ; then if [ -p /var/run/icinga/icinga.cmd ]; then su – icinga -c "/usr/lib/nagios/plugins/check_nagiostats --EXEC /usr/sbin/icingastats --passive $HOST icingastats >> /var/run/icinga/icinga.cmd" fi fi • Monitor your monitoring setup!
  • 23. High available, load balanced monitoring
  • 24. 24 Basic overview • Corosync Pacemaker Cluster (two main machines + one VM just for Quorum) – using IPMI for STONITH • DRBD to provide storage (PNP, Logs) on both main machines • Services like MySQL (cluster), snmptrapd or NSCA run “unmanaged” on all nodes • mod_gearman for Load-Balancing/Failover of normal checks • check_mk for automatic checks and Load- Reducing • MRTG for statistics from Network and SAN (for historical reasons)
  • 25. 25 Load-Balanced / HA Monitoring in project pictures Livestatus snmptt snmptt
  • 28. Thank you. Join the conversation, contribute & have a lot of fun! www.opensuse.org
  • 29. 29 Have a Lot of Fun, and Join Us At: www.opensuse.org
  • 30. General Disclaimer This document is not to be construed as a promise by any participating organisation to develop, deliver, or market a product. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. openSUSE makes no representations or warranties with respect to the contents of this document, and specifically disclaims any express or implied warranties of merchantability or fitness for any particular purpose. The development, release, and timing of features or functionality described for openSUSE products remains at the sole discretion of openSUSE. Further, openSUSE reserves the right to revise this document and to make changes to its content, at any time, without obligation to notify any person or entity of such revisions or changes. All openSUSE marks referenced in this presentation are trademarks or registered trademarks of SUSE LLC, in the United States and other countries. All third-party trademarks are the property of their respective owners. License This slide deck is licensed under the Creative Commons Attribution-ShareAlike 4.0 International license. It can be shared and adapted for any purpose (even commercially) as long as Attribution is given and any derivative work is distributed under the same license. Details can be found at https://creativecommons.org/licenses/by-sa/4.0/ Credits Template Richard Brown rbrown@opensuse.org Design & Inspiration openSUSE Design Team http://opensuse.github.io/branding- guidelines/

Editor's Notes

  1. ITIL draws your focus to the business/service side