Graphing and Trending in Nagios
Matthew Wall
mwall@users.sourceforge.net
September 2011
v0.6


Wednesday, 28 September 2011
Agenda




                                                                                  v0.6 ©2011 Matthew Wall, all rights reserved
                •      What is the problem?

                •      What should a trending system do?

                •      What are the parts?

                •      What options are available?

                •      What issues need to be considered?



Introduction • Problem • Requirements • Components • Options • Issues • Summary

Background

        Nagios Experience
            •   Small Nagios installations with 40-80 hosts and 500-2000 services
            •   Small businesses with 10-20 servers and 20-40 workstations
            •   Continuous build environments with 30+ virtual machines
            •   Power, water, septic, and weather monitoring on an island in Maine
            •   Databases and ticketing system for a pop singer

        Day Job
            •   Design optimization and supply chain optimization

        Context
            •   Budget: low
            •   Costs: time is not free
            •   Training: ok for an expert to set up, not ok for an expert to operate
            •   Hack Factor: rather high


What are the options?

                •      nagiosgraph
                       1.4.4 2011-01-16
                       http://nagiosgraph.sourceforge.net/

                •      nagiosgrapher
                       1.7.1 2008-12-18

                •      n2rrd/rrd2graph
                       1.4.4 2011-08-16
                       http://n2rrd-wiki.diglinks.com/display/n2rrd/Addon

                •      pnp4nagios
                       0.6.15 2011-09-14
                       http://pnp4nagios.sourceforge.net/

                •      cacti
                       0.8.7g 2010-07-09
                       http://www.cacti.net/

                •      mrtg
                       2.17.1 2011-02-18
                       http://oss.oetiker.ch/mrtg/



What is the problem?

                •      Nagios indicates current status

                •      Nagios Core trending consists only of states and notifications

                •      Nagios Core does not provide performance trending





Why is this a problem?

                •      How do you figure out which notifications matter?

                •      How do you know what the thresholds should be?

                •      What is happening between notifications?

                •      What caused the known disasters?

                •      How do you predict the unanticipated disasters?




Show me some examples...

                •      Why do the temperature alarms go off each day?
                       UPS temperature monitoring

                •      How close do we come to exceeding thresholds?
                       Software license use

                •      How can we understand dynamic environments?
                       Cross-platform distributed build/test environment




Temperature Cycles

   [Graph: temperature by day, 19 M through 25 Su]

   This exception tipped us off

Under the Thresholds

 What is happening when we are not being notified?





Changing Thresholds

                               Track the changes to the requirements, not just the changes to the data.



Dynamic Targets

                               What is the source of the traffic spike in this interval?

Dynamic Targets

                    vm15 is active here        vm16 is active here

Trending is not just drawing graphs

                •      Catch problems before they become disasters

                •      Provide context for discovering patterns

                •      Data correlation and comparison





So what should a performance trending system do?

                    Display thresholds as well as performance data





So what should a performance trending system do?

                    Display all services for a specified host





So what should a performance trending system do?

                    Display all hosts that have a specified service





So what should a performance trending system do?

                    Display arbitrary groups of host/service data





So what should a performance trending system do?

                    Provide interactive queries as well as canned reports





So what should a performance trending system do?

                •      Display thresholds as well as performance data
                •      Display all services for a specified host
                •      Display all hosts with a specified service
                •      Display arbitrary groups of host/service data
                •      Provide interactive queries as well as canned reports
                •      Compare data from any host/service with any other host/service
                •      Compare data from any two periods of time
                •      Provide export of data for analysis

                •      Easy to use
                •      Easy on the eyes
                •      Easy to configure


Graphing and Trending in Nagios

                               •   Data Collection



                               •   Data Storage



                               •   Data Display




Data Collection

                •      How to do it in Nagios?
                       • Immediate
                       • Batch
                       • Shared library
                       • External process

                [Diagram: commands.cfg, perfdata.log, map, NG insert.pl, data store (?)]





Data Collection

        map
            # Service type: ping
            #   output:PING OK - Packet loss = 0%, RTA = 0.00 ms
            /output:PING.*?(\d+)%.+?([.\d]+)\sms/
            and push @s, [ 'pingloss',
                           [ 'losspct', GAUGE, $1 ]]
            and push @s, [ 'pingrta',
                           [ 'rta', GAUGE, $2/1000 ]];

        perfdata.log
            1317218378||yarg||mailq||OK: mailq reports queue is empty||unsent=0;5;20;0
            1317218379||http01||ups-temp||OK - Internal Temperature: 36.9 C||temperature=36.9;45;48
            1317218379||power3||ups-temp||OK - Internal Temperature: 42.7 C||temperature=42.7;45;48

        commands.cfg
            process_performance_data=1
            service_perfdata_file=/var/nagios/perfdata.log
            service_perfdata_file_template=$LASTSERVICECHECK$||$HOSTNAME$||$SERVICEDESC$||$SERVICEOUTPUT$||$SERVICEPERFDATA$
            service_perfdata_file_mode=a
            service_perfdata_file_processing_interval=30
            service_perfdata_file_processing_command=process-service-perfdata
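
As a rough illustration of what an insert script has to do with each perfdata.log line, here is a minimal Python sketch that assumes the `||`-separated template shown above. The names `Metric`, `parse_metric`, and `parse_line` are hypothetical and are not part of nagiosgraph. Quoted labels that contain spaces and range-style thresholds are not handled.

```python
from typing import NamedTuple, Optional


class Metric(NamedTuple):
    label: str
    value: float
    unit: str
    warn: Optional[float]
    crit: Optional[float]


def _to_float(text: str) -> Optional[float]:
    """Return a float, or None for empty or range-style thresholds."""
    try:
        return float(text)
    except ValueError:
        return None


def parse_metric(token: str) -> Metric:
    """Parse one 'label=value[unit];warn;crit;min;max' token."""
    label, _, rest = token.partition("=")
    fields = rest.split(";")
    raw = fields[0]
    # Split the trailing unit (for example '%' or 'ms') from the number.
    cut = len(raw)
    while cut > 0 and not (raw[cut - 1].isdigit() or raw[cut - 1] == "."):
        cut -= 1
    warn = _to_float(fields[1]) if len(fields) > 1 else None
    crit = _to_float(fields[2]) if len(fields) > 2 else None
    return Metric(label, float(raw[:cut]), raw[cut:], warn, crit)


def parse_line(line: str):
    """Split one perfdata.log line written with the template above."""
    stamp, host, service, output, perfdata = line.rstrip("\n").split("||")
    metrics = [parse_metric(tok) for tok in perfdata.split()]
    return int(stamp), host, service, output, metrics


line = "1317218379||power3||ups-temp||OK - Internal Temperature: 42.7 C||temperature=42.7;45;48"
stamp, host, service, output, metrics = parse_line(line)
print(host, service, metrics[0])
```

Keeping the warning and critical thresholds next to each value is what later makes it possible to draw thresholds on the same graph as the data.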




Data Collection

                •      Issues
                       • Performance data
                       • Plugin output
                       • Data from plugins or data from Nagios itself
                       • Sampling interval
                       • Sampling precision
                       • Is Nagios the best tool for data collection?
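
When a plugin emits no performance data, the values have to be pulled out of the plugin output text instead. The sketch below mirrors, in Python, the ping rule from the map file shown earlier. `parse_ping` is a hypothetical name, and the pattern is matched against the raw plugin output rather than an `output:`-prefixed string.

```python
import re

# Same idea as the map rule: capture the packet loss percentage and the
# round-trip average (reported in milliseconds) from the plugin output.
PING_RE = re.compile(r"PING.*?(\d+)%.+?([.\d]+)\sms")


def parse_ping(output: str):
    """Return loss in percent and RTA in seconds, or None if no match."""
    match = PING_RE.search(output)
    if match is None:
        return None
    return {
        "losspct": float(match.group(1)),
        "rta": float(match.group(2)) / 1000,  # ms to seconds, as in the map rule
    }


print(parse_ping("PING OK - Packet loss = 0%, RTA = 0.00 ms"))
print(parse_ping("OK - Internal Temperature: 36.9 C"))
```

Parsing output text this way depends on the exact wording a plugin prints, which is one reason to prefer performance data when a plugin provides it.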


Data Storage

                •      How to do it?
                       • Round-Robin Database (rrdtool)
                       • SQL Database (MySQL)
                       • JavaDB

                [Diagram: commands.cfg, perfdata.log, map, NG insert.pl, RRD files]





Data Storage

        rrdtool update
            DS:inOctets:COUNTER:120:0:4294967296
            RRA:AVERAGE:.5:1:43200
            RRA:AVERAGE:.5:5:105120
            RRA:AVERAGE:.5:10:105120

        ls -l /var/nagiosgraph/rrd/*
            /var/nagiosgraph/rrd/www:
            total 72
            -rw-rw-r-- 1 nagios nagios 24120 2011-09-28 10:00 http___http.rrd
            -rw-rw-r-- 1 nagios nagios 24120 2011-09-28 10:00 http___http.rrd_max
            -rw-rw-r-- 1 nagios nagios 24120 2011-09-28 10:00 http___http.rrd_min

        rrdtool dump servicedesc___ds.rrd
            <?xml version="1.0" encoding="utf-8"?>
            <!DOCTYPE rrd SYSTEM "http://oss.oetiker.ch/rrdtool/rrdtool.dtd">
            <!-- Round Robin Database Dump -->
            <rrd>
                <version>0003</version>
                <step>300</step> <!-- Seconds -->
                <lastupdate>1317218410</lastupdate> <!-- 2011-09-28 10:00:10 EDT -->
            ...
            </rrd>
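
The DS and RRA strings above have the form that `rrdtool create` accepts. Each RRA keeps a fixed number of rows, and each row consolidates a fixed number of base steps, so the history an archive covers is step × steps per row × rows. The Python sketch below works through that arithmetic. The 60-second step is an assumption made for illustration, since the slide does not state the step for these definitions, and the function names are hypothetical.

```python
def parse_rra(spec: str):
    """Split 'RRA:CF:xff:steps:rows' into typed fields."""
    tag, cf, xff, steps, rows = spec.split(":")
    if tag != "RRA":
        raise ValueError("not an RRA definition: " + spec)
    return cf, float(xff), int(steps), int(rows)


def retention_days(step_seconds: int, steps_per_row: int, rows: int) -> float:
    """History covered by one archive, in days."""
    return step_seconds * steps_per_row * rows / 86400


STEP = 60  # assumed base step in seconds; not stated on the slide

RRAS = [
    "RRA:AVERAGE:.5:1:43200",
    "RRA:AVERAGE:.5:5:105120",
    "RRA:AVERAGE:.5:10:105120",
]

for spec in RRAS:
    cf, xff, steps, rows = parse_rra(spec)
    print(spec, "->", STEP * steps, "s per row,",
          retention_days(STEP, steps, rows), "days")
```

Under that assumption the three archives cover 30, 365, and 730 days. With the 300-second step that appears in the dump, each figure would be five times larger.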





Data Storage

                •      Issues
                       • Schema definition
                       • Storage space limitations
                       • Storage space pruning
                       • Redundancy
                       • Backups




Data Display

                •      How to do it?
                       • CGI (PERL+rrdtool)
                       • PHP (PHP+PERL+rrdtool)
                       • JavaScript
                       • Google Charts

                [Diagram: commands.cfg, perfdata.log, map, NG insert.pl, RRD files, nagiosgraph.conf, NG show*.cgi, Apache]


Data Display

                •      Issues
                       • Today, yesterday, last week, last month, last year
                       • Single host/service/source
                       • Combinations of hosts/services/sources
                       • Canned reports
                       • Interactive queries
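
The time periods listed above reduce to start and end timestamps handed to the storage layer. The Python sketch below computes such windows and builds an `rrdtool fetch` argument list from them. It assumes trailing windows rather than calendar days, approximates a month as 30 days and a year as 365 days, and uses hypothetical function names; the RRD path is the one from the directory listing shown earlier.

```python
DAY = 86400

# Period lengths in seconds; month and year are approximations (assumed).
PERIODS = {"day": DAY, "week": 7 * DAY, "month": 30 * DAY, "year": 365 * DAY}


def window(period: str, end: int, periods_back: int = 0):
    """Return (start, end) epoch seconds for a trailing window.

    periods_back=1 shifts the window one period earlier, for example
    the 24 hours before the most recent 24 hours.
    """
    length = PERIODS[period]
    window_end = end - periods_back * length
    return window_end - length, window_end


def fetch_args(rrd_path: str, period: str, end: int,
               periods_back: int = 0, cf: str = "AVERAGE"):
    """Build an argument list for 'rrdtool fetch' over one window."""
    start, stop = window(period, end, periods_back)
    return ["rrdtool", "fetch", rrd_path, cf,
            "--start", str(start), "--end", str(stop)]


now = 1317218400  # 2011-09-28 10:00:00 EDT
print(window("day", now))       # most recent 24 hours
print(window("day", now, 1))    # the 24 hours before that
print(fetch_args("/var/nagiosgraph/rrd/www/http___http.rrd", "week", now))
```

Comparing two periods of time then amounts to fetching the same data source over two windows of equal length.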


What are the options?

                •      nagiosgraph
                       1.4.4 2011-01-16
                       http://nagiosgraph.sourceforge.net/

                •      nagiosgrapher
                       1.7.1 2008-12-18

                •      n2rrd/rrd2graph
                       1.4.4 2011-08-16
                       http://n2rrd-wiki.diglinks.com/display/n2rrd/Addon

                •      pnp4nagios
                       0.6.15 2011-09-14
                       http://pnp4nagios.sourceforge.net/

                •      cacti
                       0.8.7g 2010-07-09
                       http://www.cacti.net/

                •      mrtg
                       2.17.1 2011-02-18
                       http://oss.oetiker.ch/mrtg/



cacti

  • Standalone system
  • Data collection and/or display
  • Browsing
  • Querying
  • Zoom





mrtg

  • Standalone system designed for SNMP
  • Data collection and/or display





n2rrd and rrd2graph

  • Data collection (n2rrd)
  • Data display (rrd2graph)
  • Template-based RRA
  • Template-based graphs
  • All services per host
  • Arbitrary grouping
  • Interactive selection of data
  • Zoom (in new context)
  • Export graphs as PDF, PNG, EPS, SVG
  • rrdtool, PERL





pnp4nagios

  • Data collection and display
  • Template-based graphs
  • All services per host
  • Arbitrary grouping
  • Arbitrary time interval
  • Zoom (in new context)
  • Mouseover thumbnail graphs
  • Export data as CSV
  • Export graphs as PDF, PNG
  • rrdtool, C, PHP, PERL, jQuery





nagiosgraph

  • Data collection and display
  • Parameter-based RRA
  • Parameter-based graphing
  • All services per host
  • All hosts per service
  • Arbitrary grouping
  • Arbitrary time interval
  • Zoom (in place)
  • Interactive selection of data
  • Mouseover thumbnail graphs
  • Export data as CSV, XML
  • rrdtool, PERL, JavaScript





Issues

                •      Is Nagios the right tool for collecting performance data?
                •      Which add-on/system should I use?
                •      Performance data versus plugin output
                •      Seeing both the forest and the trees
                •      How much data to collect? How much to save?
                •      Getting the RRA parameters right
                •      Dealing with rigid schemas
                •      What format should the data be saved in? (MySQL, rrdtool)
                •      Automatic provisioning/discovery/configuration
                •      Transient hosts/services
                •      Data freshness



Is Nagios the right tool?

                •      Nagios checks have access to performance data, so why not?

                •      No need to install additional software

                •      Confounding of state and performance data

                •      Does Nagios collect data often enough?

                •      What happens to the data when Nagios cannot collect it?
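
One way to reason about the last two questions is to look for gaps in the sample timestamps. In rrdtool, a data source has a heartbeat, and when two updates arrive further apart than the heartbeat, the value for that interval is treated as unknown. The Python sketch below is a simplified model of that check and not rrdtool's actual consolidation logic. `find_gaps` is a hypothetical helper, and the 120-second heartbeat is taken from the DS definition shown earlier.

```python
def find_gaps(timestamps, heartbeat):
    """Return (previous, current) pairs spaced more than heartbeat seconds apart."""
    return [
        (earlier, later)
        for earlier, later in zip(timestamps, timestamps[1:])
        if later - earlier > heartbeat
    ]


# Samples arrive every 60 seconds, except for one missed stretch.
samples = [0, 60, 120, 400, 460]
for earlier, later in find_gaps(samples, heartbeat=120):
    print("no usable data between", earlier, "and", later)
```

If checks run less often than the heartbeat allows, or Nagios stops collecting for a while, the stored series will contain unknown intervals rather than interpolated values.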




Which system(s) should I use?

      Collection: nagiosgraph   Collection: pnp4nagios                      Collection: cacti   Collection: Nagios
      Storage: rrdtool          Storage: rrdtool                            Storage: rrdtool    Storage: rrdtool
      Glue: nagiosgraph         Glue: pnp4nagios                            Glue: cacti         Glue: n2rrd
      Display: nagiosgraph      Display: pnp4nagios                         Display: cacti      Display: cacti


Which add-on(s) should I use?

                                       n2rrd            rrd2graph                pnp4nagios                 nagiosgraph           cacti


                                                                                                                                           e !     mrtg




                                                                                                                            m
                configuration         templates          templates                 templates                 parameters          templates        templates

                dependencies       rrdtool, PERL      rrdtool, PERL      rrdtool, PERL, PHP, jQuery        rrdtool, PERL


                                                                                                                         h o        ?                ?




                                                                                                         t
                   storage            rrdtool                                      rrdtool                    rrdtool            rrdtool          rrdtool

                  collection      immediate, batch


                                                                                                      s
                                                                       immediate, batch, shared library
                                                                                                        a immediate, batch       SNMP             SNMP

                    display                                cgi                    php + cgi



                                                                                           t h i                cgi                cgi             html




                                                                             y
                   zooming                           separate window          separate window                 in-place       separate window

             graph mouseovers


                                                                         t r         yes                        yes




                                                      e
          coordinate mouseovers                                                                                 yes

              arbitrary groups

                                                    as                               yes                        yes

                    search

                   browse
                                                Ple        yes

                                                                                     yes                        yes
                                                                                                                                   yes

                                                                                                                                   yes


Performance Data

              name=value[units];[warn];[crit];[min];[max]

           where units is one of:
                         (none)      unitless
                        s,us,ms      time
                              %      percentage
               B,KB,MB,GB,TB,PB      bytes
                              c      counter

                                                               Beware of the bug in Nagios 3.3.1!
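
A minimal sketch of how a glue layer might split a performance data string into its fields, assuming unquoted labels and the semicolon-separated layout shown above. The names `PerfItem` and `parse_perfdata` are illustrative only; they do not come from Nagios or from any of the add-ons discussed here.

```python
import re
from typing import List, NamedTuple, Optional

# One item: label=value[units];[warn];[crit];[min];[max]
# Assumption: labels are unquoted and contain no spaces.
PERF_ITEM = re.compile(
    r"(?P<label>[^=\s']+)="
    r"(?P<value>-?\d+(?:\.\d+)?)"
    r"(?P<units>[a-zA-Z%]*)"
    r"(?:;(?P<warn>[^;\s]*))?"
    r"(?:;(?P<crit>[^;\s]*))?"
    r"(?:;(?P<min>[^;\s]*))?"
    r"(?:;(?P<max>[^;\s]*))?"
)


class PerfItem(NamedTuple):
    label: str
    value: float
    units: str
    warn: Optional[str]      # thresholds stay as strings; they may be ranges
    crit: Optional[str]
    minimum: Optional[str]
    maximum: Optional[str]


def parse_perfdata(text: str) -> List[PerfItem]:
    """Return one PerfItem per label=value pair found in `text`."""
    items = []
    for m in PERF_ITEM.finditer(text):
        items.append(PerfItem(
            label=m.group("label"),
            value=float(m.group("value")),
            units=m.group("units"),
            warn=m.group("warn") or None,      # empty or missing field -> None
            crit=m.group("crit") or None,
            minimum=m.group("min") or None,
            maximum=m.group("max") or None,
        ))
    return items


if __name__ == "__main__":
    for item in parse_perfdata("time=0.012s;1.000;2.000;0 size=1024B;;;0"):
        print(item)
```

Keeping the thresholds alongside the value means they can be stored and graphed with the data rather than looked up again later.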


How to see the forest and the trees?

                •      You never know what you’ll need until long after you can save it

                •      With rrdtool, the further back you go, the more you lose




                               Archaeology
                                                                                  Zooming
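
To illustrate why older data loses detail, here is a small sketch of averaging-style consolidation. It is not rrdtool code; `consolidate` and the sample values are made up for the example. It only mimics the idea of folding several primary data points into one consolidated point.

```python
from statistics import mean


def consolidate(samples, pdps_per_cdp, how=mean):
    """Fold each full run of `pdps_per_cdp` samples into one value using `how`."""
    return [
        how(samples[i:i + pdps_per_cdp])
        for i in range(0, len(samples) - pdps_per_cdp + 1, pdps_per_cdp)
    ]


# Hypothetical one-minute readings containing a single short spike.
samples = [10, 10, 10, 10, 100, 10, 10, 10, 10, 10]

averaged = consolidate(samples, 5)          # the spike of 100 is averaged down to 28
peaks = consolidate(samples, 5, how=max)    # a max-style archive keeps the 100

if __name__ == "__main__":
    print("raw peak:     ", max(samples))
    print("averaged peak:", max(averaged))
    print("max archive:  ", max(peaks))
```

With averaging alone, the spike that matters for archaeology is gone from the coarse archive; keeping a max-style archive next to the average is one way to retain it.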

How much to collect and save?

                •      Collect the source data, not the derivative data

                •      Collect everything - you can stop collecting later

                •      Collect often - let profiling dictate when to collect less often

                •      Save everything - you can throw it away later

                •      Using RRD ensures that your system scales by host/service, not time
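
As a sketch of "source data, not derivative data": if the raw counter readings are saved, a rate can always be derived afterwards, while the reverse is not possible. The function name `rates` and the default wrap value are assumptions for illustration, not part of any tool mentioned here.

```python
def rates(readings, wrap=2**32):
    """Convert (timestamp, counter) readings into (timestamp, per-second rate).

    Assumes timestamps are in seconds and strictly increasing, and that the
    counter wraps back to zero at `wrap` (a 32-bit counter by default).
    """
    out = []
    for (t0, c0), (t1, c1) in zip(readings, readings[1:]):
        delta = c1 - c0
        if delta < 0:          # counter wrapped between the two readings
            delta += wrap
        out.append((t1, delta / (t1 - t0)))
    return out


if __name__ == "__main__":
    # Hypothetical readings taken one minute apart.
    raw = [(0, 1000), (60, 7000), (120, 13000)]
    print(rates(raw))
```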



Getting the RRA parameters right

              DS:NAME:TYPE:HEARTBEAT:MIN:MAX
              RRA:CONSOLIDATION_METHOD:XFF:PDPs:CDPs



              DS:inOctets:COUNTER:120:0:4294967296
              RRA:AVERAGE:.5:1:43200
              RRA:AVERAGE:.5:5:105120
              RRA:AVERAGE:.5:10:105120




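             XFF: xfiles factor (fraction of a consolidation interval that may be unknown data)
             PDP: primary data point (PDPs = primary data points per consolidated data point)
             CDP: consolidated data point (CDPs = consolidated data points kept in the archive)


                                                                   Building a Monitoring Infrastructure with Nagios, David Josephsen, 2007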
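
A quick way to check RRA parameters is to work out how much time each archive covers. The sketch below does that for the three RRA lines above. The 60-second step is an assumption (the slide does not state it; a heartbeat of 120 is consistent with it), and the helper names are illustrative.

```python
STEP_SECONDS = 60  # assumption: step is not given on the slide

RRAS = [
    "RRA:AVERAGE:.5:1:43200",
    "RRA:AVERAGE:.5:5:105120",
    "RRA:AVERAGE:.5:10:105120",
]


def parse_rra(spec):
    """Split an RRA definition into (method, xff, pdps_per_cdp, cdps)."""
    kind, method, xff, pdps, cdps = spec.split(":")
    if kind != "RRA":
        raise ValueError("not an RRA definition: " + spec)
    return method, float(xff), int(pdps), int(cdps)


def retention_days(spec, step=STEP_SECONDS):
    """Days of history covered by one archive: step * PDPs * CDPs."""
    _, _, pdps, cdps = parse_rra(spec)
    return step * pdps * cdps / 86400


def total_rows(specs):
    """Number of consolidated data points stored across all archives."""
    return sum(parse_rra(s)[3] for s in specs)


if __name__ == "__main__":
    for spec in RRAS:
        _, _, pdps, _ = parse_rra(spec)
        print(spec, "->", retention_days(spec), "days at",
              pdps * STEP_SECONDS, "second resolution")
    print("rows per data source:", total_rows(RRAS))
```

Under the 60-second assumption the three archives cover 30, 365, and 730 days at 1, 5, and 10 minute resolution. The row count is fixed when the file is created, which is why storage grows with the number of hosts and services rather than with time.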




Rigid Schemas

                •      Put one data source in each RRD file, plus associated thresholds

                •      Use consistent service names

                •      Use service description based on plugin, not platform

                •      Keep the specifics of the schema in the glue layer

                •      Schemas are not just an issue with rrdtool
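
One way to read the first and fourth points together is sketched below: the glue layer alone decides where each data source lives, one file per source. The directory layout and the function name `rrd_path` are hypothetical and are not the naming scheme of nagiosgraph or any other add-on.

```python
import re
from pathlib import PurePosixPath


def _safe(name):
    """Replace anything outside letters, digits, '_' and '-' with '_'."""
    return re.sub(r"[^A-Za-z0-9_-]", "_", name)


def rrd_path(root, host, service, source):
    """Map (host, service, data source) to one RRD file.

    Hypothetical layout: <root>/<host>/<service>_<source>.rrd
    """
    filename = "{}_{}.rrd".format(_safe(service), _safe(source))
    return PurePosixPath(root) / _safe(host) / filename


if __name__ == "__main__":
    print(rrd_path("/var/rrd", "web01", "HTTP Check", "time"))
```

Because only this function knows the layout, changing the schema later means changing one place rather than every plugin and graph definition.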



So where are we?

                               There are a few free tools, and a few more not-so-free tools

                                              All of the existing tools suck...

                                   but at least one of them is probably good enough...

                                         and many of them continue to progress.




nagiosgraph: then and now

                               2009                                          2011

nagiosgraph: history and status

             •      first release was in 2004, by Soren Dossing
             •      release 0.1 (2004-08-04) was 16KB (compressed)
             •      release 1.4.4 (2011-01-16) was 158KB (compressed)
             •      18 project members, 2 current (Alan Brenner, Matthew Wall)
             •      typically 70-100 downloads per day (20 on weekends)
             •      packages for deb and rpm added Jan 2011
             •      1259 unit tests providing 78.5% code coverage
             •      155KB Perl code, 44KB JavaScript/CSS, 276KB unit test code




nagiosgraph: What next?

                •      Arbitrary combinations of data sources
                •      Interactive manipulation of data sources
                •      Management of stale data
                •      Export of data
                •      Template-based RRAs and graphs
                •      Better multi-byte character support
                •      More unit tests and code coverage




The tr/end//

     nagiosgraph Screenshot
     Circonus Dashboard Prototype
     Cacti Screenshot

References

                •      http://lancet.mit.edu/mwall/projects/nagios


                •      http://nagiosgraph.sourceforge.net


                •      http://www.scribd.com/doc/58991647
                       Building a Monitoring Infrastructure with Nagios
                       David Josephsen, 2007


                •      https://labs.omniti.com/labs/reconnoiter


 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 

Recently uploaded (20)

Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 

Nagios Conference 2011 - Matt Wall - Performance Graphing and Trending In Nagios

  • 1. Graphing and Trending in Nagios Matthew Wall mwall@users.sourceforge.net September 2011 v0.6 Wednesday, 28 September 2011
  • 2. Agenda v0.6 ©2011 Matthew Wall, all rights reserved • What is the problem? • What should a trending system do? • What are the parts? • What options are available? • What issues need to be considered? Introduction • Problem • Requirements • Components • Options • Issues • Summary Wednesday, 28 September 2011
  • 3. Background
      Nagios experience:
      • Small Nagios installations with 40-80 hosts and 500-2000 services
      • Small businesses with 10-20 servers and 20-40 workstations
      • Continuous build environments with 30+ virtual machines
      • Power, water, septic, and weather monitoring on an island in Maine
      • Databases and ticketing system for pop singer
      Day job:
      • Design optimization and supply chain optimization
      Context:
      • Budget: low
      • Costs: time is not free
      • Training: ok for expert to set up, not ok for expert to operate
      • Hack factor: rather high
  • 4. What are the options?
      • nagiosgraph 1.4.4 2011-01-16 http://nagiosgraph.sourceforge.net/
      • nagiosgrapher 1.7.1 2008-12-18
      • n2rrd/rrd2graph 1.4.4 2011-08-16 http://n2rrd-wiki.diglinks.com/display/n2rrd/Addon
      • pnp4nagios 0.6.15 2011-09-14 http://pnp4nagios.sourceforge.net/
      • cacti 0.8.7g 2010-07-09 http://www.cacti.net/
      • mrtg 2.17.1 2011-02-18 http://oss.oetiker.ch/mrtg/
  • 5-8. What is the problem?
      • Nagios indicates current status
      • Nagios Core trending consists only of states and notifications
      • Nagios Core does not provide performance trending
  • 9. Why is this a problem?
      • How do you figure out which notifications matter?
      • How do you know what the thresholds should be?
      • What is happening between notifications?
      • What caused the known disasters?
      • How to predict the unanticipated disasters?
  • 10. Show me some examples...
      • Why do the temperature alarms go off each day? (UPS temperature monitoring)
      • How close do we come to exceeding thresholds? (Software license use)
      • How can we understand dynamic environments? (Cross-platform distributed build/test environment)
  • 11. Temperature Cycles
      [Graph: one week of data, Mon 19 through Sun 25, annotated "This exception tipped us off"]
  • 12. Under the Thresholds
      What is happening when we are not being notified?
  • 13. Changing Thresholds
      Track the changes to the requirements, not just the changes to the data.
  • 14-17. Dynamic Targets
      • What is the source of the traffic spike in this interval?
      • [Graph annotations: "vm15 is active here", "vm16 is active here"]
  • 18. Trending is not just drawing graphs
      • Catch problems before they become disasters
      • Provide context for discovering patterns
      • Data correlation and comparison
  • 19. So what should a performance trending system do?
      Display thresholds as well as performance data
  • 20. So what should a performance trending system do?
      Display all services for a specified host
  • 21. So what should a performance trending system do?
      Display all hosts that have a specified service
  • 22. So what should a performance trending system do?
      Display arbitrary groups of host/service data
  • 23. So what should a performance trending system do?
      Provide interactive queries as well as canned reports
  • 24. So what should a performance trending system do?
      • Display thresholds as well as performance data
      • Display all services for a specified host
      • Display all hosts with a specified service
      • Display arbitrary groups of host/service data
      • Provide interactive queries as well as canned reports
      • Compare data from any host/service with any other host/service
      • Compare data from any two periods of time
      • Provide export of data for analysis
      • Easy to use
      • Easy on the eyes
      • Easy to configure
  • 25. Graphing and Trending in Nagios
      • Data Collection
      • Data Storage
      • Data Display
  • 26. Data Collection
      • How to do it in Nagios?
          • Immediate
          • Batch
          • Shared library
          • External process
      [Diagram components: commands.cfg, perfdata.log, insert.pl with map (NG), data store (?)]
  • 27. Data Collection
      commands.cfg:
          process_performance_data=1
          service_perfdata_file=/var/nagios/perfdata.log
          service_perfdata_file_template=$LASTSERVICECHECK$||$HOSTNAME$||$SERVICEDESC$||$SERVICEOUTPUT$||$SERVICEPERFDATA$
          service_perfdata_file_mode=a
          service_perfdata_file_processing_interval=30
          service_perfdata_file_processing_command=process-service-perfdata
      perfdata.log:
          1317218378||yarg||mailq||OK: mailq reports queue is empty||unsent=0;5;20;0
          1317218379||http01||ups-temp||OK - Internal Temperature: 36.9 C||temperature=36.9;45;48
          1317218379||power3||ups-temp||OK - Internal Temperature: 42.7 C||temperature=42.7;45;48
      map:
          # Service type: ping
          # output:PING OK - Packet loss = 0%, RTA = 0.00 ms
          /output:PING.*?(\d+)%.+?([.\d]+)\sms/
          and push @s, [ 'pingloss', [ 'losspct', GAUGE, $1 ]]
          and push @s, [ 'pingrta', [ 'rta', GAUGE, $2/1000 ]];
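The batch path on this slide (Nagios appends records to perfdata.log, and a map rule turns plugin output into named values) can be sketched in Python, although the tooling shown on the slide is Perl. This is a minimal sketch that assumes the `||`-delimited template shown above and that plugin output never contains `||`. The names `parse_perfdata_line` and `ping_map` are hypothetical and are not part of nagiosgraph.

```python
import re

# Field order follows the service_perfdata_file_template on the slide.
FIELDS = ("timestamp", "host", "service", "output", "perfdata")

def parse_perfdata_line(line):
    """Split one perfdata.log record written with the '||' template."""
    parts = line.rstrip("\n").split("||")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    record = dict(zip(FIELDS, parts))
    record["timestamp"] = int(record["timestamp"])
    return record

# Python rendering of the regex in the ping map rule.
PING_RE = re.compile(r"PING.*?(\d+)%.+?([.\d]+)\sms")

def ping_map(output):
    """Return (name, value) pairs for ping plugin output, or [] if no match."""
    m = PING_RE.search(output)
    if not m:
        return []
    # RTA is reported in ms and stored in seconds, as in the rule ($2/1000).
    return [("losspct", float(m.group(1))),
            ("rta", float(m.group(2)) / 1000)]
```

A rule that returns an empty list for non-matching output mirrors how a map file falls through to the next rule when its pattern does not apply.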
  • 28. Data Collection
      • How to do it in Nagios?
          • Immediate
          • Batch
          • Shared library
          • External process
      • Issues
          • Performance data
          • Plugin output
          • Data from plugins or data from Nagios itself
          • Sampling interval
          • Sampling precision
          • Is Nagios the best tool for data collection?
  • 29. Data Storage
      • How to do it?
          • Round-Robin Database (rrdtool)
          • SQL Database (MySQL)
          • JavaDB
      [Diagram components: commands.cfg, perfdata.log, insert.pl with map (NG), RRD files]
  • 30. Data Storage
      rrdtool update:
          DS:inOctets:COUNTER:120:0:4294967296
          RRA:AVERAGE:.5:1:43200
          RRA:AVERAGE:.5:5:105120
          RRA:AVERAGE:.5:10:105120
      ls -l /var/nagiosgraph/rrd/*:
          /var/nagiosgraph/rrd/www:
          total 72
          -rw-rw-r-- 1 nagios nagios 24120 2011-09-28 10:00 http___http.rrd
          -rw-rw-r-- 1 nagios nagios 24120 2011-09-28 10:00 http___http.rrd_max
          -rw-rw-r-- 1 nagios nagios 24120 2011-09-28 10:00 http___http.rrd_min
      rrdtool dump servicedesc___ds.rrd:
          <?xml version="1.0" encoding="utf-8"?>
          <!DOCTYPE rrd SYSTEM "http://oss.oetiker.ch/rrdtool/rrdtool.dtd">
          <!-- Round Robin Database Dump -->
          <rrd>
            <version>0003</version>
            <step>300</step> <!-- Seconds -->
            <lastupdate>1317218410</lastupdate> <!-- 2011-09-28 10:00:10 EDT -->
            ...
          </rrd>
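The `COUNTER` type in the `inOctets` data source means rrdtool stores a per-second rate derived from successive counter readings, not the raw counter. A simplified Python sketch of that idea follows. It assumes a single wraparound at a fixed counter width and ignores the heartbeat and min/max checks that rrdtool also applies. The function name `counter_rate` is hypothetical.

```python
def counter_rate(prev_value, prev_time, value, time, bits=32):
    """Per-second rate between two counter samples, allowing one wraparound."""
    if time <= prev_time:
        raise ValueError("samples must be in increasing time order")
    delta = value - prev_value
    if delta < 0:
        # The counter rolled over its maximum; add the counter range back.
        delta += 2 ** bits
    return delta / (time - prev_time)
```

For example, readings of 1000 and then 7000 octets taken 60 seconds apart give a rate of 100 octets per second.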
  • 31. Data Storage
      • How to do it?
          • Round-Robin Database (rrdtool)
          • SQL Database (MySQL)
          • JavaDB
      • Issues
          • Schema definition
          • Storage space limitations
          • Storage space pruning
          • Redundancy
          • Backups
  • 32. Data Display
      • How to do it?
          • CGI (PERL+rrdtool)
          • PHP (PHP+PERL+rrdtool)
          • JavaScript
          • Google Charts
      [Diagram components: commands.cfg, perfdata.log, insert.pl with map (NG), RRD files, show*.cgi with nagiosgraph.conf (NG), Apache]
  • 33. Data Display
      [Diagram components: commands.cfg, perfdata.log, insert.pl with map (NG), RRD files, show*.cgi with nagiosgraph.conf (NG), Apache]
  • 34. Data Display
      • How to do it?
          • CGI (PERL+rrdtool)
          • PHP (PHP+PERL+rrdtool)
          • JavaScript
          • Google Charts
      • Issues
          • Today, yesterday, last week, last month, last year
          • Single host/service/source
          • Combinations of hosts/services/sources
          • Canned reports
          • Interactive queries
  • 35. What are the options?
      • nagiosgraph 1.4.4 2011-01-16 http://nagiosgraph.sourceforge.net/
      • nagiosgrapher 1.7.1 2008-12-18
      • n2rrd/rrd2graph 1.4.4 2011-08-16 http://n2rrd-wiki.diglinks.com/display/n2rrd/Addon
      • pnp4nagios 0.6.15 2011-09-14 http://pnp4nagios.sourceforge.net/
      • cacti 0.8.7g 2010-07-09 http://www.cacti.net/
      • mrtg 2.17.1 2011-02-18 http://oss.oetiker.ch/mrtg/
  • 36. cacti
      • Standalone system
      • Data collection and/or display
      • Browsing
      • Querying
      • Zoom
  • 37. mrtg
      • Standalone system designed for SNMP
      • Data collection and/or display
  • 38. n2rrd and rrd2graph
      • Data collection (n2rrd)
      • Data display (rrd2graph)
      • Template-based RRA
      • Template-based graphs
      • All services per host
      • Arbitrary grouping
      • Interactive selection of data
      • Zoom (in new context)
      • Export graphs as PDF, PNG, EPS, SVG
      • rrdtool, PERL
  • 39. pnp4nagios
      • Data collection and display
      • Template-based graphs
      • All services per host
      • Arbitrary grouping
      • Arbitrary time interval
      • Zoom (in new context)
      • Mouseover thumbnail graphs
      • Export data as CSV
      • Export graphs as PDF, PNG
      • rrdtool, C, PHP, PERL, jQuery
  • 40. nagiosgraph
      • Data collection and display
      • Parameter-based RRA
      • Parameter-based graphing
      • All services per host
      • All hosts per service
      • Arbitrary grouping
      • Arbitrary time interval
      • Zoom (in place)
      • Interactive selection of data
      • Mouseover thumbnail graphs
      • Export data as CSV, XML
      • rrdtool, PERL, JavaScript
  • 41. Issues
      • Is Nagios the right tool for collecting performance data?
      • Which add-on/system should I use?
      • Performance data versus plugin output
      • Seeing both the forest and the trees
      • How much data to collect? How much to save?
      • Getting the RRA parameters right
      • Dealing with rigid schemas
      • What format to save the data? (mysql, rrdtool)
      • Automatic provisioning/discovery/configuration
      • Transient hosts/services
      • Data freshness
  • 42. Is Nagios the right tool?
      • Nagios checks have access to performance data, so why not?
      • No need to install additional software
      • Confounding of state and performance data
      • Does Nagios collect data often enough?
      • What happens to the data when Nagios cannot collect it?
  • 43. Which system(s) should I use?
      Collection     Storage    Glue           Display
      nagiosgraph    rrdtool    nagiosgraph    nagiosgraph
      pnp4nagios     rrdtool    pnp4nagios     pnp4nagios
      cacti          rrdtool    cacti          cacti
      Nagios         rrdtool    n2rrd          cacti
  • 44. Which add-on(s) should I use?
      Comparison table with columns n2rrd, rrd2graph, pnp4nagios, nagiosgraph, cacti, mrtg. Each row lists its non-empty cells in column order:
      • configuration: templates, templates, templates, parameters, templates, templates
      • dependencies: rrdtool, PERL | rrdtool, PERL | rrdtool, PERL, PHP, jQuery | rrdtool, PERL | ? | ?
      • storage: rrdtool, rrdtool, rrdtool, rrdtool, rrdtool
      • collection: immediate, batch | immediate, batch, shared library | immediate, batch | SNMP | SNMP
      • display: cgi, php + cgi, cgi, cgi, html
      • zooming: separate window, separate window, in-place, separate window
      • graph mouseovers: yes, yes
      • coordinate mouseovers: yes
      • arbitrary groups: yes, yes
      • search, browse: yes, yes, yes, yes, yes
  • 45. Performance Data
      name = value[units];[warn];[crit];[min];[max]
      where units is one of:
          (none)                     unitless
          s, us, ms                  time
          %                          percentage
          B, KB, MB, GB, TB, PB      bytes
          c                          counter
      Beware of the bug in Nagios 3.3.1!
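The performance data format on this slide is regular enough to parse mechanically. Below is a minimal Python sketch for a single `name=value[units];[warn];[crit];[min];[max]` token. It assumes a plain decimal value, keeps thresholds as raw strings (plugins may emit range syntax there), and does not handle labels that contain `=`. The names `Metric` and `parse_metric` are hypothetical.

```python
import re
from typing import NamedTuple, Optional

class Metric(NamedTuple):
    name: str
    value: float
    units: str
    warn: Optional[str]
    crit: Optional[str]
    minimum: Optional[str]
    maximum: Optional[str]

# Leading number, then whatever follows is treated as the units suffix.
_VALUE_RE = re.compile(r"^(-?\d+(?:\.\d+)?)(.*)$")

def parse_metric(token):
    """Parse one perfdata token such as 'temperature=36.9;45;48'."""
    name, _, rest = token.rpartition("=")
    fields = rest.split(";")
    m = _VALUE_RE.match(fields[0])
    if not name or not m:
        raise ValueError(f"not a perfdata token: {token!r}")
    # Empty optional fields become None; missing trailing fields are padded.
    extras = [f or None for f in fields[1:5]]
    extras += [None] * (4 - len(extras))
    return Metric(name.strip("'"), float(m.group(1)), m.group(2), *extras)
```

Applied to the `unsent=0;5;20;0` sample from the perfdata.log slide, this yields a value of 0 with warn `5`, crit `20`, and min `0`.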
  • 46-50. How to see the forest and the trees?
      • You never know what you’ll need until long after you can save it
      • With rrdtool, the further back you go, the more you lose
      [Graph labels: Archaeology, Zooming]
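Why older rrdtool data loses detail can be illustrated with a toy consolidation step: several fine-grained samples are collapsed into one stored value. This Python sketch is an illustration only, not rrdtool's implementation, and the names `mean` and `consolidate` are hypothetical.

```python
def mean(xs):
    return sum(xs) / len(xs)

def consolidate(samples, steps, cf=mean):
    """Collapse each full run of `steps` samples into one value using cf."""
    return [cf(samples[i:i + steps])
            for i in range(0, len(samples) - steps + 1, steps)]

# Ten samples with a single spike to 90.
samples = [10, 10, 10, 10, 90, 10, 10, 10, 10, 10]
averaged = consolidate(samples, 5)          # the spike is smeared into an average
peaks = consolidate(samples, 5, cf=max)     # a MAX series keeps the peak
```

With averaging alone the spike of 90 is stored as 26, which is why keeping a maximum series next to the average (as the `.rrd_max` files in the Data Storage listing suggest) is one way to retain peaks.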
  • 51. How much to collect and save?
      • Collect the source data, not the derivative data
      • Collect everything - you can stop collecting later
      • Collect often - let profiling dictate when to collect less often
      • Save everything - you can throw it away later
      • Using RRD ensures that your system scales by host/service, not time
  • 52. Getting the RRA parameters right
      DS:NAME:TYPE:HEARTBEAT:MIN:MAX
      RRA:CONSOLIDATION_METHOD:XFF:PDPs:CDPs
      Example:
          DS:inOctets:COUNTER:120:0:4294967296
          RRA:AVERAGE:.5:1:43200
          RRA:AVERAGE:.5:5:105120
          RRA:AVERAGE:.5:10:105120
      XFF: x files factor
      PDP: primary data point
      CDP: consolidated data point
      Source: Building a Monitoring Infrastructure with Nagios, David Josephsen, 2007
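The retention implied by each RRA line is simple arithmetic: each stored row covers `step × PDPs` seconds, and the archive holds `CDPs` rows. The Python sketch below works this out for the example definitions. The step is an assumption, since this slide does not state it; 60 seconds is used here because it is consistent with the 120-second heartbeat. The function name `rra_retention` is hypothetical.

```python
def rra_retention(rra, step):
    """Return (seconds_per_row, total_seconds) for an 'RRA:CF:xff:steps:rows' spec."""
    tag, _cf, _xff, steps, rows = rra.split(":")
    if tag != "RRA":
        raise ValueError(f"not an RRA definition: {rra!r}")
    resolution = step * int(steps)
    return resolution, resolution * int(rows)

STEP = 60  # seconds; assumed, not given on the slide
for spec in ("RRA:AVERAGE:.5:1:43200",
             "RRA:AVERAGE:.5:5:105120",
             "RRA:AVERAGE:.5:10:105120"):
    resolution, span = rra_retention(spec, STEP)
    print(f"{spec}: one row per {resolution} s, {span / 86400:.0f} days retained")
```

Under the 60-second assumption the three archives keep 30 days at 1-minute resolution, 365 days at 5-minute resolution, and 730 days at 10-minute resolution.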
  • 53. Rigid Schemas
      • Put one data source in each RRD file, plus associated thresholds
      • Use consistent service names
      • Use service description based on plugin, not platform
      • Keep the specifics of the schema in the glue layer
      • Schemas are not just an issue with rrdtool
  • 54-57. So where are we?
      There are a few free tools, and a few more not-so-free tools
      All of the existing tools suck...
      but at least one of them is probably good enough...
      and many of them continue to progress.
  • 58. nagiosgraph: then and now
      [Screenshots: 2009 and 2011]
  • 59. nagiosgraph: history and status
      • first release was 2004 - Soren Dossing
      • release 0.1 (2004-08-04) was 16KB (compressed)
      • release 1.4.4 (2011-01-16) was 158KB (compressed)
      • 18 project members, 2 current (Alan Brennar, Matthew Wall)
      • typically 70-100 downloads per day (20 on weekends)
      • packages for deb and rpm added Jan 2011
      • 1259 unit tests providing 78.5% code coverage
      • 155KB perl code, 44KB javascript/css, 276KB unit test code
  • 60. nagiosgraph: What next?
      • Arbitrary combinations of data sources
      • Interactive manipulation of data sources
      • Management of stale data
      • Export of data
      • Template-based RRAs and graphs
      • Better multi-byte character support
      • More unit tests and code coverage
  • 61. The tr/end//
      [Images: nagiosgraph screenshot, Circonius dashboard prototype, Cacti screenshot]
  • 62. References
      • http://lancet.mit.edu/mwall/projects/nagios
      • http://nagiosgraph.sourceforge.net
      • http://www.scribd.com/doc/58991647 Building a Monitoring Infrastructure with Nagios, David Josephsen, 2007
      • https://labs.omniti.com/labs/reconnoiter