Writing Nagios Plugins in Python

I introduced Nagios to an organisation in 2004 to track the availability of various servers and network resources. It has since grown into a system validity tool that takes the stress out of help desk work. Using Python as a scripting language, I have created a suite of additional Nagios plugins that cover:
* real-time entry of market rates
* end of day rate integrity
* common errors in manual spreadsheets
* success of backup processes
* validity conditions in MS SQL databases
* routine tracking of known chronic errors

Usage Rights

© All Rights Reserved

Writing Nagios Plugins in Python: Presentation Transcript

  • Enhancing Nagios with Python Plugins Maurice Maneschi Associate Director, Risk Management Systems Oakvale Capital Limited
  • Presentation Outline
    • Risk Management Systems
    • What is Nagios
    • Why Python
    • What is a plugin
    • Specific Risks being monitored
    • Analysing reports and logs
    • Where to next
  • Risk Management Systems
    • A division of five staff
    • Supporting three key applications
    • Running on eight servers
    • Depending on 15+ other boxes spread over 3 LANs
    • Five key vendors
  • Risk Management Systems
    • Divisional goals
      • Key goal is application management
      • Some customer support
      • Product innovation
      • Project management
      • No time for nasty surprises
  • What is Nagios
    • Host, service, network monitoring program
    • Open source
    • Written in C
    • Runs on Linux, with Apache serving the web interface
  • What is Nagios
    • Configured with the hosts of a network
      • How the hosts are networked
      • What key services are on the hosts
        • “PING”, SMTP, HTTP etc.
    • Application polls these at specified intervals
      • From the results of the polls, determines the state of hosts, services and networks
      • Alerts sent by email
      • Escalation, reporting, statistics and more
  • Why Python
    • Flexible
    • Efficient
    • Manageable
    • Numerous, diverse libraries
    • Cross-platform
    • Huge number of code samples available online
  • What is a plugin
    • Executable file
      • Takes parameters (preferably)
      • Prints a short status message
    • Returns an exit status of
      • 0 – all OK
      • 1 – warning
      • 2 – critical
      • 3 – unknown
    • Stateless
  • What is a plugin
    • Executable Python script
    • Code the test
    • Print the status line
    • Return a status
    • Easy!
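
The four steps above can be sketched as a small script. This is only an illustration of the plugin convention, not the presenter's code: the check name `EXAMPLE`, the `evaluate` helper and the `value` argument (a stand-in for a real test) are all hypothetical.

```python
#!/usr/bin/env python3
"""Minimal Nagios plugin sketch: compare a value against two thresholds."""
import argparse
import sys

# Exit statuses used by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(value, warn, crit):
    """Code the test: return (status, status line); higher values are worse."""
    if value >= crit:
        status = CRITICAL
    elif value >= warn:
        status = WARNING
    else:
        status = OK
    return status, "EXAMPLE %s - value is %s" % (LABELS[status], value)


def main(argv=None):
    """Parse parameters, run the test, print the status line, return the status."""
    parser = argparse.ArgumentParser(description="Hypothetical example check")
    parser.add_argument("-w", "--warning", type=float, required=True)
    parser.add_argument("-c", "--critical", type=float, required=True)
    parser.add_argument("value", type=float, help="stand-in for the real test")
    args = parser.parse_args(argv)
    try:
        status, message = evaluate(args.value, args.warning, args.critical)
    except Exception as exc:  # any unexpected failure is reported as UNKNOWN
        status, message = UNKNOWN, "EXAMPLE UNKNOWN - %s" % exc
    print(message)
    return status
```

In a real plugin file the last line would be `sys.exit(main())`, so that Nagios receives the status as the process exit code. Because the script keeps nothing between runs, it stays stateless, as the earlier slide requires.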
  • Specific risks being monitored
    • Customer email to the help desk system has stopped
      • Users email issues directly into our help desk system for prioritisation, action and eventually billing
      • Spam periodically breaks the import agent
      • It's proprietary, so no fix in sight
      • Nagios watches the queue using POP3
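
A check of this kind might look like the sketch below, which uses the standard library's `poplib`. It assumes that a stalled import agent shows up as messages piling up in the mailbox; the slides do not say exactly what is measured, and the thresholds and function names here are hypothetical.

```python
import poplib

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify_queue(count, warn, crit):
    """Map the number of waiting messages to a Nagios status and status line."""
    if count >= crit:
        return CRITICAL, "POP3 CRITICAL - %d messages waiting" % count
    if count >= warn:
        return WARNING, "POP3 WARNING - %d messages waiting" % count
    return OK, "POP3 OK - %d messages waiting" % count


def check_pop3_queue(host, user, password, warn=5, crit=20, timeout=10):
    """Log in to the mailbox, count the waiting messages and classify them."""
    try:
        conn = poplib.POP3(host, timeout=timeout)
        try:
            conn.user(user)
            conn.pass_(password)
            count, _mailbox_size = conn.stat()
        finally:
            conn.quit()
    except (poplib.error_proto, OSError) as exc:
        # Could not reach or log in to the server: report UNKNOWN, not a queue state.
        return UNKNOWN, "POP3 UNKNOWN - %s" % exc
    return classify_queue(count, warn, crit)
```

Keeping the classification separate from the network call makes the thresholds easy to test without a mail server.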
  • Specific risks being monitored
    • Ratefeed is missing some rates
      • Rates feed into our system from Reuters via MS Excel
      • Some rates are critical, and human intervention is required if they are missing
      • Other rates are important, but are just tracked when missing
      • Nagios watches the MS Excel worksheet that lists the “unreliable rates”
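
The split between critical and merely tracked rates could be expressed as below. This is a sketch that assumes the worksheet has already been read into `(rate_name, value)` pairs by whichever Excel library is in use; the function name and the rate names in the usage note are hypothetical, not taken from the presentation.

```python
OK, WARNING, CRITICAL = 0, 1, 2


def check_rates(rows, critical_names):
    """Classify missing rates.

    rows: (rate_name, value) pairs already read from the worksheet.
    critical_names: set of rate names that need human intervention when missing.
    A rate counts as missing when its cell is empty (None or "").
    """
    rows = list(rows)
    missing = [name for name, value in rows if value is None or value == ""]
    critical = sorted(name for name in missing if name in critical_names)
    tracked = sorted(name for name in missing if name not in critical_names)
    if critical:
        return CRITICAL, "RATES CRITICAL - missing: " + ", ".join(critical)
    if tracked:
        return WARNING, "RATES WARNING - missing: " + ", ".join(tracked)
    return OK, "RATES OK - %d rates present" % len(rows)
```

For example, `check_rates([("AUD/USD", 0.75), ("BBSW3M", None)], {"BBSW3M"})` yields a CRITICAL status naming `BBSW3M`, while the same rows with an empty critical set only produce a WARNING.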
  • Specific risks being monitored
    • Rates must be inserted regularly
      • Insertion process has numerous dependencies
      • Moving target – causes of failure change over time
      • Focus on the end point – are the rates in the database?
      • Nagios queries the databases and alerts on old or missing rates
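
Focusing on the end point can be as simple as asking how old the newest row is. The sketch below uses `sqlite3` purely as a runnable stand-in for the MS SQL databases mentioned in the talk; the `rates` table, its `inserted_at` column (ISO 8601 text) and the thresholds are assumptions made for illustration.

```python
import sqlite3
from datetime import datetime, timedelta

OK, WARNING, CRITICAL = 0, 1, 2


def check_rate_age(conn, now, warn=timedelta(minutes=30), crit=timedelta(hours=2)):
    """Alert when the newest row in the (hypothetical) rates table is too old."""
    row = conn.execute("SELECT MAX(inserted_at) FROM rates").fetchone()
    if row is None or row[0] is None:
        return CRITICAL, "RATES CRITICAL - no rates in database"
    age = now - datetime.fromisoformat(row[0])
    minutes = int(age.total_seconds() // 60)
    if age >= crit:
        return CRITICAL, "RATES CRITICAL - newest rate is %d minutes old" % minutes
    if age >= warn:
        return WARNING, "RATES WARNING - newest rate is %d minutes old" % minutes
    return OK, "RATES OK - newest rate is %d minutes old" % minutes
```

Because the check looks only at the result in the database, it keeps working when the causes of failure upstream change.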
  • Specific risks being monitored
    • External source of dealing information
      • Fed in through the FIX protocol
      • Numerous failure points being monitored on a (Windows) server
      • Monitor process must check in with Nagios every 10 minutes
      • Using passive and active checks
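
A passive result reaches Nagios as a `PROCESS_SERVICE_CHECK_RESULT` line written to its external command file. The sketch below shows that line format. It assumes the script runs on the Nagios host itself; the slides do not describe how results travel from the Windows server. The command file path is the usual default for a source install and may differ, and the host and service names in the usage note are hypothetical.

```python
import time


def format_passive_result(host, service, status, output, now=None):
    """Build one external-command line reporting a passive service check."""
    timestamp = int(time.time() if now is None else now)
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n" % (
        timestamp, host, service, status, output)


def submit_passive_result(line, command_file="/usr/local/nagios/var/rw/nagios.cmd"):
    """Write the line to the Nagios command file (a named pipe).

    The default path is an assumption; use the command_file value from nagios.cfg.
    """
    with open(command_file, "w") as handle:
        handle.write(line)
```

A monitor process could call `submit_passive_result(format_passive_result("fixserver", "FIX feed", 0, "FIX OK - heartbeat"))` on each cycle. Nagios can then be configured to treat the service as stale if no result arrives within the expected 10 minutes.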
  • Specific risks being monitored
    • Quick passive check
  • Specific risks being monitored
    • Successful backups
    • Successful scheduled tasks
    • Database comparisons
    • Common errors
      • Password server on web site
      • Known failure point on an MS Excel worksheet
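
The slides do not say how backup success is detected. One common approach, shown here only as an assumed sketch, is to check that the backup file exists, is not empty and was written recently. The function name, the 26-hour limit and the minimum size are hypothetical.

```python
import os
import time

OK, WARNING, CRITICAL = 0, 1, 2


def check_backup(path, max_age_hours=26, min_size_bytes=1, now=None):
    """Check that a backup file exists, is not too small and is recent enough."""
    now = time.time() if now is None else now
    try:
        info = os.stat(path)
    except OSError as exc:
        return CRITICAL, "BACKUP CRITICAL - cannot read %s: %s" % (path, exc.strerror)
    if info.st_size < min_size_bytes:
        return CRITICAL, "BACKUP CRITICAL - %s is only %d bytes" % (path, info.st_size)
    age_hours = (now - info.st_mtime) / 3600.0
    if age_hours > max_age_hours:
        return WARNING, "BACKUP WARNING - %s is %.1f hours old" % (path, age_hours)
    return OK, "BACKUP OK - %s is %.1f hours old" % (path, age_hours)
```

The same pattern would suit the scheduled-task item if each task leaves behind a log or marker file.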
  • Extra enhancements to Nagios
    • High-level view of systems health
    • Audio alerts and SMSes from UTbox.net
    • Status screen on monitor PC
    • Syslogd for firewall
    • Script reuse for rate checks
    • Ad hoc system problems
      • Currently tracking WAN failures
  • Analysing reports and logs
    • Screen saver often sufficient
    • Summary views
  • Where to next
    • A low-spec PC is enough
    • Nagios is in several distro repositories
      • I compile from the source
    • Allow a day at least to configure Nagios
      • Don't expect to install and switch it on
    • Tuning Nagios is an ongoing job
  • Further information
    • Nagios: http://www.nagios.org
    • Python: http://www.python.org
      • pyexcelerator, pymssql, freetds from Sourceforge
    • Oakvale Capital: http://www.oakvale.com
    • Code samples: http://www.redwaratah.com/wiki/index.php?title=Nagios_and_Python
    • Maurice Maneschi: [email_address]