Bryan Heden 
Lead Solutions Provider 
bheden@agilenetworks.com
Introduction & Agenda 
• What we do and what we needed 
• Customized and configured 
• IO issues... 
• Offload all the things! 
• What did we learn? 
• What’s next?
Agile Networks: Who We Are 
Telecom Provider 
• We Engineer and Operate The Agile Network, a general purpose 
backhaul network with Last-Mile Agility™ 
• Who We Serve 
• Public Sector … particularly Public Safety 
• Oil/Gas 
• Underserved Communities 
• Enterprise
Customer Examples
What We Needed 
General insight into network health 
• Ability to maintain SLAs with customers 
• To react to network downtime as fast as possible 
• The Government doesn’t like to wait 
• To monitor traffic across the network
What We Did 
We chose Nagios XI 
• Easy to use and understand interface 
• No more text based configurations to manage 
(Haha, just kidding!) 
• Built on top of something we were already comfortable with
How'd That Go? 
It worked, but not exactly how we wanted it to 
• “WHAT DO YOU MEAN IT DOESN'T AUTOMATICALLY TRANSLATE OIDS 
INTO HUMAN UNDERSTANDABLE ENGLISH?” 
• “YOU MEAN TO TELL ME THAT OUR EQUIPMENT DOESN'T COME 
STANDARD WITH NAGIOS PLUGINS OR THAT NAGIOS DOESN'T 
PRODUCE ONE FOR EACH TYPE OF DEVICE WE USE?!” 
• Etc. 
• Ping worked just fine
If You Build It.. 
..The Network Engineers will use it 
• We wrote our own configuration wizards for each different type of 
device (PTP, PTMP, Routers, Power, GPS) 
We made some maps 
• Executives love maps! 
• One map tracked health of devices/links between sites along with 
radar 
• Another map tracked the operating frequencies of active devices
Finally, Some Pictures! 
The NOC Overview MAP provides 
our teams insight into the health of 
every node and their connections on 
our network.
More Pictures 
Our Network Engineers 
can see the health and 
operating frequencies of 
our equipment from a 
central source.
And More Pictures 
My custom built 
configuration wizards keep 
our teams working on what 
they need to work on and 
allow me to be hands off 
with system additions.
Stress Testing in Production 
We reached maximum occupancy 
• Our existing server setup wasn't meant for active checks for this 
many hosts and services 
• We introduced ModGearman 
• We offloaded MySQL 
• Things got better, but we still had some problems...
IO is a Major Factor 
Lots of writes, not enough throughput 
• There were suddenly more host and service checks than we could 
handle with our setup 
• Running on a VM on an ESX Host with 2x10K drives in a RAID1 
• Bandwidth was only graphing once every 10 to 20 minutes 
• Upgraded the ESX Host drives to 6x15K RAID10 
• Okay, okay! We upgraded some other stuff on the ESX Host, too 
• This was the single most important decision we had made
But We Didn't Stop There! 
We offloaded MRTG 
• Set up NFS Share for /var/lib/mrtg so that Nagios could read from it 
• Set up NFS Share for /etc/mrtg so that Nagios could write to it, in 
order to add host configuration files 
• Put both Virtual Machines on the same Host (17 Gb/sec network 
throughput)
MRTG 
MRTG had some issues of its own… 
• We had to split the cron job into separate processes 
• This stops MRTG from taking too long to complete its checks, 
preventing the next process from starting 
• (Remember the 5 to 20 minute graphing issue a few slides back?)
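One way to check whether a single run is outlasting its cron interval is to time it. The sketch below is an illustration in Python, not the script used in the talk: the `timed_run` helper, the 300-second default interval, and the stand-in command are all assumptions.

```python
import subprocess
import sys
import time


def timed_run(cmd, interval_s=300):
    """Run one command and report how long it took relative to the cron interval."""
    start = time.monotonic()
    proc = subprocess.run(cmd, capture_output=True, text=True)
    elapsed = time.monotonic() - start
    return {
        "returncode": proc.returncode,
        "elapsed_s": elapsed,
        # True when this run would still be going as the next one comes due.
        "overran": elapsed > interval_s,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }


if __name__ == "__main__":
    # Stand-in command; in practice this would be the MRTG invocation from cron.
    result = timed_run([sys.executable, "-c", "print('done')"])
    print(result["elapsed_s"], result["overran"])
```

A run that reports `overran` as true is a candidate for being split into smaller pieces.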
Pictures of Text 
Here is what MRTG’s cron file looks like after we’ve made our changes:
MRTG Process Splitting 
How we did it 
• Split the configuration files into logical chunks by size and created 
separate cron entries for each 
• /etc/mrtg/conf.d/ has multiple subdirectories (1/, 2/, 3/, etc.) 
• Each corresponding process in cron loads the configuration files 
present in those directories (Include: /etc/mrtg/conf.d/X/*.cfg) 
• We measure each process separately (run time, errors, standard 
output, logs)
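The splitting described above could be sketched roughly as follows. This is a hypothetical Python illustration, not the process from the talk: it assumes each configuration file has a numeric weight (bytes or target count), uses a greedy largest-first split, and invents the per-chunk master file name `/etc/mrtg/mrtg-N.cfg`, the `/usr/bin/mrtg` path, and the 5-minute schedule.

```python
def split_by_size(sizes, chunks):
    """Greedy split: place each file, largest first, into the lightest chunk so far."""
    bins = [[] for _ in range(chunks)]
    totals = [0] * chunks
    # Sort by descending size, then by name so the result is deterministic.
    for name, size in sorted(sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        lightest = totals.index(min(totals))
        bins[lightest].append(name)
        totals[lightest] += size
    return bins


def cron_lines(chunks, minutes=5):
    """One cron entry per chunk; paths and file names here are assumptions."""
    return [
        f"*/{minutes} * * * * root env LANG=C /usr/bin/mrtg /etc/mrtg/mrtg-{n}.cfg"
        for n in range(1, chunks + 1)
    ]


if __name__ == "__main__":
    sizes = {"a.cfg": 50, "b.cfg": 30, "c.cfg": 20, "d.cfg": 10}
    for n, files in enumerate(split_by_size(sizes, 2), start=1):
        print(f"/etc/mrtg/conf.d/{n}/:", files)
    print("\n".join(cron_lines(2)))
```

Each hypothetical `mrtg-N.cfg` would carry the `Include: /etc/mrtg/conf.d/N/*.cfg` line for its own subdirectory, so every cron entry only processes its own chunk.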
What Else? 
We did some other things, too… 
• We installed and offloaded SmokePing 
• We created a SmokePing Nagios XI component to increase visibility 
of our graphs in our NOC 
• We built a portal to SmokePing for a particular client to login and 
check device health 
• We created a ModGearman Nagios XI component to manage our 
servers from a central location
SmokePing Component 
This component keeps 
our gateway graphs up 
at all times so we can 
keep an eye on them, 
and then rotates graphs 
from other hosts in 
each zone so we can 
(hopefully) notice 
inconsistencies when 
they arise.
SmokePing Portal 
We use a portal that parses the config file for SmokePing hosts, pings 
them, and shows current status. It also allows the portal user to ping 
those hosts.
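A minimal sketch of the parse-and-ping idea, in Python, might look like this. It is not the portal's code: it assumes targets are declared with `host = ...` lines in the SmokePing config, skips values starting with `/` on the assumption that they point at other config sections rather than name a host, and relies on a Unix-style `ping -c 1` being available.

```python
import re
import subprocess

# Matches "host = value" and captures the first token of the value.
HOST_RE = re.compile(r"^\s*host\s*=\s*(\S+)")


def smokeping_hosts(config_text):
    """Return the unique host values found in a SmokePing config, in file order."""
    hosts = []
    for line in config_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip commented-out lines
        match = HOST_RE.match(line)
        if not match:
            continue
        host = match.group(1)
        # Assumption: values starting with "/" reference other sections, not hosts.
        if host.startswith("/") or host in hosts:
            continue
        hosts.append(host)
    return hosts


def is_up(host, timeout_s=5):
    """Send one ping and report whether it succeeded (Unix-style ping assumed)."""
    try:
        done = subprocess.run(
            ["ping", "-c", "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return done.returncode == 0
```

A status page could then loop over `smokeping_hosts(...)` and call `is_up` for each entry.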
ModGearman Component 
I was tired of having to repeatedly log 
in to each ModGearman instance to 
tweak something when we were still 
getting everything set! So I wrote this 
to make my life a little bit easier.
Conclusion 
What did we learn? 
• Nagios XI can be extended far beyond the default behavior 
• Custom Configuration Wizards, Plugins and Components 
• Custom MRTG installations and scans used in the Config Wizards 
• IO will become an issue, and should be planned for 
• How to build a process for creating customizations 
• Offload what you can!
What’s Next? 
Big Plans! 
• Automating the MRTG Process Splitting 
• Releasing a generic and well documented Configuration Wizard 
Template 
• Continuing to grow and expand our current installation 
Questions 
• Do you have any?

More Related Content

What's hot

OSMC 2014: Naemon 1, 2, 3, N | Andreas Ericsson
OSMC 2014: Naemon 1, 2, 3, N | Andreas EricssonOSMC 2014: Naemon 1, 2, 3, N | Andreas Ericsson
OSMC 2014: Naemon 1, 2, 3, N | Andreas Ericsson
NETWAYS
 
Demo gods are (not) on our side
Demo gods are (not) on our sideDemo gods are (not) on our side
Demo gods are (not) on our side
Markus Jura
 
Oracle SOA Suite Performance Tuning- UKOUG Application Server & Middleware SI...
Oracle SOA Suite Performance Tuning- UKOUG Application Server & Middleware SI...Oracle SOA Suite Performance Tuning- UKOUG Application Server & Middleware SI...
Oracle SOA Suite Performance Tuning- UKOUG Application Server & Middleware SI...
C2B2 Consulting
 
Serena Release Management approach and solutions
Serena Release Management approach and solutionsSerena Release Management approach and solutions
Serena Release Management approach and solutions
Softmart
 
Spark 1.0
Spark 1.0Spark 1.0
Spark 1.0
Jatin Arora
 
Access Assurance Suite Tips & Tricks - Lisa Lombardo Principal Architect Iden...
Access Assurance Suite Tips & Tricks - Lisa Lombardo Principal Architect Iden...Access Assurance Suite Tips & Tricks - Lisa Lombardo Principal Architect Iden...
Access Assurance Suite Tips & Tricks - Lisa Lombardo Principal Architect Iden...
Core Security
 
Managing and Monitoring Application Performance
Managing and Monitoring Application PerformanceManaging and Monitoring Application Performance
Managing and Monitoring Application Performance
Sebastian Marek
 
PuppetConf 2016: Why Network Automation Matters, and What You Can Do About It...
PuppetConf 2016: Why Network Automation Matters, and What You Can Do About It...PuppetConf 2016: Why Network Automation Matters, and What You Can Do About It...
PuppetConf 2016: Why Network Automation Matters, and What You Can Do About It...
Puppet
 
Mistral Atlanta design session
Mistral Atlanta design sessionMistral Atlanta design session
Mistral Atlanta design session
Renat Akhmerov
 
Clovaを支える技術 機械学習配信基盤のご紹介
Clovaを支える技術 機械学習配信基盤のご紹介Clovaを支える技術 機械学習配信基盤のご紹介
Clovaを支える技術 機械学習配信基盤のご紹介
LINE Corporation
 
Backend, Simplified - A sane look on the mobile backend world, Nir Orpaz, Mob...
Backend, Simplified - A sane look on the mobile backend world, Nir Orpaz, Mob...Backend, Simplified - A sane look on the mobile backend world, Nir Orpaz, Mob...
Backend, Simplified - A sane look on the mobile backend world, Nir Orpaz, Mob...
DroidConTLV
 
The Art and Zen of Managing Nagios With Puppet
The Art and Zen of Managing Nagios With PuppetThe Art and Zen of Managing Nagios With Puppet
The Art and Zen of Managing Nagios With Puppet
Mike Merideth
 
URP? Excuse You! The Three Metrics You Have to Know
URP? Excuse You! The Three Metrics You Have to Know URP? Excuse You! The Three Metrics You Have to Know
URP? Excuse You! The Three Metrics You Have to Know
confluent
 
Parameter Inconsistency and Auto Correction
Parameter Inconsistency and Auto CorrectionParameter Inconsistency and Auto Correction
Parameter Inconsistency and Auto Correction
Ahmet Ozturk
 
LandsEnd TechEd2016 (1)
LandsEnd TechEd2016 (1)LandsEnd TechEd2016 (1)
LandsEnd TechEd2016 (1)Lisa Lawver
 
Android Application Optimization: Overview and Tools - Oref Barad, AVG
Android Application Optimization: Overview and Tools - Oref Barad, AVGAndroid Application Optimization: Overview and Tools - Oref Barad, AVG
Android Application Optimization: Overview and Tools - Oref Barad, AVG
DroidConTLV
 
Stress driven development
Stress driven developmentStress driven development
Stress driven development
mitesh_sharma
 
Easy database migrations with C# and FluentMigrator
Easy database migrations with C# and FluentMigratorEasy database migrations with C# and FluentMigrator
Easy database migrations with C# and FluentMigratorSafal Mahat
 
Accela NSN Site NodeB Rehome
Accela NSN Site NodeB RehomeAccela NSN Site NodeB Rehome
Accela NSN Site NodeB Rehome
Ahmet Ozturk
 
Intro to.net core 20170111
Intro to.net core   20170111Intro to.net core   20170111
Intro to.net core 20170111
Christian Horsdal
 

What's hot (20)

OSMC 2014: Naemon 1, 2, 3, N | Andreas Ericsson
OSMC 2014: Naemon 1, 2, 3, N | Andreas EricssonOSMC 2014: Naemon 1, 2, 3, N | Andreas Ericsson
OSMC 2014: Naemon 1, 2, 3, N | Andreas Ericsson
 
Demo gods are (not) on our side
Demo gods are (not) on our sideDemo gods are (not) on our side
Demo gods are (not) on our side
 
Oracle SOA Suite Performance Tuning- UKOUG Application Server & Middleware SI...
Oracle SOA Suite Performance Tuning- UKOUG Application Server & Middleware SI...Oracle SOA Suite Performance Tuning- UKOUG Application Server & Middleware SI...
Oracle SOA Suite Performance Tuning- UKOUG Application Server & Middleware SI...
 
Serena Release Management approach and solutions
Serena Release Management approach and solutionsSerena Release Management approach and solutions
Serena Release Management approach and solutions
 
Spark 1.0
Spark 1.0Spark 1.0
Spark 1.0
 
Access Assurance Suite Tips & Tricks - Lisa Lombardo Principal Architect Iden...
Access Assurance Suite Tips & Tricks - Lisa Lombardo Principal Architect Iden...Access Assurance Suite Tips & Tricks - Lisa Lombardo Principal Architect Iden...
Access Assurance Suite Tips & Tricks - Lisa Lombardo Principal Architect Iden...
 
Managing and Monitoring Application Performance
Managing and Monitoring Application PerformanceManaging and Monitoring Application Performance
Managing and Monitoring Application Performance
 
PuppetConf 2016: Why Network Automation Matters, and What You Can Do About It...
PuppetConf 2016: Why Network Automation Matters, and What You Can Do About It...PuppetConf 2016: Why Network Automation Matters, and What You Can Do About It...
PuppetConf 2016: Why Network Automation Matters, and What You Can Do About It...
 
Mistral Atlanta design session
Mistral Atlanta design sessionMistral Atlanta design session
Mistral Atlanta design session
 
Clovaを支える技術 機械学習配信基盤のご紹介
Clovaを支える技術 機械学習配信基盤のご紹介Clovaを支える技術 機械学習配信基盤のご紹介
Clovaを支える技術 機械学習配信基盤のご紹介
 
Backend, Simplified - A sane look on the mobile backend world, Nir Orpaz, Mob...
Backend, Simplified - A sane look on the mobile backend world, Nir Orpaz, Mob...Backend, Simplified - A sane look on the mobile backend world, Nir Orpaz, Mob...
Backend, Simplified - A sane look on the mobile backend world, Nir Orpaz, Mob...
 
The Art and Zen of Managing Nagios With Puppet
The Art and Zen of Managing Nagios With PuppetThe Art and Zen of Managing Nagios With Puppet
The Art and Zen of Managing Nagios With Puppet
 
URP? Excuse You! The Three Metrics You Have to Know
URP? Excuse You! The Three Metrics You Have to Know URP? Excuse You! The Three Metrics You Have to Know
URP? Excuse You! The Three Metrics You Have to Know
 
Parameter Inconsistency and Auto Correction
Parameter Inconsistency and Auto CorrectionParameter Inconsistency and Auto Correction
Parameter Inconsistency and Auto Correction
 
LandsEnd TechEd2016 (1)
LandsEnd TechEd2016 (1)LandsEnd TechEd2016 (1)
LandsEnd TechEd2016 (1)
 
Android Application Optimization: Overview and Tools - Oref Barad, AVG
Android Application Optimization: Overview and Tools - Oref Barad, AVGAndroid Application Optimization: Overview and Tools - Oref Barad, AVG
Android Application Optimization: Overview and Tools - Oref Barad, AVG
 
Stress driven development
Stress driven developmentStress driven development
Stress driven development
 
Easy database migrations with C# and FluentMigrator
Easy database migrations with C# and FluentMigratorEasy database migrations with C# and FluentMigrator
Easy database migrations with C# and FluentMigrator
 
Accela NSN Site NodeB Rehome
Accela NSN Site NodeB RehomeAccela NSN Site NodeB Rehome
Accela NSN Site NodeB Rehome
 
Intro to.net core 20170111
Intro to.net core   20170111Intro to.net core   20170111
Intro to.net core 20170111
 

Viewers also liked

Nagios Conference 2014 - Jess Portnoy - Nagios Monitoring Kaltura - The Open ...
Nagios Conference 2014 - Jess Portnoy - Nagios Monitoring Kaltura - The Open ...Nagios Conference 2014 - Jess Portnoy - Nagios Monitoring Kaltura - The Open ...
Nagios Conference 2014 - Jess Portnoy - Nagios Monitoring Kaltura - The Open ...
Nagios
 
Nagios Conference 2014 - Andrzej Augustynowicz - Nagios With The Decision Eng...
Nagios Conference 2014 - Andrzej Augustynowicz - Nagios With The Decision Eng...Nagios Conference 2014 - Andrzej Augustynowicz - Nagios With The Decision Eng...
Nagios Conference 2014 - Andrzej Augustynowicz - Nagios With The Decision Eng...
Nagios
 
Nagios Conference 2014 - Rodrigo Faria - Developing your Plugin
Nagios Conference 2014 - Rodrigo Faria - Developing your PluginNagios Conference 2014 - Rodrigo Faria - Developing your Plugin
Nagios Conference 2014 - Rodrigo Faria - Developing your Plugin
Nagios
 
Nagios Conference 2014 - Nick Winn - Using Nagios XI to Empower Your Develope...
Nagios Conference 2014 - Nick Winn - Using Nagios XI to Empower Your Develope...Nagios Conference 2014 - Nick Winn - Using Nagios XI to Empower Your Develope...
Nagios Conference 2014 - Nick Winn - Using Nagios XI to Empower Your Develope...
Nagios
 
Nagios Conference 2014 - Troy Lea - JavaScript and jQuery - Nagios XI Tips, T...
Nagios Conference 2014 - Troy Lea - JavaScript and jQuery - Nagios XI Tips, T...Nagios Conference 2014 - Troy Lea - JavaScript and jQuery - Nagios XI Tips, T...
Nagios Conference 2014 - Troy Lea - JavaScript and jQuery - Nagios XI Tips, T...
Nagios
 
Nagios Conference 2014 - Mike Weber - Expanding NRDS Capabilities on Linux Sy...
Nagios Conference 2014 - Mike Weber - Expanding NRDS Capabilities on Linux Sy...Nagios Conference 2014 - Mike Weber - Expanding NRDS Capabilities on Linux Sy...
Nagios Conference 2014 - Mike Weber - Expanding NRDS Capabilities on Linux Sy...
Nagios
 
Nagios Conference 2014 - Sean Falzon - Nagios as a PC Health Monitor
Nagios Conference 2014 - Sean Falzon - Nagios as a PC Health MonitorNagios Conference 2014 - Sean Falzon - Nagios as a PC Health Monitor
Nagios Conference 2014 - Sean Falzon - Nagios as a PC Health Monitor
Nagios
 
Nagios Conference 2014 - Jose Marroquin - How Revenue Increased After Impleme...
Nagios Conference 2014 - Jose Marroquin - How Revenue Increased After Impleme...Nagios Conference 2014 - Jose Marroquin - How Revenue Increased After Impleme...
Nagios Conference 2014 - Jose Marroquin - How Revenue Increased After Impleme...
Nagios
 
Nagios Conference 2014 - Rob Hassing - How To Maintain Over 20 Monitoring App...
Nagios Conference 2014 - Rob Hassing - How To Maintain Over 20 Monitoring App...Nagios Conference 2014 - Rob Hassing - How To Maintain Over 20 Monitoring App...
Nagios Conference 2014 - Rob Hassing - How To Maintain Over 20 Monitoring App...
Nagios
 
Nagios Conference 2014 - James Clark - Nagios Cool Tips and Tricks
Nagios Conference 2014 - James Clark - Nagios Cool Tips and TricksNagios Conference 2014 - James Clark - Nagios Cool Tips and Tricks
Nagios Conference 2014 - James Clark - Nagios Cool Tips and Tricks
Nagios
 
Nagios Conference 2014 - Tanja Lewit - Nagios and Kentix System Partners - Cr...
Nagios Conference 2014 - Tanja Lewit - Nagios and Kentix System Partners - Cr...Nagios Conference 2014 - Tanja Lewit - Nagios and Kentix System Partners - Cr...
Nagios Conference 2014 - Tanja Lewit - Nagios and Kentix System Partners - Cr...
Nagios
 
Nagios Conference 2014 - Luke Groschen - Using Nagios Network Analyzer and NS...
Nagios Conference 2014 - Luke Groschen - Using Nagios Network Analyzer and NS...Nagios Conference 2014 - Luke Groschen - Using Nagios Network Analyzer and NS...
Nagios Conference 2014 - Luke Groschen - Using Nagios Network Analyzer and NS...
Nagios
 
Nagios Conference 2014 - Leland Lammert - Distributed Heirarchical Nagios
Nagios Conference 2014 - Leland Lammert - Distributed Heirarchical NagiosNagios Conference 2014 - Leland Lammert - Distributed Heirarchical Nagios
Nagios Conference 2014 - Leland Lammert - Distributed Heirarchical Nagios
Nagios
 
Nagios Conference 2014 - Frank Pantaleo - Nagios Monitoring of Netezza Databases
Nagios Conference 2014 - Frank Pantaleo - Nagios Monitoring of Netezza DatabasesNagios Conference 2014 - Frank Pantaleo - Nagios Monitoring of Netezza Databases
Nagios Conference 2014 - Frank Pantaleo - Nagios Monitoring of Netezza Databases
Nagios
 
Nagios Conference 2014 - Rob Seiwert - Graphing and Trend Prediction in Nagios
Nagios Conference 2014 - Rob Seiwert - Graphing and Trend Prediction in NagiosNagios Conference 2014 - Rob Seiwert - Graphing and Trend Prediction in Nagios
Nagios Conference 2014 - Rob Seiwert - Graphing and Trend Prediction in Nagios
Nagios
 
Nagios Conference 2014 - David Josephsen - Graphing Nagios
Nagios Conference 2014 - David Josephsen - Graphing NagiosNagios Conference 2014 - David Josephsen - Graphing Nagios
Nagios Conference 2014 - David Josephsen - Graphing Nagios
Nagios
 
Nagios Conference 2014 - Nate Broderick - SLA - The Marriage of an Effective ...
Nagios Conference 2014 - Nate Broderick - SLA - The Marriage of an Effective ...Nagios Conference 2014 - Nate Broderick - SLA - The Marriage of an Effective ...
Nagios Conference 2014 - Nate Broderick - SLA - The Marriage of an Effective ...
Nagios
 
Nagios Conference 2014 - Fernando Covatti - Nagios in Power Transmission Util...
Nagios Conference 2014 - Fernando Covatti - Nagios in Power Transmission Util...Nagios Conference 2014 - Fernando Covatti - Nagios in Power Transmission Util...
Nagios Conference 2014 - Fernando Covatti - Nagios in Power Transmission Util...
Nagios
 
Nagios Conference 2014 - Abbas Haider Ali - Proactive Alerting and Intelligen...
Nagios Conference 2014 - Abbas Haider Ali - Proactive Alerting and Intelligen...Nagios Conference 2014 - Abbas Haider Ali - Proactive Alerting and Intelligen...
Nagios Conference 2014 - Abbas Haider Ali - Proactive Alerting and Intelligen...
Nagios
 
Nagios Conference 2014 - Andy Brist - Intro to Incident Manager
Nagios Conference 2014 - Andy Brist - Intro to Incident ManagerNagios Conference 2014 - Andy Brist - Intro to Incident Manager
Nagios Conference 2014 - Andy Brist - Intro to Incident Manager
Nagios
 

Viewers also liked (20)

Nagios Conference 2014 - Jess Portnoy - Nagios Monitoring Kaltura - The Open ...
Nagios Conference 2014 - Jess Portnoy - Nagios Monitoring Kaltura - The Open ...Nagios Conference 2014 - Jess Portnoy - Nagios Monitoring Kaltura - The Open ...
Nagios Conference 2014 - Jess Portnoy - Nagios Monitoring Kaltura - The Open ...
 
Nagios Conference 2014 - Andrzej Augustynowicz - Nagios With The Decision Eng...
Nagios Conference 2014 - Andrzej Augustynowicz - Nagios With The Decision Eng...Nagios Conference 2014 - Andrzej Augustynowicz - Nagios With The Decision Eng...
Nagios Conference 2014 - Andrzej Augustynowicz - Nagios With The Decision Eng...
 
Nagios Conference 2014 - Rodrigo Faria - Developing your Plugin
Nagios Conference 2014 - Rodrigo Faria - Developing your PluginNagios Conference 2014 - Rodrigo Faria - Developing your Plugin
Nagios Conference 2014 - Rodrigo Faria - Developing your Plugin
 
Nagios Conference 2014 - Nick Winn - Using Nagios XI to Empower Your Develope...
Nagios Conference 2014 - Nick Winn - Using Nagios XI to Empower Your Develope...Nagios Conference 2014 - Nick Winn - Using Nagios XI to Empower Your Develope...
Nagios Conference 2014 - Nick Winn - Using Nagios XI to Empower Your Develope...
 
Nagios Conference 2014 - Troy Lea - JavaScript and jQuery - Nagios XI Tips, T...
Nagios Conference 2014 - Troy Lea - JavaScript and jQuery - Nagios XI Tips, T...Nagios Conference 2014 - Troy Lea - JavaScript and jQuery - Nagios XI Tips, T...
Nagios Conference 2014 - Troy Lea - JavaScript and jQuery - Nagios XI Tips, T...
 
Nagios Conference 2014 - Mike Weber - Expanding NRDS Capabilities on Linux Sy...
Nagios Conference 2014 - Mike Weber - Expanding NRDS Capabilities on Linux Sy...Nagios Conference 2014 - Mike Weber - Expanding NRDS Capabilities on Linux Sy...
Nagios Conference 2014 - Mike Weber - Expanding NRDS Capabilities on Linux Sy...
 
Nagios Conference 2014 - Sean Falzon - Nagios as a PC Health Monitor
Nagios Conference 2014 - Sean Falzon - Nagios as a PC Health MonitorNagios Conference 2014 - Sean Falzon - Nagios as a PC Health Monitor
Nagios Conference 2014 - Sean Falzon - Nagios as a PC Health Monitor
 
Nagios Conference 2014 - Jose Marroquin - How Revenue Increased After Impleme...
Nagios Conference 2014 - Jose Marroquin - How Revenue Increased After Impleme...Nagios Conference 2014 - Jose Marroquin - How Revenue Increased After Impleme...
Nagios Conference 2014 - Jose Marroquin - How Revenue Increased After Impleme...
 
Nagios Conference 2014 - Rob Hassing - How To Maintain Over 20 Monitoring App...
Nagios Conference 2014 - Rob Hassing - How To Maintain Over 20 Monitoring App...Nagios Conference 2014 - Rob Hassing - How To Maintain Over 20 Monitoring App...
Nagios Conference 2014 - Rob Hassing - How To Maintain Over 20 Monitoring App...
 
Nagios Conference 2014 - James Clark - Nagios Cool Tips and Tricks
Nagios Conference 2014 - James Clark - Nagios Cool Tips and TricksNagios Conference 2014 - James Clark - Nagios Cool Tips and Tricks
Nagios Conference 2014 - James Clark - Nagios Cool Tips and Tricks
 
Nagios Conference 2014 - Tanja Lewit - Nagios and Kentix System Partners - Cr...
Nagios Conference 2014 - Tanja Lewit - Nagios and Kentix System Partners - Cr...Nagios Conference 2014 - Tanja Lewit - Nagios and Kentix System Partners - Cr...
Nagios Conference 2014 - Tanja Lewit - Nagios and Kentix System Partners - Cr...
 
Nagios Conference 2014 - Luke Groschen - Using Nagios Network Analyzer and NS...
Nagios Conference 2014 - Luke Groschen - Using Nagios Network Analyzer and NS...Nagios Conference 2014 - Luke Groschen - Using Nagios Network Analyzer and NS...
Nagios Conference 2014 - Luke Groschen - Using Nagios Network Analyzer and NS...
 
Nagios Conference 2014 - Leland Lammert - Distributed Heirarchical Nagios
Nagios Conference 2014 - Leland Lammert - Distributed Heirarchical NagiosNagios Conference 2014 - Leland Lammert - Distributed Heirarchical Nagios
Nagios Conference 2014 - Leland Lammert - Distributed Heirarchical Nagios
 
Nagios Conference 2014 - Frank Pantaleo - Nagios Monitoring of Netezza Databases
Nagios Conference 2014 - Frank Pantaleo - Nagios Monitoring of Netezza DatabasesNagios Conference 2014 - Frank Pantaleo - Nagios Monitoring of Netezza Databases
Nagios Conference 2014 - Frank Pantaleo - Nagios Monitoring of Netezza Databases
 
Nagios Conference 2014 - Rob Seiwert - Graphing and Trend Prediction in Nagios
Nagios Conference 2014 - Rob Seiwert - Graphing and Trend Prediction in NagiosNagios Conference 2014 - Rob Seiwert - Graphing and Trend Prediction in Nagios
Nagios Conference 2014 - Rob Seiwert - Graphing and Trend Prediction in Nagios
 
Nagios Conference 2014 - David Josephsen - Graphing Nagios
Nagios Conference 2014 - David Josephsen - Graphing NagiosNagios Conference 2014 - David Josephsen - Graphing Nagios
Nagios Conference 2014 - David Josephsen - Graphing Nagios
 
Nagios Conference 2014 - Nate Broderick - SLA - The Marriage of an Effective ...
Nagios Conference 2014 - Nate Broderick - SLA - The Marriage of an Effective ...Nagios Conference 2014 - Nate Broderick - SLA - The Marriage of an Effective ...
Nagios Conference 2014 - Nate Broderick - SLA - The Marriage of an Effective ...
 
Nagios Conference 2014 - Fernando Covatti - Nagios in Power Transmission Util...
Nagios Conference 2014 - Fernando Covatti - Nagios in Power Transmission Util...Nagios Conference 2014 - Fernando Covatti - Nagios in Power Transmission Util...
Nagios Conference 2014 - Fernando Covatti - Nagios in Power Transmission Util...
 
Nagios Conference 2014 - Abbas Haider Ali - Proactive Alerting and Intelligen...
Nagios Conference 2014 - Abbas Haider Ali - Proactive Alerting and Intelligen...Nagios Conference 2014 - Abbas Haider Ali - Proactive Alerting and Intelligen...
Nagios Conference 2014 - Abbas Haider Ali - Proactive Alerting and Intelligen...
 
Nagios Conference 2014 - Andy Brist - Intro to Incident Manager
Nagios Conference 2014 - Andy Brist - Intro to Incident ManagerNagios Conference 2014 - Andy Brist - Intro to Incident Manager
Nagios Conference 2014 - Andy Brist - Intro to Incident Manager
 

Similar to Nagios Conference 2014 - Bryan Heden - 10,000 Services Across The State of Ohio

Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...
Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...
Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...
Nagios
 
Serverless Compose vs hurtownia danych
Serverless Compose vs hurtownia danychServerless Compose vs hurtownia danych
Serverless Compose vs hurtownia danych
The Software House
 
Advanced web application architecture - Talk
Advanced web application architecture - TalkAdvanced web application architecture - Talk
Advanced web application architecture - Talk
Matthias Noback
 
Infrastructure as Code - Getting Started, Concepts & Tools
Infrastructure as Code - Getting Started, Concepts & ToolsInfrastructure as Code - Getting Started, Concepts & Tools
Infrastructure as Code - Getting Started, Concepts & Tools
Lior Kamrat
 
Serverless: The future of application delivery
Serverless: The future of application deliveryServerless: The future of application delivery
Serverless: The future of application delivery
Doug Vanderweide
 
Moving to microservices – a technology and organisation transformational journey
Moving to microservices – a technology and organisation transformational journeyMoving to microservices – a technology and organisation transformational journey
Moving to microservices – a technology and organisation transformational journey
Boyan Dimitrov
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudy
John Adams
 
Building a Small Datacenter
Building a Small DatacenterBuilding a Small Datacenter
Building a Small Datacenter
ssuser4b98f0
 
Building a Small DC
Building a Small DCBuilding a Small DC
Building a Small DC
APNIC
 
NTTs Journey with Openstack-final
NTTs Journey with Openstack-finalNTTs Journey with Openstack-final
NTTs Journey with Openstack-final
shintaro mizuno
 
Bare Metal Provisioning for Big Data - OpenStack最新情報セミナー(2016年12月)
Bare Metal Provisioning for Big Data - OpenStack最新情報セミナー(2016年12月)Bare Metal Provisioning for Big Data - OpenStack最新情報セミナー(2016年12月)
Bare Metal Provisioning for Big Data - OpenStack最新情報セミナー(2016年12月)
VirtualTech Japan Inc.
 
Microservices: The Best Practices
Microservices: The Best PracticesMicroservices: The Best Practices
Microservices: The Best Practices
Pavel Mička
 
Migrating to Windows 7 or 8 with Lenovo's Deployment Optimization Solutions
Migrating to Windows 7 or 8 with Lenovo's Deployment Optimization SolutionsMigrating to Windows 7 or 8 with Lenovo's Deployment Optimization Solutions
Migrating to Windows 7 or 8 with Lenovo's Deployment Optimization Solutions
Lenovo Business
 
How to create custom dashboards in Elastic Search / Kibana with Performance V...
How to create custom dashboards in Elastic Search / Kibana with Performance V...How to create custom dashboards in Elastic Search / Kibana with Performance V...
How to create custom dashboards in Elastic Search / Kibana with Performance V...
PerformanceVision (previously SecurActive)
 
Icinga Web 2 is more
Icinga Web 2 is moreIcinga Web 2 is more
Icinga Web 2 is more
Icinga
 
Building a Database for the End of the World
Building a Database for the End of the WorldBuilding a Database for the End of the World
Building a Database for the End of the World
jhugg
 
NGENSTOR_ODA_P2V_V5
NGENSTOR_ODA_P2V_V5NGENSTOR_ODA_P2V_V5
NGENSTOR_ODA_P2V_V5UniFabric
 
Lumberjack: Finit's Oracle EPM - Hyperion System Monitoring Tool
Lumberjack: Finit's Oracle EPM - Hyperion System Monitoring ToolLumberjack: Finit's Oracle EPM - Hyperion System Monitoring Tool
Lumberjack: Finit's Oracle EPM - Hyperion System Monitoring Tool
finitsolutions
 
Tech trends 2018 2019
Tech trends 2018 2019Tech trends 2018 2019
Tech trends 2018 2019
Johan Norm
 

Similar to Nagios Conference 2014 - Bryan Heden - 10,000 Services Across The State of Ohio (20)

Bryan Heden - Agile Networks - Using Nagios XI as the platform for Monitoring...
Nagios Conference 2014 - Bryan Heden - 10,000 Services Across The State of Ohio

  • 1. Bryan Heden Lead Solutions Provider bheden@agilenetworks.com
  • 2. Introduction & Agenda • What we do and what we needed • Customized and configured • IO issues... • Offload all the things! • What did we learn? • What’s next?
  • 3. Agile Networks: Who We Are Telecom Provider • We Engineer and Operate The Agile Network, a general purpose backhaul network with Last-Mile Agility™ • Who We Serve • Public Sector … particularly Public Safety • Oil/Gas • Underserved Communities • Enterprise
  • 5. What We Needed General insight into network health • Ability to maintain SLAs with customers • To react to network downtime as fast as possible • The Government doesn’t like to wait • To monitor traffic across the network
  • 6. What We Did We chose Nagios XI • Easy to use and understand interface • No more text based configurations to manage (Haha, just kidding!) • Built on top of something we were already comfortable with
  • 7. How'd That Go? It worked, but not exactly how we wanted it to • “WHAT DO YOU MEAN IT DOESN'T AUTOMATICALLY TRANSLATE OIDS INTO HUMAN UNDERSTANDABLE ENGLISH?” • “YOU MEAN TO TELL ME THAT OUR EQUIPMENT DOESN'T COME STANDARD WITH NAGIOS PLUGINS OR THAT NAGIOS DOESN'T PRODUCE ONE FOR EACH TYPE OF DEVICE WE USE?!” • Etc. • Ping worked just fine
  • 8. If You Build It.. ..The Network Engineers will use it • We wrote our own configuration wizards for each different type of device (PTP, PTMP, Routers, Power, GPS) • We made some maps • Executives love maps! • One map tracked health of devices/links between sites along with radar • Another map tracked the operating frequencies of active devices
  • 9. Finally, Some Pictures! The NOC Overview MAP provides our teams insight into the health of every node and their connections on our network.
  • 10. More Pictures Our Network Engineers can see from a central source what the health and operating frequencies are of our equipment.
  • 11. And More Pictures My custom built configuration wizards keep our teams working on what they need to work on and allow me to be hands off with system additions.
  • 12. Stress Testing in Production We reached maximum occupancy • Our existing server setup wasn't meant for active checks for this many hosts and services • We introduced ModGearman • We offloaded MySQL • Things got better, but we still had some problems...
  • 13. IO is a Major Factor Lots of writes, not enough throughput • There were suddenly more host and service checks than we could handle with our setup • Running on a VM on an ESX Host with 2x10K drives in a RAID1 • Bandwidth was only graphing once every 10 to 20 minutes • Upgraded the ESX Host drives to 6x15K RAID10 • Okay, okay! We upgraded some other stuff on the ESX Host, too • This was the single most important decision we had made
  • 14. But We Didn't Stop There! We offloaded MRTG • Set up NFS Share for /var/lib/mrtg so that Nagios could read from it • Set up NFS Share for /etc/mrtg so that Nagios could write to it, in order to add host configuration files • Put both Virtual Machines on the same Host (17 Gb/sec network throughput)
  • 15. MRTG MRTG had some issues of its own… • We had to split the cron job into separate processes • This keeps a single MRTG run from taking so long to complete its checks that it blocks the next run from starting • (Remember the 5 to 20 minute graphing issue a few slides back?)
  • 16. Pictures of Text Here is what MRTG’s cron file looks like after we’ve made our changes:
  • 17. MRTG Process Splitting How we did it • Split the configuration files into logical chunks by size and created separate cron entries for each • /etc/mrtg/conf.d/ has multiple subdirectories (1/, 2/, 3/, etc.) • Each corresponding process in cron loads the configuration files present in those directories (Include: /etc/mrtg/conf.d/X/*.cfg) • We measure each process separately (run time, errors, standard output, logs)
  • 18. What Else? We did some other things, too… • We installed and offloaded SmokePing • We created a SmokePing Nagios XI component to increase visibility of our graphs in our NOC • We built a portal to SmokePing for a particular client to login and check device health • We created a ModGearman Nagios XI component to manage our servers from a central location
  • 19. SmokePing Component This component keeps our gateway graphs up at all times so we can keep an eye on them, and then rotates graphs from other hosts in each zone so we can (hopefully) notice inconsistencies when they arise.
  • 20. SmokePing Portal We use a portal that parses the config file for SmokePing hosts, pings them, and shows current status. It also allows the portal user to ping those hosts.
  • 21. ModGearman Component I was tired of having to repeatedly log in to each ModGearman instance to tweak something when we were still getting everything set! So I wrote this to make my life a little bit easier.
  • 22. Conclusion What did we learn? • Nagios XI can be extended far beyond the default behavior • Custom Configuration Wizards, Plugins and Components • Custom MRTG installations and scans used in the Config Wizards • IO will become an issue, and should be planned for • How to build a process for creating customizations • Offload what you can!
  • 23. What’s Next? Big Plans! • Automating the MRTG Process Splitting • Releasing a generic and well documented Configuration Wizard Template • Continuing to grow and expand our current installation Questions • Do you have any?
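
The MRTG process splitting described on slide 17 (and listed under "What's Next?" as something to automate) can be sketched roughly as follows. This is a minimal illustration, not the cron file or tooling shown in the talk: the greedy size-balancing strategy, the per-chunk top-level files /etc/mrtg/mrtg-N.cfg (each assumed to contain only an Include: /etc/mrtg/conf.d/N/*.cfg line), the lock and cache file paths, and the 5-minute interval are all assumptions made for the example. Only the /etc/mrtg/conf.d/N/ layout and the Include pattern come from the slides.

```python
from pathlib import Path


def split_by_size(sizes: dict[str, int], chunks: int) -> list[list[str]]:
    """Greedy balancing: place the largest file into the currently lightest chunk."""
    bins: list[list[str]] = [[] for _ in range(chunks)]
    totals = [0] * chunks
    # Sort largest first; break ties by name so the result is deterministic.
    for name, size in sorted(sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        lightest = totals.index(min(totals))
        bins[lightest].append(name)
        totals[lightest] += size
    return bins


def sizes_from_directory(conf_dir: str) -> dict[str, int]:
    """Collect *.cfg file sizes (bytes) from a flat directory of host configs."""
    return {p.name: p.stat().st_size for p in Path(conf_dir).glob("*.cfg")}


def cron_lines(chunks: int, interval_min: int = 5) -> list[str]:
    """Build one cron entry per chunk (hypothetical paths, see lead-in).

    Each entry points at its own top-level config, lock file, and
    config cache so the MRTG processes do not wait on each other.
    """
    lines = []
    for n in range(1, chunks + 1):
        lines.append(
            f"*/{interval_min} * * * * root env LANG=C /usr/bin/mrtg "
            f"/etc/mrtg/mrtg-{n}.cfg "
            f"--lock-file /var/lock/mrtg/mrtg_{n} "
            f"--confcache-file /var/lib/mrtg/mrtg-{n}.ok"
        )
    return lines


if __name__ == "__main__":
    # Example with made-up file names and sizes instead of a real directory.
    example = {"core-router.cfg": 100, "ptp-links.cfg": 60,
               "ptmp-sites.cfg": 50, "power.cfg": 10}
    for n, files in enumerate(split_by_size(example, 2), start=1):
        print(f"/etc/mrtg/conf.d/{n}/ <- {files}")
    print("\n".join(cron_lines(2)))
```

File size is only a stand-in for how long MRTG spends on a config file; since the talk mentions measuring each process's run time separately, those measurements would likely be a better balancing weight once they are available.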