Rational Configuration Design: To Prevent Irrational Problem Solving - John Murphy

Nagios Conference 2012 - John Murphy - Rational Configuration Design


John Murphy's presentation on well-designed Nagios configurations.
The presentation was given during the Nagios World Conference North America, held September 25-28, 2012, in Saint Paul, MN. For more information on the conference (including photos and videos), visit: http://go.nagios.com/nwcna

Published in: Technology
  • Speaker background: server engineer at Kmart Australia. This is a philosophical discussion: there is no right or wrong, but sometimes there are… unique ways of handling problems.
  • Basics: the core triumvirate of objects (contacts, hosts, services). Approach it like a web developer: user experience first (the contact/user distinction), making things work second (hosts/services). Services: object scaling and the host/service relationship. Advanced: a brief look at more advanced topics. Parents and dependencies: how to avoid being spammed. Exceptions: how to deal with non-uniform requests. Automation: using network resources to automate.
  • Separate user and contact objects so that you can provide user access control (UAC) and handle complex assignments in a manageable fashion. A contact is an object used to notify a person or a team of a problem, with a one-to-one relationship of contact to contact group. A user is a human, a real person, represented by a dummy account that matches a Nagios login for access control, with a many-to-one relationship of user to user group and of user group to contact group. Users can come from AD, LDAP, database, etc. integration.
  • Minimize contact template objects; this reduces future work. A directive defined on the contact overrides the template. Contact escalation is not covered here.
  • User objects are basically contact definitions that receive no notifications. Separate users into specific view groups and attach a view group when a user needs to see something; add the contact group as well if the user needs to be contacted.
  • This is for Nagios Core; Nagios XI already has an AD login component. When a logon occurs, Nagios matches the Apache HTTP user context with a contact when possible. Use this to assign a user a “world view”.
  • Use logical configuration groupings (base them on OS, location, application). Minimize configuration in the host object for automation purposes and move as much as possible to templates. Assign views to user objects.
  • Use hostnames rather than IP addresses whenever possible. Regex matching can be used to place hosts in host groups.
  • One-to-many relationship of services to host groups. Use service groups for applications; minimize service templates.
  • Usually you can find a check period that will work for 95% of your checks. Unlike hosts, do not add contacts at the template; add them at the actual service definition instead. Services change contacts frequently, hosts do not.
  • The last pieces of the puzzle and the complete picture. Arrow directions indicate which object references which other object.
  • Jim trips on a network cable, Europe goes down, and everyone's inbox is spammed. Ensure parents are defined and use multi-tenancy. Ensure service dependencies are defined when one piece of infrastructure relies on another. Indirectly monitored services: for example, CPU usage on VMware infrastructure gathered via the VMware API.
  • A happy ESX environment with vSphere and working monitoring.
  • A sad ESX environment when vSphere fails and those services stop working.
  • You can use host groups to paint in broad strokes with service dependencies.
  • Despite perfect design, someone is going to “kick your sand castle”. Ensure that exceptions are properly labeled. Ensure that each exception is reusable, so that future exceptions will be consistent.
  • Naming conventions are important. Use AD to get computer accounts. Use virtualization APIs to get virtual infrastructure. Use patching systems or resource databases. Use SNMP to get network tables and device types, and/or LLDP/CDP tables to walk networks. Use network management systems (e.g. CiscoWorks, NSM, etc.).
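
The parenting note above (Jim and the network cable) can be illustrated with a small Python sketch. It is a deliberately simplified model, not Nagios's actual reachability code: a failing host is treated as UNREACHABLE when it has parents and all of them are failing too, and as DOWN otherwise. The topology reuses host names from the slides, but the parent assigned to vSphereServer and the function name `classify` are assumptions made for illustration.

```python
from typing import Dict, List, Set


def classify(parents: Dict[str, List[str]], failing: Set[str]) -> Dict[str, str]:
    """Label each failing host DOWN or UNREACHABLE (simplified model)."""
    states = {}
    for host in sorted(failing):
        host_parents = parents.get(host, [])
        # Every parent failing means the host is hidden behind the outage.
        if host_parents and all(p in failing for p in host_parents):
            states[host] = "UNREACHABLE"
        else:
            states[host] = "DOWN"
    return states


# Hypothetical topology: two switches in front of two servers.
parents = {
    "switch001": [],
    "switch002": [],
    "exchange01": ["switch001", "switch002"],
    "vSphereServer": ["switch001"],
}
failing = {"switch001", "switch002", "exchange01", "vSphereServer"}

states = classify(parents, failing)
# Only hosts that are really DOWN are worth an alert.
to_notify = [host for host, state in states.items() if state == "DOWN"]
print(to_notify)
```

With parents defined, only the two switches are reported as DOWN in this model; the servers behind them are UNREACHABLE. A host template whose notification_options omit `u`, like the `d,f` shown on the host definitions slide, would then stay quiet for those servers.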

    1. Rational Configuration Design: To Prevent Irrational Problem Solving - John Murphy
    2. Introduction
       Basic: Contacts, Hosts, Services
       Advanced: Parents and dependencies, Managing exceptions, Automation
    3. Our Scenario
    4. Contacts
    5. Contacts
       Contact: Contact address for support. Email, SMS, Ticketing, etc.
       User: Login account for an actual user. No contact information.
    6. Contacts: Contact Definition
       define contact {
           contact_name                     cu-contact
           contactgroups                    cg-main
           email                            servers@domain.com
           use                              contact-user
       }

       define contactgroup {
           contactgroup_name                cg-main
           alias                            Kmart Contact
           contactgroup_members             vg-team
       }

       define contact {
           name                             contact-user
           host_notifications_enabled       1
           service_notifications_enabled    1
           host_notification_period         24x7
           service_notification_period      24x7
           host_notification_options        d,u
           service_notification_options     c
           host_notification_commands       notify-h-email
           service_notification_commands    notify-s-email
           register                         0
       }
    7. Contacts: User Definition
       define contact {
           contact_name                     vu-jsmurphy
           contactgroups                    vg-team
           use                              read-contact
       }

       define contactgroup {
           contactgroup_name                vg-team
           alias                            Kmart Team
       }

       define contactgroup {
           contactgroup_name                cg-main
           alias                            Kmart Contact
           contactgroup_members             vg-team
       }

       define contact {
           name                             read-contact
           host_notifications_enabled       0
           service_notifications_enabled    0
           host_notification_period         none
           service_notification_period      none
           host_notification_options        n
           service_notification_options     n
           host_notification_commands       check_none
           service_notification_commands    check_none
           register                         0
       }
    8. Contacts: LDAP/AD For Nagios Core
       ScriptAlias /nagios/cgi-bin "/usr/local/nagios/sbin"
       <Directory "/usr/local/nagios/sbin">
           SetEnv TZ "Australia/Melbourne"
           Options ExecCGI
           AllowOverride None
           Order allow,deny
           Allow from all
           AuthName "Nagios Core"
           AuthType Basic
           # AuthUserFile /usr/local/nagios/etc/htpasswd.users
           # Require valid-user
           AuthBasicProvider ldap
           AuthName "Nagios server"
           AuthzLDAPAuthoritative off
           AuthLDAPBindDN "CN=bindAccount,OU=User,DC=domain,DC=com"
           AuthLDAPBindPassword xxxxxxxxx
           AuthLDAPURL ldaps://domain.com/OU=User,DC=Domain,DC=com?sAMAccountName?sub?(objectClass=user)
           AuthLDAPGroupAttribute member
           AuthLDAPGroupAttributeIsDN on
           Require ldap-group CN=NagiosAccessGroup,OU=Groups,DC=domain,DC=com
       </Directory>
    9. Contacts Summary: Distinguish between your users and your contacts. Use an existing authentication source for your user logins. Consider the end-user experience… try to ensure it’s easy to get the information they need.
    10. Hosts
    11. Hosts: Focus on minimizing host configuration to make automation easier. Use templates to assign user view information. Create host groups based on shared monitoring profiles.
    12. Hosts: Host Definitions
        define host {
            host_name                exchange01
            use                      srv-template
            alias                    Exchange server
            address                  exchange01
            parents                  switch001,switch002
            hostgroups               srv-exchange, srv-windows
            icon_image               exchange.png
            register                 1
        }

        define hostgroup {
            hostgroup_name           srv-windows
            alias                    Windows group
        }

        define host {
            name                     srv-template
            alias                    Server host template
            check_command            check_icmp!250.0,60%!500.0,80%
            max_check_attempts       3
            check_interval           10
            retry_interval           2
            check_period             24x7
            contact_groups           cg-main
            notification_interval    60
            notification_period      24x7
            notification_options     d,f
            notifications_enabled    1
            register                 0
        }
    13. Hosts Summary: Minimize configuration in host objects to make automation easier. Hostnames allow for easier maintenance than IP addresses. Create logical host groupings that will make service assignment easier, e.g. OS type, location, applications served.
    14. Services
    15. Services: Keep services as generic as possible to prevent the need for duplicate services. Minimizing service templates allows for easier management and baseline changes. Use service groups for applications.
    16. Services: Service Definitions
        define service {
            service_description      Windows C: usage
            use                      main-service-template
            hostgroup_name           srv-windows,srv-v-windows
            check_command            check_nt!USEDDISKSPACE!-w 80 -c 90
            contact_groups           cg-main,cg-main-SMS
            register                 1
        }

        define service {
            name                     main-service-template
            service_description      main service template
            max_check_attempts       3
            check_interval           10
            retry_interval           2
            check_period             24x7
            notification_interval    60
            notification_period      24x7
            notification_options     c
            register                 0
        }
    17. The puzzle completed
    18. Services Summary: Strike a balance between your service templates and your service definitions. Service groups are a very useful feature when used appropriately; used inappropriately, they are an administrative burden. Device life-cycle happens, so ensure your configuration isn’t burdened by over-complexity.
    19. Advanced
    20. Good Parenting (or how to not get woken up 20 times at ~3am)
        Parenting: Use host parenting. Use host parenting. Use host parenting.
        Service Dependencies: Parent indirectly monitored services with service dependencies.
    21. Indirect Services …And the art of dependencies: A typical ESX monitoring setup… Q. But what happens when the vSphere server fails?
    22. Indirect Services …And the art of dependencies: A. Something like this
    23. Indirect Services …And the art of dependencies
        define service {
            host_name                        vSphereServer
            service_description              Ping dependency
            use                              main-service-template
            check_command                    check_ping!100,80%!200,90%
            register                         1
        }

        define service {
            service_description              CPU Usage
            use                              main-service-template
            hostgroup_name                   srv-v-windows
            check_command                    check_esx!CPU
            contact_groups                   cg-main
            register                         1
        }

        define servicedependency {
            dependent_hostgroup_name         srv-v-windows
            dependent_service_description    CPU Usage
            host_name                        vSphereServer
            service_description              Ping dependency
            inherits_parent                  1
            execution_failure_criteria       w,u,c,p
            notification_failure_criteria    w,u,c
            dependency_period                24x7
        }
    24. Managing Exceptions: Clearly label exceptions in your config. Make sure you can use the same solution again if necessary. Image by Mike Bade: http://robotseatingpies.blogspot.com.au/2011/06/robots-dont-have-feelings_16.html
    25. Automation (or intrapreneurship ideas for the lazy): Every piece of infrastructure is a potential data source… make use of it! AD/LDAP servers. Virtual infrastructure APIs. Patching systems. Asset databases. Network management platforms. Network LLDP/CDP tables. SNMP-enabled servers. Help, I’m running out of space!
    26. Q&A
    27. Thanks For Listening!
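
The deck's advice to keep host objects minimal so that they can be automated can be sketched in Python as follows. This is only an illustration under stated assumptions: the `HostRecord` fields, the `render_host` helper and the hard-coded inventory are hypothetical stand-ins for whatever data source (AD, a virtualization API, an asset database) actually feeds the configuration. The output reuses only directives that appear on the host definitions slide, and everything else is left to the `srv-template` template.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HostRecord:
    """One inventory row; the field names are hypothetical."""
    host_name: str
    alias: str = ""
    parents: List[str] = field(default_factory=list)
    hostgroups: List[str] = field(default_factory=list)


def render_host(record: HostRecord, template: str = "srv-template") -> str:
    """Render a minimal `define host` block; the template supplies the rest."""
    directives = [
        ("host_name", record.host_name),
        ("use", template),
        ("alias", record.alias or record.host_name),
        # The slides use the hostname, not an IP address, as the address.
        ("address", record.host_name),
    ]
    if record.parents:
        directives.append(("parents", ",".join(record.parents)))
    if record.hostgroups:
        directives.append(("hostgroups", ",".join(record.hostgroups)))
    lines = ["define host {"]
    lines += [f"    {key:<12}{value}" for key, value in directives]
    lines.append("}")
    return "\n".join(lines)


# Stand-in for data pulled from a real inventory source.
inventory = [
    HostRecord("exchange01", "Exchange server",
               ["switch001", "switch002"], ["srv-exchange", "srv-windows"]),
    HostRecord("switch001"),
]
config_text = "\n\n".join(render_host(record) for record in inventory)
print(config_text)
```

Because each host needs only a name, its parents and its host groups, the generator stays small; check settings, contacts and notification options live in the template and can be changed there without regenerating anything.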
