[Free OpManager training] Part 4- Network fault-management & IT automation

Week 4
Effective fault management and IT automation

1. How to identify the faults quickly?
2. How to prioritize the problems?
3. How do you get it resolved quickly?

All services are currently UP

Agenda
• Alarm severity levels
• Threshold violation alarms
• Other alarms: VMware; Event logs; SNMP traps and Syslogs
• Notifications
• Using an IT workflow to remediate problems
• Tips and tricks
• Questions

Severity Color code
Attention
Trouble
Critical
Service down
Clear

Device down
Interface down
Severity: predefined

Process down
Service down
URL down
Severity: predefined

Event log
Syslog
SNMP trap
Severity: configurable

Threshold-based alarms
• Configuring threshold values on an individual device
• Configuring consecutive times
• Configuring rearm value to clear fault alarms
• Using device templates to configure thresholds globally based on device type
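How the threshold value, consecutive times, and rearm value interact can be sketched in a few lines of Python. This is only an illustration under assumptions, not OpManager's implementation: the class name `ThresholdMonitor`, its fields, and the sample CPU values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ThresholdMonitor:
    """Hypothetical model of a threshold alarm (not OpManager code)."""
    threshold: float        # a sample above this counts as a violation
    rearm: float            # an active alarm clears at or below this value
    consecutive_times: int  # violations in a row needed to raise the alarm
    violations: int = 0
    alarm_active: bool = False

    def poll(self, value: float) -> str:
        """Feed one polled value; return 'raised', 'cleared', or 'no change'."""
        if not self.alarm_active:
            # Count violations in a row; a normal sample resets the count.
            self.violations = self.violations + 1 if value > self.threshold else 0
            if self.violations >= self.consecutive_times:
                self.alarm_active = True
                self.violations = 0
                return "raised"
        elif value <= self.rearm:
            # The rearm value sits below the threshold to avoid flapping.
            self.alarm_active = False
            return "cleared"
        return "no change"


cpu = ThresholdMonitor(threshold=90, rearm=80, consecutive_times=3)
samples = [95, 85, 92, 93, 96, 88, 79]
results = [cpu.poll(v) for v in samples]
print(results)
```

In this run the single spike to 95 raises nothing, the alarm is raised on the third consecutive violation (96), stays active at 88 because that is still above the rearm value, and clears at 79.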

VMware alarms; Event logs; SNMP traps; Syslogs

VMware events
Alarms for inventory changes
o vMotion
o Host added/removed
o Host or VMs connected/disconnected
o VMs powered on/off
o VMs orphaned
o Scheduled task removed
o Etc.
Querying more events from the vCenter server / ESX host

Event log alarms
Prerequisites
o Check if WMI and RPC services are enabled on the Windows servers
o Default WMI ports: 135 & 445, 5000 to 6000 (TCP)
• Configuring event logs for a Windows server in OpManager
• Ignoring a specific event log from a Windows server
• Configuring OpManager to handle event floods (http://help.opmanager.com/stopping-event-flood)
o serverparameters.conf (OpManager/conf/OpManager)
o EVENTS_PER_HOUR 1000
o EVENT_FLOOD_SEVERITY Critical
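The idea behind a per-hour event cap such as EVENTS_PER_HOUR can be illustrated with a small sliding-window counter. This is a sketch under assumptions: the class `EventFloodGuard` and its methods are hypothetical, and how OpManager actually counts events and applies EVENT_FLOOD_SEVERITY is described on the help page linked above, not here.

```python
from collections import deque


class EventFloodGuard:
    """Hypothetical sliding-window limiter (not OpManager's own logic)."""

    def __init__(self, events_per_hour: int = 1000, window_seconds: int = 3600):
        self.limit = events_per_hour
        self.window = window_seconds
        self.timestamps = deque()  # arrival times of accepted events

    def accept(self, now: float) -> bool:
        """Return True if an event arriving at time `now` (seconds) is processed."""
        # Forget events that have fallen out of the one-hour window.
        while self.timestamps and now - self.timestamps[0] >= self.window:
            self.timestamps.popleft()
        if len(self.timestamps) >= self.limit:
            return False  # cap reached: treat the event as part of a flood
        self.timestamps.append(now)
        return True


guard = EventFloodGuard(events_per_hour=3)
decisions = [guard.accept(t) for t in (0, 1, 2, 3, 3600)]
print(decisions)
```

With a cap of 3, the fourth event inside the same hour is rejected, and an event one hour after the first is accepted again because the oldest timestamp has expired.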

SNMP trap alarms
5 things that you should know about SNMP traps in OpManager
1. Unsolicited traps
2. Varbinds
3. Failure component
4. Loading traps from MIB files
5. Forwarding trap messages to another NMS platform
[Diagram: SNMP agents on the router, switch, firewall, and server send traps (port 162) to the OpManager trap receiver; other labels: management definitions, management database]

#1 Unsolicited traps
I have configured a router to forward SNMP traps to OpManager's server, but I don't see an alarm. How do I fix this?
Things to verify:
• Verify whether the router is added to OpManager
• Verify whether a 'Trap rule' is available for the respective event
• Verify whether the trap event is listed under 'Unsolicited traps'
Solution: Identify the event under 'Unsolicited traps' and add a new trap rule for it
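The three checks above amount to a simple decision path, sketched here in Python. The function `classify_trap`, the device address, and the rule name are hypothetical and only model the checklist; they are not an OpManager API. The two OIDs are the BGP trap OIDs used later in this deck.

```python
def classify_trap(source_ip, trap_oid, known_devices, trap_rules):
    """Model the checklist: is the device added, and is there a rule for the OID?"""
    if source_ip not in known_devices:
        return "no alarm: device is not added"
    if trap_oid not in trap_rules:
        return "no alarm: listed as unsolicited trap"
    return "alarm raised by rule: " + trap_rules[trap_oid]


known_devices = {"10.0.0.1"}                       # hypothetical router address
trap_rules = {".1.3.6.1.2.1.15.7.2": "BGP down"}   # hypothetical rule name

print(classify_trap("10.0.0.9", ".1.3.6.1.2.1.15.7.2", known_devices, trap_rules))
print(classify_trap("10.0.0.1", ".1.3.6.1.2.1.15.7.1", known_devices, trap_rules))
print(classify_trap("10.0.0.1", ".1.3.6.1.2.1.15.7.2", known_devices, trap_rules))
```

Only the last call produces an alarm; the second one is the case the slide's solution addresses, where a rule still has to be created from the unsolicited trap.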

#2 Varbinds
I have a Windows server added to OpManager. It triggers hundreds of trap events with various messages from the x.x.x.x OID. I want to filter the trap events so that an alarm is raised only if the priority is 'critical' and is cleared automatically when the priority is 'low'. How do I achieve this?
Know
• What is a varbind?
• How to identify the varbinds in a trap event?
Solution: Use 'match criteria' to filter and clear the trap alarms based on 'varbinds'
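A varbind (variable binding) is an OID/value pair carried inside the trap. The match-criteria idea can be sketched as below, assuming the priority arrives as the value of one varbind. The key `"<priority-varbind-oid>"` is a placeholder, not a real OID, and `evaluate_trap` is a hypothetical helper rather than OpManager's matching engine.

```python
PRIORITY_KEY = "<priority-varbind-oid>"  # placeholder for the varbind that holds the priority


def evaluate_trap(varbinds):
    """Decide what to do with a trap based on the value of its priority varbind."""
    priority = str(varbinds.get(PRIORITY_KEY, "")).strip().lower()
    if priority == "critical":
        return "raise alarm"
    if priority == "low":
        return "clear alarm"
    return "ignore"  # every other priority is filtered out


traps = [
    {PRIORITY_KEY: "Critical", "<message-varbind-oid>": "disk failure"},
    {PRIORITY_KEY: "medium", "<message-varbind-oid>": "disk busy"},
    {PRIORITY_KEY: "low", "<message-varbind-oid>": "disk ok"},
]
actions = [evaluate_trap(t) for t in traps]
print(actions)
```

The same trap OID therefore produces three different outcomes, depending only on the varbind value.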

#3 Failure component
I have a switch added to OpManager. It triggers a failure trap event for BGP down from the .1.3.6.1.2.1.15.7.2 OID and a clear event for BGP up from the .1.3.6.1.2.1.15.7.1 OID. This generates two different alarms in OpManager. I want the clear alarm for the BGP up event merged with the original alarm, as it is for the same link.
How do I achieve this?
Solution: Provide a common 'failure component' in both trap rules
It generates two different alarms because OpManager receives the traps from two different OIDs and each one has a separate trap rule
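Why a common failure component merges the two events can be shown with a toy model in which an alarm is identified by the pair (source, failure component). That keying is an assumption made for illustration; `apply_traps`, the device name `switch1`, and the component name `BGP_peer` are hypothetical.

```python
def apply_traps(traps):
    """Fold (source, failure_component, severity) events into one alarm per key."""
    alarms = {}
    for source, component, severity in traps:
        # The latest event for the same key updates the existing alarm.
        alarms[(source, component)] = severity
    return alarms


# Each trap rule uses its own OID as the failure component: two alarms result.
separate = apply_traps([
    ("switch1", ".1.3.6.1.2.1.15.7.2", "Critical"),  # BGP down
    ("switch1", ".1.3.6.1.2.1.15.7.1", "Clear"),     # BGP up
])

# Both trap rules share one failure component: the clear updates the same alarm.
merged = apply_traps([
    ("switch1", "BGP_peer", "Critical"),
    ("switch1", "BGP_peer", "Clear"),
])

print(len(separate), len(merged), merged[("switch1", "BGP_peer")])
```

With separate components the critical alarm is never cleared; with the shared component there is one alarm, and its final state is Clear.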

Syslog alarms
Prerequisites
o Configure devices to forward syslog events to OpManager's server
o Default ports: 514 & 519 (UDP); configurable
• Creating a syslog rule
o Syslog receiver
• Using facility name, severity, or match text to filter and clear syslog alarms (regex format)
• Identifying the syslog flow rate from OpManager
• Forwarding syslog messages to another NMS platform
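Filtering on facility, severity, and match text can be sketched as follows. A syslog message begins with a priority value `<PRI>`, where PRI = facility * 8 + severity. The helper names `parse_syslog` and `rule_matches`, the sample message, and the regular expression are illustrative assumptions, not OpManager's rule engine.

```python
import re

# Standard syslog severity names, indexed by severity code 0..7.
SEVERITIES = ["emergency", "alert", "critical", "error",
              "warning", "notice", "informational", "debug"]
PRI_RE = re.compile(r"^<(\d{1,3})>(.*)$", re.DOTALL)


def parse_syslog(line):
    """Split a raw syslog line into facility code, severity name, and message."""
    m = PRI_RE.match(line)
    if m is None:
        return None
    pri = int(m.group(1))
    return {"facility": pri // 8,
            "severity": SEVERITIES[pri % 8],
            "message": m.group(2)}


def rule_matches(event, facility, severities, pattern):
    """A rule fires when facility, severity, and the regex on the text all match."""
    return (event["facility"] == facility
            and event["severity"] in severities
            and re.search(pattern, event["message"]) is not None)


event = parse_syslog("<187>%LINK-3-UPDOWN: Interface Gi0/1, changed state to down")
raise_alarm = rule_matches(event, 23, {"error", "critical"}, r"changed state to down")
clear_alarm = rule_matches(event, 23, {"error", "critical"}, r"changed state to up")
print(event["facility"], event["severity"], raise_alarm, clear_alarm)
```

Here 187 decodes to facility 23 (local7) and severity 3 (error), so the "down" rule fires and the "up" rule, which would clear the alarm, does not.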

Notification cycle
Profile type
- Send email or SMS
- Run system command
- Run program
- Log a ticket
- Web alarm
- Syslog
- Trap
Alarm criteria
- Device down
- Service down
- Hardware fault
- Threshold violation
- Virtual device fault
- UCS fault
Device selection
- Category
- Business view
- Devices
Schedule
- All the time
- Selected time window
- Delayed trigger
- Recurring trigger
Preview
- Verify inputs
- Add a profile
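The notification cycle above can be read as a set of conditions that must all hold before a profile fires. A minimal sketch, assuming a profile is just a profile type, a set of alarm criteria, a set of devices, and an optional time window; `NotificationProfile` and the device names are hypothetical, and the delayed and recurring triggers are left out.

```python
from dataclasses import dataclass
from datetime import time
from typing import Optional, Tuple


@dataclass
class NotificationProfile:
    """Hypothetical model of a notification profile (not an OpManager object)."""
    profile_type: str                           # e.g. "email"
    criteria: set                               # alarm criteria that trigger it
    devices: set                                # devices it is associated with
    window: Optional[Tuple[time, time]] = None  # None means "all the time"

    def should_notify(self, alarm_type: str, device: str, at: time) -> bool:
        if alarm_type not in self.criteria or device not in self.devices:
            return False
        if self.window is None:
            return True
        start, end = self.window  # assumes the window does not cross midnight
        return start <= at <= end


profile = NotificationProfile(
    profile_type="email",
    criteria={"Service down"},
    devices={"web-01", "web-02"},
    window=(time(9, 0), time(18, 0)),
)

print(profile.should_notify("Service down", "web-01", time(10, 30)))  # inside the window
print(profile.should_notify("Service down", "web-01", time(22, 0)))   # outside the window
print(profile.should_notify("Device down", "web-01", time(10, 30)))   # criteria not selected
```

Only the first call returns True: the alarm type is selected, the device is associated, and the time falls inside the window.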

#1 Email notification
I want to receive an email notification for all service down alarms. How do I configure this?
Steps:
1. Configure mail server settings
2. Create a notification profile for 'email'
- Select the required 'alarm criteria'
- Associate the profile with 'required devices'

#2 Log a ticket
I want OpManager to create a ticket in ServiceDesk Plus whenever a problem is detected in the interface. The ticket should have fields like category, group, and technician filled in automatically.
Steps:
1. Set up the integration with ServiceDesk Plus
2. Create a notification profile for 'log a ticket'
- Select the category, group, and technician
- Select the required 'alarm criteria'
- Associate the profile with 'required devices'

IT workflow automation
1. Create a workflow
2. Associate devices
3. Schedule/trigger tasks
• Get more space on the server for better performance
• Test SNMP service
• Export/import available templates
https://resources.manageengine.com/resources/forum/opmanager/workflows

Tips and tricks
• Configure device dependencies to stop polling a dependent device when its parent device is down
• Suppress known alarms from an individual device
• Configure the downtime scheduler and stop polling devices during maintenance windows
• Configure alarm escalation and notify the super admin when a critical alarm is not cleared within a given amount of time
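The device-dependency tip can be sketched as a walk up the parent chain: a device is skipped when any device above it is down. The function `devices_to_poll`, the `parents` mapping, and the device names are hypothetical and only illustrate the idea.

```python
def devices_to_poll(parents, down):
    """Return devices still worth polling, given parent links and down devices."""
    def blocked(device):
        seen = set()  # guards against accidental loops in the mapping
        parent = parents.get(device)
        while parent is not None and parent not in seen:
            if parent in down:
                return True
            seen.add(parent)
            parent = parents.get(parent)
        return False

    return [d for d in parents if not blocked(d)]


parents = {
    "core-router": None,        # top of the tree, no parent
    "switch-1": "core-router",
    "server-1": "switch-1",
    "server-2": "switch-1",
}
to_poll = devices_to_poll(parents, down={"switch-1"})
print(to_poll)
```

With switch-1 down, both servers behind it are skipped, while the switch itself is still polled so that its recovery can be detected.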

Need more help?
youtube.com/opmanagertechvideos
help.opmanager.com
forums.manageengine.com/opmanager
opmanager-support@manageengine.com
+1 (888) 720-9500 / +1 (408) 916-9400

Free ITOM Seminar
https://www.manageengine.com/itom/seminars/chicago-la-2018.html

www.manageengine.com
THANK YOU
