Enterprise Application
Monitoring with
11/10/2007
Presented by James Peel
james.peel@altinity.com / www.altinity.com
1
Who am I?
• James Peel - james.peel@altinity.com
• Job: Managing Director of Altinity
- Pre-sales and implementation
- Introducing new bugs into Opsview
- Running Altinity
• Luggage: Munich airport
2
What is Opsview?
• Altinity’s monitoring software
• Abstraction layer above Nagios ‘engine’
• Provides a framework for adding new
monitoring functionality
• Released under GPL v2
http://opsview.org/
3
Objectives
• Talk through our approach to
application monitoring
• Flag issues and discuss possible
solutions
• Talk about some customer examples
4
“Network Monitoring”
5
Traditional
approach...
• Monitor network infrastructure
• Monitor servers
• Monitor network services - DNS, HTTP,
SMTP, etc
6
Thoughts
• There are a thousand tools that can do
“network monitoring”
• Just a pre-requisite for monitoring
critical applications
• Network monitoring is about technology.
Increasingly, organisations want a
business-oriented view of their IT
systems
7
Application
Monitoring?
• Enterprise applications: eCommerce
sites, claims processing, CRM,
document management, etc
• Sit across multiple pieces of
infrastructure
• Often developed as web applications
8
Application
Monitoring
- Common elements -
9
Web Server
Presentation Layer
Monitoring requirements:
• Availability
• Performance
• Resources
• Events
10
Application Server
Business Logic
Monitoring requirements:
• Availability
• Performance
• Resources
• Events
11
Database Server
A persistent store of application data
Monitoring requirements:
• Availability
• Performance
• Resources
• Events
12
OS / Hardware
“Platform” for application
Monitoring requirements:
• Availability
• Performance
• Resources
• Events
13
Network
Infrastructure
• Content Switch (load balancer)
• Firewall
• External communication
14
Summary
• Standard monitors used in standard
ways
• Don’t actually need to know what
application does
15
Challenges
• Deciding what *not* to monitor
• Dealing with clusters...
16
Challenges - Cluster
Monitoring
• Active / active clusters
• Monitor individual elements to
ensure operation
• Represent clustered applications as
host in Nagios
• Monitor cluster via “VIP”
17
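As a rough illustration of the active / active approach, the per-node checks and the VIP check could be rolled up into one status for the host that represents the cluster. This is a minimal sketch, not how Opsview or Nagios does it: the function name `cluster_state` and the rule "VIP down is critical, a failed node behind a working VIP is a warning" are assumptions made for the example.

```python
# Standard Nagios plugin states.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def cluster_state(vip_state, node_states):
    """Summarise an active/active cluster.

    vip_state   -- state of the check run against the VIP
    node_states -- dict mapping node name to the state of its own check
    Returns (state, message). The grading rule is an assumption for this sketch.
    """
    if vip_state != OK:
        # Users reach the application through the VIP, so its state wins.
        return vip_state, "cluster VIP check is not OK"
    failed = sorted(name for name, state in node_states.items() if state != OK)
    if failed:
        # Service is still up, but capacity or redundancy is reduced.
        return WARNING, "VIP OK, degraded nodes: " + ", ".join(failed)
    return OK, "VIP and all nodes OK"
```

For example, `cluster_state(OK, {"web1": CRITICAL, "web2": OK})` reports a warning that names `web1`, while a failed VIP is reported as critical regardless of the node states.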
Challenges - Cluster
Monitoring
• Active / passive clusters
• Basic monitoring of individual
elements
• Use negate plugin to confirm passive
node is offline
• Monitor application via “VIP” if
possible
18
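The idea behind the negate plugin can be sketched in a few lines: on a node where the service should not be running, a "failed" check is the healthy result. To my understanding, the stock `negate` wrapper swaps OK and CRITICAL by default and leaves other states alone; the function below only mimics that default mapping and is not the plugin itself.

```python
# Standard Nagios plugin states.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def negate(state):
    """Invert a check result: OK becomes CRITICAL and CRITICAL becomes OK.

    WARNING and UNKNOWN pass through unchanged, so a broken check
    (for example a timeout reported as UNKNOWN) is not hidden.
    """
    if state == OK:
        return CRITICAL
    if state == CRITICAL:
        return OK
    return state
```

Applied to a service check on the idle node, a CRITICAL result ("service not running") is reported as OK, and an unexpected OK ("service is running here too") raises a CRITICAL alert.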
Challenges - Cluster
Monitoring
Longer term approach
• use API to reflect changes to cluster
in monitoring
19
Application
Monitoring
- Application specific elements -
20
Database Instances
• Database schema
• Batch jobs
• Processing queues
• Data import / export
21
Application Software
• Deployment status
• Error queues
• Threads, memory pools, instances
• Application log files
22
Summary
• Standard monitors used in very specific
ways
• Need detailed understanding of
application
23
Challenges
• Ensuring custom monitors don’t place
unnecessary load on application
• Getting information from relevant
parties
• Building support for monitoring into
application design
24
Application
Monitoring
- Synthetic Transactions -
25
End-to-end
monitoring
• Objective is to confirm entire system
is working
• Provides a user centric view
• Are we getting expected result?
• Did transaction complete within
expected timeframe?
26
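The two questions on this slide (expected result, expected timeframe) map naturally onto a plugin-style check. The sketch below is illustrative only: `check_transaction`, its parameters and the injectable `clock` are names invented for this example, and the transaction itself is passed in as a plain callable.

```python
import time

OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios plugin exit codes


def check_transaction(transaction, expected, warn_secs, crit_secs,
                      clock=time.monotonic):
    """Run one synthetic transaction and grade it on result and duration."""
    start = clock()
    try:
        result = transaction()
    except Exception as exc:
        # Any failure along the path means the system is not working end to end.
        return CRITICAL, f"CRITICAL - transaction failed: {exc}"
    elapsed = clock() - start
    # Performance data in the usual label=value;warn;crit form.
    perfdata = f"time={elapsed:.3f}s;{warn_secs};{crit_secs}"
    if result != expected:
        return CRITICAL, f"CRITICAL - unexpected result {result!r} | {perfdata}"
    if elapsed >= crit_secs:
        return CRITICAL, f"CRITICAL - took {elapsed:.3f}s | {perfdata}"
    if elapsed >= warn_secs:
        return WARNING, f"WARNING - took {elapsed:.3f}s | {perfdata}"
    return OK, f"OK - completed in {elapsed:.3f}s | {perfdata}"
```

A real plugin would print the message and exit with the returned state; the `clock` argument exists only so the timing logic can be exercised without waiting.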
“Web applications”
Designed for use by people!
• Session based authentication
• Layout is app specific and can
change
• Objective is to confirm entire system
is working
27
“Web applications”
Example synthetic transaction plugin
1. Authenticate with website
2. Check account balance
3. Get first three market IDs
4. Get information for market IDs
28
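The four steps above could be sketched as follows. Everything about the `client` object is hypothetical: `login`, `get_balance`, `list_market_ids` and `get_market` stand in for whatever session handling and page or API calls the real application needs, and the rule that an empty market list counts as a failure is an assumption.

```python
OK, CRITICAL = 0, 2  # Nagios plugin states used by this sketch


def synthetic_transaction(client, username, password):
    """Walk through the four steps and report which one failed, if any.

    `client` is a hypothetical object wrapping the application under test.
    """
    step = "authenticate"
    try:
        client.login(username, password)

        step = "check account balance"
        balance = client.get_balance()

        step = "get market IDs"
        market_ids = list(client.list_market_ids())[:3]  # first three only
        if not market_ids:
            raise ValueError("no market IDs returned")

        step = "get market information"
        for market_id in market_ids:
            if not client.get_market(market_id):
                raise ValueError(f"no data for market {market_id}")
    except Exception as exc:
        return CRITICAL, f"CRITICAL - step '{step}' failed: {exc}"
    return OK, f"OK - all steps completed, balance={balance}, markets={market_ids}"
```

Naming the failing step in the output gives the support team a starting point, which matters when a single check spans the web, application and database tiers.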
Web services / APIs
Designed for use by machines!
• Objective is to confirm entire system
is working
• Are we getting expected result?
• Did transaction complete within
expected timeframe?
Much easier :)
29
Application
Monitoring
- System event logs -
30
System event logs
• Very popular!
• *Lots* of useful data
• Separate logs for:
operating system, database server, database instances,
batch processing jobs, data import / export, application
server, application instances, application messaging, web
server, web sites, content switch, firewall, router.
31
Log file monitoring
• Common formats (Log4J / syslog)
• Challenges:
- How do we identify “interesting”
entries?
- Volume of new log entries often high
- How much information is too much?
- Correlation between events
32
Log file monitoring
Altinity Approach - Log Daemon
• Set definition for log file type
• Create ‘rules’ file to match log entries
• Pass data back to server via NRPE
33
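To make the 'rules' idea concrete, here is a minimal sketch of matching log entries against an ordered rule list. It is not the Altinity log daemon or its rules file format: the `RULES` table, the patterns in it and the `scan` function are assumptions chosen for illustration.

```python
import re

OK, WARNING, CRITICAL = 0, 1, 2  # Nagios plugin states

# Hypothetical rules: (pattern, state). Order matters, first match wins.
RULES = [
    (re.compile(r"OutOfMemoryError"), CRITICAL),
    (re.compile(r"\bERROR\b"), WARNING),
]


def scan(lines, rules=RULES):
    """Return the worst state seen and the list of (state, line) matches."""
    worst = OK
    matches = []
    for line in lines:
        for pattern, state in rules:
            if pattern.search(line):
                matches.append((state, line.rstrip("\n")))
                worst = max(worst, state)
                break  # stop at the first rule that matches this entry
    return worst, matches
```

The worst state and a short summary of the matches would then be returned as the check result, which is the kind of small payload that fits a remote check over NRPE. Entries that match no rule are ignored, in line with alerting only on selected events.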
Log file monitoring
Altinity Approach - Lessons learned
• Start by focussing on high priority
events, don’t try and alert on
everything
• “Correlation” achieved by using
specific logs to alert on specific
events
• We don’t currently try and use rules to
process multiple log events
34
Questions?
Don’t worry, there is still more to come...
35
Application
Monitoring
- We’re monitoring our application, so what now? -
36
Displaying status
information
• Network oriented view for technical
support teams
• Pretty green lights for CEO
• Business oriented view for everyone
else
37
Hierarchical View
• Provides “10,000 meter” view of
network
• Allows engineer to drill down
exposing more detail
• Differentiates between handled and
unhandled events
38
Hierarchical View
39
Keyword View
• Displays hosts / service checks based
on keywords
• Great for application monitoring
40
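A keyword view amounts to grouping service checks by tag and summarising each group. The sketch below assumes a simple tuple layout for check results and a `keyword_view` function; neither reflects Opsview's actual data model.

```python
from collections import defaultdict

OK, WARNING, CRITICAL = 0, 1, 2  # Nagios states; UNKNOWN is left out for simplicity


def keyword_view(checks):
    """Group checks by keyword.

    checks -- iterable of (host, service, state, keywords) tuples
    Returns {keyword: (worst_state, [(host, service, state), ...])}.
    """
    grouped = defaultdict(list)
    for host, service, state, keywords in checks:
        for keyword in keywords:
            grouped[keyword].append((host, service, state))
    return {
        keyword: (max(state for _, _, state in items), items)
        for keyword, items in grouped.items()
    }
```

Tagging the web, application and database checks of one application with the same keyword gives a single line per application, regardless of how many hosts it spans.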
Keyword View
41
Reporting
Opsview uses three databases
• Configuration database
• Runtime database - NDOUtils
• Data Warehouse (ODW)
42
Reporting
Challenges
• Figuring out correct table structure for
ODW
• Aggregating data into suitable form
for reporting
• Calculating table sizes and storage
requirements
43
Reporting
• Opsview Reports: Uses BIRT
• Business Objects
44
Choosing right
approach
- Organic vs Process Driven -
45
Organic approach to
monitoring
• Deploy servers and Opsview software
• Add hosts to system with basic
monitoring (ping)
• Start figuring out what to monitor for
each host
• Refine, review, extend, repeat
46
Issues with organic
approach
• Hard to accurately predict project
timescales
• Often hit unexpected issues mid way
through
• Easy to end up with inconsistencies in
monitoring configuration
47
Process driven
• Develop monitoring catalogue
• Refine and review catalogue until
complete
• Use catalogue to implement
monitoring system
• Take advantage of customer’s change
control processes
• Consider a parallel QA system for
testing changes
48
Monitoring
Catalogue?
• Defines monitors - input parameters,
expected output
• Defines templates
• Defines notification profiles
• Used for validating system when
implemented
49
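One way to picture a catalogue entry, and how it supports validation afterwards, is as a small record per monitor. This is a sketch under assumptions: the `MonitorSpec` fields are taken from the bullets above, but the field names, the `validate` function and the idea of comparing against a dictionary of implemented monitors are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonitorSpec:
    """One catalogue entry (hypothetical field names)."""
    name: str
    plugin: str
    parameters: tuple          # input parameters passed to the plugin
    expected_output: str       # what a healthy result looks like
    template: str              # template the monitor belongs to
    notification_profile: str  # who is notified, and how


def validate(catalogue, implemented):
    """Compare the catalogue with what was actually configured.

    catalogue   -- list of MonitorSpec
    implemented -- dict mapping monitor name to the MonitorSpec found in the system
    Returns (missing, mismatched) lists of monitor names.
    """
    missing = [spec.name for spec in catalogue if spec.name not in implemented]
    mismatched = [
        spec.name
        for spec in catalogue
        if spec.name in implemented and implemented[spec.name] != spec
    ]
    return missing, mismatched
```

An empty pair of lists means the implemented system matches the agreed catalogue; anything else is a concrete list of gaps to take through change control.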
Advantages
For application monitoring...
• Allows custom monitoring
requirements to be established in
advance
• Flags any gaps in knowledge
• Helps avoid requirements creep
• Saves time during implementation
50
Customer Insights
- What worked and what didn’t -
51
Irish Revenue
• Security model - node in each zone
• Synthetic transaction may take 10
mins to complete
• Prod and DR environments
• Moving target because not live
52
“Health Insurance
Company”
• Had to reverse Opsview master /
slave comms
• Integration with events management
system on IBM mainframe
• Very specific application monitors
53
“Online Betting
Exchange”
• Scale of their system
• Moving to < 1 minute monitoring
interval
• Opsview API developed for their test
and QA environment
54
Questions?
Don’t panic, that really is the end...
55
Contact Information
James Peel
E: james.peel@altinity.com
T: +44 (0)870 787 9243
- Opsview: http://opsview.org/
- Altinity: http://altinity.com/
56

Nagios Conference 2007 | Enterprise Application Monitoring with Nagios by James Peel