Introduction to Allmon (0.1.0) - a generic performance and availability monitoring system

A short introduction to Allmon,
a generic performance and availability monitoring system

Tomasz Sikora – London 2009

List of topics
● Source of performance problems
● Continuous monitoring and metrics acquisition
● Allmon architecture (scalability, messaging)
● Deployment (distributed system)
● Configuration
● Analysis (use cases)
● Questions

Source of performance problems
● Lack of understanding (knowledge)
● Poorly defined requirements (functional/non-functional)
● Inexperienced developers
● Poorly trained users
● Lack of visibility
● Developers have to understand operational aspects
● Multi-layer system monitoring is essential
● Not enough testing
● Load tests (integration, regression) + soak tests
● Monitoring for "before-after" load test comparison, with all results stored

Continuous monitoring (metrics acquisition)
Diagram: Monitored system
Application: Health Checks, Interfaces, Presentation tier
Services: JVM, Application Server, JMS Broker, DB, Web Container
Physical: OS, Net, CPU/Mem/IO

● Monitoring multi-layer enterprise systems
● Main layers: Application, Services, Physical
● Service Health Checks
● Distributed system
● Messaging: isolation, non-intrusive, reliable
● What to monitor, what to analyse?
● Collected data can be used for correlation analysis
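
The correlation analysis mentioned on this slide can be illustrated with a minimal sketch. It computes the Pearson correlation coefficient between two metric series sampled at the same timestamps. The class name, method name and sample values are hypothetical; the slides do not say which correlation measure Allmon uses.

```java
import java.util.Locale;

public class MetricCorrelation {

    /**
     * Pearson correlation coefficient of two equally long metric series.
     * Returns NaN when either series is constant (zero variance).
     */
    public static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("series must have equal length >= 2");
        }
        double meanX = 0, meanY = 0;
        for (int i = 0; i < x.length; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= x.length;
        meanY /= y.length;

        double cov = 0, varX = 0, varY = 0;
        for (int i = 0; i < x.length; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        return cov / Math.sqrt(varX * varY);
    }

    public static void main(String[] args) {
        // Hypothetical samples: active users and used heap (MB) at the same timestamps.
        double[] activeUsers = {10, 20, 30, 40, 50};
        double[] heapUsedMb = {110, 205, 310, 395, 505};
        System.out.printf(Locale.ROOT, "r = %.3f%n", pearson(activeUsers, heapUsedMb));
    }
}
```

A coefficient close to 1 or -1 suggests that two metrics move together and are worth investigating; it does not by itself show which one causes the other.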

Allmon architecture
Diagram: Allmon Components. Logical overview – v.1.03a, London 2009.10.06
● Client-side: Collector (allmon-client) with Agents and an Aggregator, attached to Monitored System Objects
● Aggregator: pre-aggregates and sends to the client-side JMS Broker; aggregates, compresses and adds CRC
● Network between the client JMS Broker and the server JMS Broker; time synchronization
● Server-side: Loader (allmon-server) with Receiver and Raw Metrics Loader
● Receiver: receives metric messages and loads them to the DB
● Raw Metrics Loader: decodes raw metrics and loads them to allmetric storage
● DB: Raw Metrics, Allmetric, Views, Aggregates
● Miner (storing): Statistics, Views/Aggregates Generator, Correlation
● Viewer (querying): JPivot, Mondrian, allmetric views schema

Allmon architecture
● Collector (distributed client-side)
● Agents (Passive/Active agents collecting metrics)
● Aggregator (common pre-aggregating and sending
data mechanism)
● Loader (centralized server-side)
● Miner (transforming data to allmetrics, aggregating)
● Viewer (presentation, multidimensional analysis)
● Data storage
● Raw data storage (staging tier)
● Allmetric schema (generic 3NF structure)
● Aggregates, pre-calculated structures (access tier)
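
As an illustration of the Aggregator step (pre-aggregate, compress, add a CRC before sending), here is a minimal sketch using only the Java standard library. The text payload format, the choice of summing samples per metric name, and all class and method names are assumptions made for this example, not Allmon's actual wire format or API.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;
import java.util.zip.GZIPOutputStream;

public class AggregatorSketch {

    /** Sums raw samples per metric name (a stand-in for client-side pre-aggregation). */
    public static Map<String, Double> preAggregate(String[] names, double[] values) {
        Map<String, Double> sums = new TreeMap<>();
        for (int i = 0; i < names.length; i++) {
            sums.merge(names[i], values[i], Double::sum);
        }
        return sums;
    }

    /** Serializes the aggregates as "name=value" lines and gzips the result. */
    public static byte[] compress(Map<String, Double> aggregated) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Double> e : aggregated.entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(sb.toString().getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** CRC-32 checksum the receiver can recompute to detect a corrupted payload. */
    public static long crc(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload);
        return crc.getValue();
    }

    public static void main(String[] args) {
        // Hypothetical raw samples collected by agents.
        String[] names = {"jvm.heap", "action.login", "action.login"};
        double[] values = {512.0, 30.0, 45.0};
        byte[] payload = compress(preAggregate(names, values));
        System.out.println(payload.length + " bytes, crc=" + Long.toHexString(crc(payload)));
    }
}
```

In a real deployment the payload and its checksum would travel as one message through the JMS broker; that part is left out here to keep the sketch free of external dependencies.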

Deployment
Diagram: Allmon Deployment Diagram. Logical overview – v.1.03b, London 2009.10.06
● Client-side: Monitored Systems 1..N, each with its Monitored System Objects, Active and Passive Agents, a Collector and a client JMS Broker
● Server-side: Allmon server-side components (server JMS Broker, Loader, DB, Miner, Viewer)
● Many distributed allmon clients continuously collecting metrics
● Collected metrics are aggregated and sent asynchronously to the allmon server
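
An active agent polls a monitored object on a schedule. As one possible probe, the sketch below reads heap usage and garbage collection counters through the standard JMX platform MXBeans of the JVM it runs in. The class and method names are hypothetical, and a real agent would more likely connect to a remote JVM; this is not Allmon's agent code.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class JvmProbeSketch {

    /** Currently used heap of this JVM, in bytes. */
    public static long heapUsedBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    /** Total number of collections reported by all garbage collectors of this JVM. */
    public static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 when the collector does not report it
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("heap used (bytes): " + heapUsedBytes());
        System.out.println("GC collections:    " + totalGcCount());
    }
}
```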

Configuration
● Client-side
● Agents configuration
– Independent configuration for different types of agents
● Active agents scheduling
– Based on crontab (cron4j)
● Server-side
● Database connectivity
● Loading process scheduling
● Aggregate processes parametrisation
● Visualization set-up
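
To show what a crontab-style schedule for an active agent expresses, here is a simplified matcher for five-field patterns (minute, hour, day of month, month, day of week). It is a standard-library illustration, not cron4j's parser: it supports only `*`, `*/n`, single numbers and comma lists, requires all five fields to match, and uses hypothetical class and method names.

```java
import java.time.LocalDateTime;

public class CronPatternSketch {

    /** True if one field (e.g. "*", "*&#47;5", "0,30", "7") accepts the value; min is the field's lowest legal value. */
    public static boolean fieldMatches(String field, int value, int min) {
        for (String part : field.split(",")) {
            if (part.equals("*")) {
                return true;
            }
            if (part.startsWith("*/")) {
                int step = Integer.parseInt(part.substring(2));
                if (step <= 0) {
                    throw new IllegalArgumentException("step must be positive: " + part);
                }
                if ((value - min) % step == 0) {
                    return true;
                }
            } else if (Integer.parseInt(part) == value) {
                return true;
            }
        }
        return false;
    }

    /** True if the five-field pattern accepts the given time (simplified: every field must match). */
    public static boolean matches(String pattern, LocalDateTime t) {
        String[] f = pattern.trim().split("\\s+");
        if (f.length != 5) {
            throw new IllegalArgumentException("expected 5 fields: " + pattern);
        }
        return fieldMatches(f[0], t.getMinute(), 0)
                && fieldMatches(f[1], t.getHour(), 0)
                && fieldMatches(f[2], t.getDayOfMonth(), 1)
                && fieldMatches(f[3], t.getMonthValue(), 1)
                && fieldMatches(f[4], t.getDayOfWeek().getValue() % 7, 0); // 0 = Sunday
    }

    public static void main(String[] args) {
        // Hypothetical schedule: run an active agent every five minutes.
        LocalDateTime t = LocalDateTime.of(2009, 10, 6, 12, 15);
        System.out.println(matches("*/5 * * * *", t));
    }
}
```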

Allmon analysis - use cases
Collecting multi-tier system metrics is crucial for
understanding the system and, ultimately, finding performance
problems (two simple examples)
● Case study 1 - Growing JVM memory allocation (leaks) compared
across several releases and against user activity
– Input: JVM metrics via JMX (mem, GC), application action stats
– Output: differences in user activity, differences in application behaviour and
allocated resources
– Action: identifying areas in the code base responsible for huge deltas in memory
consumption
● Case study 2 - Inefficient interactions with services and databases
– Input: database metrics, DB OS stats, application action stats, intercepted
persistence-level calls
– Output: rankings of the longest-running application actions and of the biggest
products of execution count and execution time
– Action: easier prioritisation of the areas that have to be improved
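
The ranking from the second case can be sketched as follows: each application action is scored by the product of its execution count and its mean execution time, and the actions are sorted by that product. The class, field and action names and the sample figures are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ActionRankingSketch {

    /** Aggregated statistics for one application action. */
    public static final class ActionStats {
        public final String action;
        public final long executions;
        public final double meanMillis;

        public ActionStats(String action, long executions, double meanMillis) {
            this.action = action;
            this.executions = executions;
            this.meanMillis = meanMillis;
        }

        /** Product of execution count and mean execution time. */
        public double totalMillis() {
            return executions * meanMillis;
        }
    }

    /** Returns a new list sorted by total time, largest first. */
    public static List<ActionStats> rankByTotalTime(List<ActionStats> stats) {
        List<ActionStats> ranked = new ArrayList<>(stats);
        ranked.sort((a, b) -> Double.compare(b.totalMillis(), a.totalMillis()));
        return ranked;
    }

    public static void main(String[] args) {
        // Hypothetical figures: the slowest single action is not the most expensive overall.
        List<ActionStats> stats = Arrays.asList(
                new ActionStats("report", 5, 900.0),
                new ActionStats("search", 1000, 12.0),
                new ActionStats("login", 200, 30.0));
        for (ActionStats s : rankByTotalTime(stats)) {
            System.out.println(s.action + " total ms: " + s.totalMillis());
        }
    }
}
```

Ranking by the product rather than by mean time alone highlights frequently called actions whose individual calls look cheap but dominate the overall load.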

Allmon
2009
Allmon project page:
http://code.google.com/p/allmon/

Allmon user group:
http://groups.google.com/group/allmon

Code license: Apache License 2.0:
http://www.apache.org/licenses/LICENSE-2.0
