Performance Management and Diagnostics in
Distributed Java and .NET Applications
>> Rapidly resolve performance problems across the
software application lifecycle
EXECUTIVE SUMMARY
APPLICATION PERFORMANCE IN HETEROGENEOUS MULTI-SERVER CLUSTERED
Symptoms and Causes of Performance Problems
Fixing Performance Problems
TRADITIONAL TOOLS FOR APPLICATION PERFORMANCE MANAGEMENT
Administrator Tools
Need a Better Solution
PERFORMANCE MANAGEMENT IN APPLICATION LIFE-CYCLE
Application Performance Management Solution Requirements
DYNATRACE DIAGNOSTICS
Efficient Diagnostics
Out-of-the-box, Extensible Diagnostics
COMPARE YOURSELF
CONCLUSION
Today’s complex mission critical applications run in heterogeneous multi-server environments. When these
applications falter, business productivity grinds to a halt, users are inconvenienced – costs rise and profits
Modern technologies such as Ajax, Java and .NET and approaches such as SOA, EAI, and MDA enable
engineers to create and deploy applications rapidly. However, development tools generally do not enable
engineers to establish a good understanding of the application’s performance characteristics, and avoid
performance problems. Consequently, performance problems are discovered late in the application life-cycle
and have to be corrected at considerable time and expense.
In load-test and production environments, application performance management solutions typically consist
of server monitors. When performance problems occur, such monitors provide alerts, but not enough
information to diagnose the root cause because they do not look deeply inside the transaction execution to
identify the actual root cause. Due to their large overheads, development tools cannot be used in such
environments to troubleshoot the problems. As a result, IT personnel can spend hours or days trying to
reproduce and analyze these problems. Often limited by the available information, they ameliorate the
situation by adding resources or tuning at the server and system layer, without resolving the underlying
design or programming issue.
To eliminate wasted time and expense, IT organizations need a new class of application performance
management solutions to monitor and diagnose performance problems. These solutions must provide
detailed, transaction-specific diagnostic information for single and multi-server transactions. Such solutions
should support the requirements of system administrators, performance analysts, testers and developers
throughout the application life-cycle.
In contrast to traditional monitoring tools designed to detect the symptoms of performance problems by
measuring aggregate statistics at the server level, dynaTrace Diagnostics® has been expressly designed
not only to detect but also to diagnose the root cause of performance problems:
dynaTrace Diagnostics collects necessary contextual behavior data during transaction execution to
construct the transaction’s execution path, known as the PurePath®.
PurePath maps the transaction’s precise execution path, containing relevant sequence, timing,
resource usage and contextual information for each method/step the transaction executes.
If the transaction is executed on multiple servers, whether running on the same or different
machines, dynaTrace Diagnostics precisely measures and reveals the PurePath through all of these
To minimize overhead and impact on application performance, dynaTrace Diagnostics’ embedded,
dynamic, lightweight agents offload data they collect and send it to a central Diagnostics Server for
efficient, real-time, off-line analysis.
dynaTrace Diagnostics’ unique design enables IT personnel to:
Prevent performance problems by gaining a better understanding of the dynamic behavior of the
applications during development, and
Reduce time to repair by reconstructing the problem transaction quickly from captured data to
identify its root cause – enabling repair in minutes, not hours or days.
Today, a large number of mission-critical business processes are supported by
performance sensitive applications. Developers can rapidly create such applications
without writing a lot of “infrastructure” code using frameworks such as Java EE,
.NET, Ajax and Atlas, etc. These applications can scale quickly by accessing objects
and services located on other servers through built-in remoting capabilities –
allowing application deployment in a variety of distributed multi-server clustered
configurations. SOA and EAI drive this trend further by leveraging existing
applications and services in distributed environments.
While such frameworks speed development, they also hide inner workings that can
contribute significantly to resource consumption, especially if such capabilities are
misused. Consequently, mission-critical applications are often deployed with latent
performance issues that surface later in production. Industry surveys reveal that:
Among companies with $1B or more in revenues, nearly 85% experienced
incidents of performance degradation [1],
40% of the unplanned downtime is due to application failures, and
The cost of down time of mission-critical applications averages over
Industry surveys also show that:
IT groups spend 24% of their time in resolving application slow-downs [3], and
80% of unplanned downtime can be mitigated by application development and
operations working together [4].
Clearly, IT personnel spend too much time reacting to performance problems.
Current tools are ill-suited for resolving application performance bottlenecks:
development tools are inappropriate in production environments for many reasons
including high overhead; monitoring tools detect but do not provide detailed
diagnostic information necessary to resolve performance problems.
In order to reduce the time to resolve such problems, IT
personnel need a common, easy to use, low overhead
measurement and analysis system that can efficiently collect
necessary and sufficiently detailed diagnostic data, and speed
up root cause analysis.
In this paper, we first develop the requirements for such a system and then
introduce dynaTrace Diagnostics, which has been expressly designed to detect and
diagnose performance problems throughout the application life-cycle – from
development through production – at a very low overhead.
[1] Jean-Pierre Garbani, “Best Practices in Problem Management”, Forrester Research, June 23, 2004.
[2] Theresa Lanowitz, “Delivering Business Value Through Software Quality”, Gartner Symposium IT Expo 2004, October 17-22, 2004.
[3] http://www.e-channelnews.com/ec_storydetail.php?ref=412807 referring to Applied Research survey commissioned by Symantec.
[4] Theresa Lanowitz, “Delivering Business Value Through Software Quality”, Gartner Symposium IT Expo 2004, October 17-22, 2004.
Application Performance in
Heterogeneous Multi-Server Clustered
As discussed, today’s mission critical applications run in heterogeneous multi-server
environments. Figure 1 details an example application’s typical transaction flow,
which starts at the user’s Web browser, traverses the Java SE/EE servers for
authentication and Web page rendering, executes business logic in the .NET
servers, accesses mainframe databases and integrates external systems through
Figure 1: A Transaction in multi-server, heterogeneous, clustered application environment
The arrows along the red lines indicate the transaction’s high level execution path.
The detailed execution path enumerates the processing steps (method calls, servlet
invocation, etc.) and their context through various components in sequence of the
Various performance problems can occur during this execution. These problems can
lead to a variety of symptoms, some during the transaction’s execution and some
well after.
Symptoms and Causes of Performance Problems
An application may present a variety of symptoms of performance problems due to
a number of different causes. The causes include non-optimal use of pre-existing
software frameworks and/or their built-in remoting capabilities, other design errors,
coding errors, resource contention or inappropriate configuration settings. As
illustrated in Figure 2, these causes can occur anywhere along a transaction’s
Figure 2: Typical sources of performance problems in distributed applications
Performance Implications of Frameworks
Developers create modern applications rapidly by leveraging frameworks or
preexisting infrastructure software libraries. Therefore, when a transaction executes,
a significant amount of library code is executed as part of the transaction’s
execution path. Application performance therefore depends not only on code that
developers write specifically for the application but also on the facilities used from
underlying libraries – and the hidden interactions among them. Therefore,
developers need to understand the dynamic behavior of the underlying code and
choose the right set of capabilities from the framework.
Performance Implications of Distributed Deployment or Remoting
For handling large transaction volumes or enhanced scalability, frameworks allow
multi-tier software developed in a single application server environment to be easily
deployed in multi-server distributed configurations. However, when two application
tiers communicate across server and/or machine boundaries, performance can be
significantly affected. Such degradation depends upon the serialization or data
marshalling costs and network latencies, which in turn depend upon the number of
remote calls and the data transferred per call (Figure 3). If the application is not well
designed for remoting, code running on one tier can remotely access objects
resident on other tiers automatically, resulting in an unexpectedly large number of
remote calls or data transfers.
The performance effect of such poor design is generally not apparent during
development because developers typically work with single server configurations,
and even when they work with multi-server distributed configurations, they test the
software at low loads. Therefore, to eliminate latent performance problems due to
remoting, developers need to understand the effect of remoting by examining the
dynamic interactions of the components.
Figure 3: Latency introduced by remoting.
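The dependence of remoting cost on the number of calls and the data moved per call can be sketched with a rough cost model. This is an illustrative assumption, not a formula from this paper: it treats total remoting time as a fixed latency per call plus the bytes transferred divided by an effective throughput that folds in serialization cost. The class name, method name and all figures are hypothetical.

```java
// Hypothetical cost model for remoting overhead; all names and numbers are
// illustrative assumptions, not measurements from any real system.
final class RemotingCostModel {

    /**
     * Estimates total remoting time in milliseconds as
     * calls * latency + (calls * bytesPerCall) / throughput.
     * Serialization cost is assumed to be folded into the throughput figure.
     */
    static double estimateMillis(int remoteCalls, double latencyMsPerCall,
                                 long bytesPerCall, double bytesPerMs) {
        double transferMs = (remoteCalls * (double) bytesPerCall) / bytesPerMs;
        return remoteCalls * latencyMsPerCall + transferMs;
    }

    public static void main(String[] args) {
        // "Chatty" design: 1,000 fine-grained calls of 200 bytes each.
        double chatty = estimateMillis(1000, 2.0, 200, 100_000.0);
        // "Coarse" design: the same 200,000 bytes fetched in a single call.
        double coarse = estimateMillis(1, 2.0, 200_000, 100_000.0);
        System.out.printf("chatty=%.1f ms, coarse=%.1f ms%n", chatty, coarse);
    }
}
```

Under these assumed figures the same amount of data costs far more when it is fetched through many small calls, because the per-call latency dominates. This is the effect that stays hidden when developers test on a single server.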
Fixing Performance Problems
Table 1 enumerates a number of symptoms of performance problems and their
probable causes. Clearly, there can be many causes for each symptom. This implies
that the symptom of a performance problem does not explicitly reveal the cause of
the problem. Often what is thought of as a cause is really a symptom and one may
need to drill down recursively to find the root cause.
To find the root cause, it is important – and in many cases imperative – to identify
the individual transaction(s) experiencing the performance problems and their
execution path in the environment in which they are executing. This data must be
sufficient for properly and efficiently diagnosing the problem. Without such
information, performing problem diagnosis is the same as shooting in the dark and
it is easy to jump to the wrong conclusions.
Symptom, followed by its sample causes:

High response time for specific transactions or most transactions
• Excessive resource consumption by transaction
• Too much synchronization wait time
• Too much time to get inside the connection or server pool
• Improper settings such as pool size
• Excessive delay for external web-services

Erratic transaction response time
• Excessive garbage collection
• High resource utilization
• Erratic response of external web-services

Application failures or time outs
• Programming errors – improper error condition handling
• Data specific problems
• File handler exhaustion

High CPU utilization
• Poor/Inefficient algorithms
• Poor design choices consuming significant time in underlying layers
• Poor implementation – redundant work
• Improper transaction routing

High memory utilization or too frequent garbage collection
• Memory leaks
• Objects persist for unnecessarily long time
• Pool size too large
• Undersized system
• Lots of short lived objects

High network utilization between
• Too many remoting calls
• Too much data transfer per call – poor design; lack of cohesion

High IO Rate
• Too many SQL calls – improper database or query design
• Pool size too large for configuration leading to thrashing

Too high synchronization delays
• Poor algorithm design - not enough parallelism
• Excessive execution time for sub-transactions
• Locks being held for too long

Excessive resource consumption by transaction
• Poor algorithms
• Poor design choices consuming significant time in underlying layers
• Poor implementation – redundant work
• Too many remote calls
• Too much data transfer for remote calls
• Objects held for too long
• Poor SQL query and/or database design

Long pool queue or utilization
• Too much resource consumption by transactions
• Large transaction execution time for other reasons including too much
  remoting, large synchronization delays or delays for external services or
• Incorrectly sized pool

Table 1: Performance problem symptoms and typical causes
Traditional Tools for Application Performance Management
As noted earlier, it is important for the developers to understand the dynamic
behavior and performance characteristics of their design choices in order to build a
well performing application. When a transaction experiences performance
problems, to really fix the problems, it is critical that IT personnel do proper
diagnosis and identify the exact causes and locations of the deficiencies. Proper
tools are needed to perform the job and we therefore discuss the effectiveness of
traditional tools in preventing and eliminating performance problems. Traditional
performance problem detection and resolution tools fall into the following broad
categories:
Developer tools including debuggers, loggers and other forms of custom
instrumentation, code profilers, and
Administrator tools, which primarily include server monitors and system
utilities.
Debuggers are an integral part of the developers’ tool kit. They enable developers
to go through specific execution steps at a controlled pace and allow them to focus
on a specific small area of interest at a time but do not significantly enhance the
overall understanding of the interactions among components and layers. Debuggers
are not suited for use in production or load-test environments because, for example,
(a) they stall the application, making performance measurements impossible, (b)
they create high overhead, (c) they require users to be expert programmers with
access to source code, and (d) one can look at only one thread at a time, which can
cause timeouts (e.g. XA transaction timeout or servlet timeout).
Creation of custom instrumentation such as using loggers or custom output routines
requires access to the source code and the advance knowledge of what one needs
to monitor for solving the problem. While it may sometimes be appropriate to
instrument the code during the development phase, it is generally not practical for
solving problems found during the later phases for many reasons, including (a)
source code may need to be changed to do the necessary instrumentation, (b) since
custom instrumentation is often written as in-line code, it can potentially change the
behavior of the code, and as a result, either mask existing problems or introduce
new performance problems, (c) the analysis of the output produced by such
instrumentation is generally too laborious and time consuming, and (d) correlation
of log messages across transactions often requires too much effort or may even be
Code profilers are useful in development to understand which pieces of code
consume the most CPU and for doing some statistical code optimization. However,
their lack of support for distributed and heterogeneous application environments
and the limited insights they provide into the dynamic application behavior – in
particular due to the statistical nature of profiler output (averages and percent
distributions) and the lack of context information required to reconstruct or even
understand performance problems – prevent developers from using profilers in
diagnosing and resolving transaction performance problems discovered in later life-cycle
phases. Further, since code profilers introduce large overheads, sometimes as
large as 10x to 10,000x, they cannot be run in real load-test or production
environments.
Monitors and system utilities provide overall usage and performance statistics on the
server at reasonable overhead. Looking at this class of data, a server administrator
can potentially guess at the problem and tune the application or server
configuration [5]. Even when the tuning action provides performance relief, it may not
necessarily address the root cause of the problem and may shift the bottleneck to
elsewhere in the system. In addition, while the aggregate data provided by a
monitor may be useful to a skilled administrator in alleviating consistently or
regularly recurring symptoms, it does not help with resolving problems that appear
When monitoring utilities provide specific information about transactions, the
information is generally limited and cumbersome to obtain. For example, some
monitors require the user to specify the transactions to be monitored in-depth.
Others provide intermediate information such as servlet response time, but do not
provide contextual information about what the servlets are actually executing. Even
those monitors that attempt to provide some execution context do not provide
enough information to determine the root cause. And some limit the user to monitor
only certain transactions under certain conditions thus eliminating true application-
The situation degrades when dealing with multi-server transactions because
traditional monitors measure the individual transaction’s behavior within a single
server and cannot track the end-to-end execution of the transaction across multiple
servers [6]. Hence, an engineer has to infer transaction behavior such as transaction
routing using only aggregate, statistical information (average, max, min, etc.)
available at the server level. Consequently, the engineer can know the likelihood
that a transaction is executed on a certain server but cannot possibly identify the
exact conditions, interactions and execution paths that led to the problem, limiting
the engineer’s ability to reconstruct the problem scenario. This lack of visibility into the
application behavior forces the engineer to identify root cause through trial and
error, resulting in long and cumbersome repair times.
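The gap between aggregate statistics and per-transaction data can be illustrated with a minimal sketch. The class, method names and response times below are hypothetical and are not taken from any monitoring product. The sketch only shows that a server-level average signals that something is slow, while identifying the affected transaction requires the individual measurements.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: the same measurements viewed as a server-level
// aggregate and as individual transactions. All values are made up.
final class AggregateVsIndividual {

    /** Server-level view: one number for all transactions. */
    static double average(long[] responseMs) {
        long sum = 0;
        for (long ms : responseMs) {
            sum += ms;
        }
        return (double) sum / responseMs.length;
    }

    /** Transaction-level view: indices of the transactions over the threshold. */
    static List<Integer> indicesSlowerThan(long[] responseMs, long thresholdMs) {
        List<Integer> slow = new ArrayList<>();
        for (int i = 0; i < responseMs.length; i++) {
            if (responseMs[i] > thresholdMs) {
                slow.add(i);
            }
        }
        return slow;
    }

    public static void main(String[] args) {
        long[] responseMs = {100, 100, 100, 100, 600};
        System.out.println("server average (ms): " + average(responseMs));
        System.out.println("transactions over 500 ms: "
                + indicesSlowerThan(responseMs, 500));
    }
}
```

With only the average available, an engineer cannot tell whether every transaction slowed down a little or a single transaction slowed down a lot.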
The Need For a Better Solution
Clearly traditional tools are insufficient for understanding the performance
implications of software design and the application behavior in production systems.
They are, therefore, inadequate for proactively reducing the risk of performance
problems or for rapidly resolving performance problems when they do occur.
[5] For example, increase the JDBC connection pool size or increase the heap size.
[6] Such monitors can monitor transactions of a multi-tier application only when all tiers run in the same server. They
are unable to monitor multi-tier transactions when the tiers run on more than one server, irrespective of whether the
servers run on the same physical machine or run in a distributed environment.
Performance Management in the Application Life-Cycle
Developers tend to work in single server environments and performance problems
often go unnoticed until later in the life cycle when:
QA finishes functional testing and starts verifying performance
Performance analysts start performance and longevity testing to develop
multi-server configuration guidelines
Operations deploys the application into production
Customers complain or abandon transactions after the system has gone
Due to the complexity of performance problems, it often takes a painful amount of
time before the root cause is identified. A contentious situation such as the one
depicted in Figure 4 often results before the problem is resolved.
Figure 4: A consequence when performance problems are hard to diagnose
To prevent such troublesome situations and to deliver a high performance system, IT
must pay attention to performance issues throughout the application life-cycle.
Figure 5 outlines performance related roles and responsibilities of different players
and the information flow among them.
Players include architects, system architects and performance analysts.
Performance engineering responsibilities across the life-cycle phases:
• Design for performance
• Specify performance
• Understand dynamic application behavior
• Reconstruct and fix problems
• Discover and identify performance issues
• Report quality trends for every application component
• Isolate and document performance issues
• Tune performance under real-world conditions
• Optimize configuration for scalability
• Alert on and document performance issues 24x7
• Recognize performance trends
• Triage performance issues
• Minimize downtime
Information flow:
• Monitoring and Diagnostic Tools
• Recommended settings and thresholds
• Expected performance characteristics, potential bottlenecks
  and key performance indicators
• Performance and behavior changes between versions
• Performance and scalability reports
• Dynamic components interaction characteristics
• Data for offline analysis and problem reconstruction
Instrument and Measure Transaction Behavior.
Monitor Fulfillment of Service Level Agreements.
Reveal Software’s Dynamic Behavior and Performance Implications.
Detect, Diagnose and Resolve Application Performance Problems
throughout Application Life-Cycle.
Figure 5: Performance roles, responsibilities and information flow during application life-cycle
Application Performance Management Solution Requirements
As one considers Figure 5 above and the implications of modern software
development frameworks and remoting discussed earlier, it becomes evident that:
Developers need tools to analyze the dynamic behavior of the application
through its underlying layers for understanding the performance
implications of design alternatives,
During post-development (QA, Staging or Pre-deployment and Operational):
Non-development IT personnel, particularly Operations, need a tool to
automatically detect performance anomalies such as SLA violations and
capture all necessary and relevant information for problem reconstruction.
This tool should require neither programming skills nor access to the source
code.
Non-development IT personnel need an easy to use tool for doing high-
level triage and providing full in-depth code level diagnostics information to
system architects and development.
Developers need detailed data with complete transaction context and step-
by-step execution details from production/load-test environments for off-
line diagnosis so that they can reconstruct what happened rather than
make repeated, laborious attempts to reproduce problems.
Organizations need tight cooperation and efficient, productive
communication between all stakeholders responsible for application
performance. To reinforce this point, note that IT personnel from different
groups interact tremendously throughout the application life-cycle on
performance matters (Figure 5) and recognize that non-developers typically
identify performance problems and developers resolve them.
Traditional application performance management tools do
not meet the requirements for rapidly identifying,
diagnosing and resolving the performance problems
throughout the application life-cycle.
A next generation solution is needed.
dynaTrace Diagnostics® is such a solution
dynaTrace Diagnostics® is an application performance management solution,
which fulfills the measurement and diagnosis requirements that have been
identified in this paper. Specifically designed to support the entire application
life-cycle, dynaTrace’s PurePath® technology captures essential information for
all transactions during their execution across multiple servers in heterogeneous
distributed environments at very low overhead (Figure 6). This enables IT
Understand the dynamic behavior of the software so that, where
possible, performance problems can be prevented, and
Detect and diagnose performance problems so they can be quickly
resolved whenever they occur.
In addition to high-level performance indicators, dynaTrace Diagnostics maps
out the precise execution path - the PurePath - of each individual transaction
from its entry at the first monitored server, through all other servers where it is
processed, across system, technology and component boundaries.
PurePath uses KnowledgeSensors™ to capture all performance and relevant
context information with minimum performance overhead.
Figure 6: dynaTrace Diagnostics visualizes traces of individual transactions in distributed heterogeneous
KnowledgeSensors mark a transaction’s progress along its execution path and
identify all transaction entry points (e.g., Java Servlet invocations) and method
calls, as well as their sequence and nesting. For each transaction, the
KnowledgeSensors record performance information like method call sequence,
arguments, return values, exceptions, log messages, elapsed time and resource
utilization statistics such as CPU usage, IO usage, network traffic, objects
created, SQL calls, remote calls, and synchronization delays; see Figure 7 below.
dynaTrace Diagnostics records the PurePath for all transactions at very low
overhead and sends it to the Diagnostics server for analysis. This ensures that IT:
Has a complete record of execution for all transactions - to get 100%
monitoring coverage (and not to miss issues),
Has a record of each transaction’s execution across application tiers,
servers and machines - to analyze every potential issue,
Can actually see the transaction’s execution path, and avoid guessing
by trying to follow its execution from one server to the next,
Can understand the dynamic behavior of the software,
Can determine the root cause of the problem experienced by a specific
transaction the first time – without having to pre-specify what
transactions to monitor and then wait for the problem to recur,
Can recreate problem scenarios, including problem transactions, from
recorded data and pinpoint the exact cause of performance problems
quickly, avoiding traditional, expensive trial and error approach,
Can diagnose problems in near real time or afterwards, and
Can diagnose problems off-line without loading the production systems.
Figure 7: A transaction's PurePath shows where it experiences performance problems
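To make the idea of a recorded execution path concrete, the following minimal sketch models a transaction trace as a tree of timed steps and locates the step with the largest self time. It is a hypothetical illustration only: the class names, fields, method names and timings are assumptions and do not represent the PurePath data format or any dynaTrace API. It also assumes child steps run sequentially, so that self time equals inclusive time minus the children's inclusive time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical trace node: one executed step with its server and timing.
final class TraceNode {
    final String method;
    final String server;
    final long elapsedMs; // inclusive time, including all child steps
    final List<TraceNode> children = new ArrayList<>();

    TraceNode(String method, String server, long elapsedMs) {
        this.method = method;
        this.server = server;
        this.elapsedMs = elapsedMs;
    }

    /** Adds a nested step and returns this node to allow chaining. */
    TraceNode add(TraceNode child) {
        children.add(child);
        return this;
    }

    /** Time spent in this step itself (assumes sequential child steps). */
    long selfMs() {
        long childTotal = 0;
        for (TraceNode child : children) {
            childTotal += child.elapsedMs;
        }
        return elapsedMs - childTotal;
    }

    /** Drills down the tree and returns the step with the largest self time. */
    TraceNode hottest() {
        TraceNode best = this;
        for (TraceNode child : children) {
            TraceNode candidate = child.hottest();
            if (candidate.selfMs() > best.selfMs()) {
                best = candidate;
            }
        }
        return best;
    }
}

final class TraceDemo {
    /** Builds a made-up trace that crosses two hypothetical servers. */
    static TraceNode sample() {
        TraceNode sql = new TraceNode("AccountDao.findAll", "dotnet-1", 700);
        TraceNode logic =
                new TraceNode("TransferService.transfer", "dotnet-1", 780).add(sql);
        TraceNode auth = new TraceNode("AuthFilter.check", "java-1", 40);
        return new TraceNode("TransferServlet.doPost", "java-1", 850)
                .add(auth)
                .add(logic);
    }

    public static void main(String[] args) {
        TraceNode hot = sample().hottest();
        System.out.println(hot.method + " on " + hot.server
                + " self time " + hot.selfMs() + " ms");
    }
}
```

Because every node keeps its own context (here, only the server name), the slow step can be located from the recorded data alone, without re-running the transaction.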
dynaTrace Diagnostics enables engineers to diagnose problems efficiently.
Given PurePath information, they need not spend time trying to reproduce the
problem. They can analyze the problem by performing either:
Outside-In diagnosis beginning with an incident of a user-visible
performance problem, such as a slow-responding transaction or user-visible
exception message and drilling down until the root cause is
Inside-Out diagnosis beginning with an internal measure of the
performance problem, such as an exception message or a method
running very slowly, identify associated transactions and drill down
through their PurePaths to identify the root cause.
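The inside-out direction can be sketched as a search over recorded transactions: start from an internal measure (here, a method observed to run slowly) and find the transactions in which it occurred. The sketch below is a hypothetical illustration; the types, method names and data are assumptions and not part of any dynaTrace interface.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical recorded step: a method name and its elapsed time.
final class RecordedStep {
    final String method;
    final long elapsedMs;

    RecordedStep(String method, long elapsedMs) {
        this.method = method;
        this.elapsedMs = elapsedMs;
    }
}

// Hypothetical recorded transaction: an identifier plus its executed steps.
final class RecordedTransaction {
    final String id;
    final List<RecordedStep> steps;

    RecordedTransaction(String id, RecordedStep... steps) {
        this.id = id;
        this.steps = Arrays.asList(steps);
    }
}

final class InsideOutSearch {

    /** Returns ids of transactions in which the given method exceeded the threshold. */
    static List<String> slowIn(List<RecordedTransaction> recorded,
                               String method, long thresholdMs) {
        List<String> ids = new ArrayList<>();
        for (RecordedTransaction tx : recorded) {
            for (RecordedStep step : tx.steps) {
                if (step.method.equals(method) && step.elapsedMs > thresholdMs) {
                    ids.add(tx.id);
                    break; // one slow occurrence is enough to flag the transaction
                }
            }
        }
        return ids;
    }

    /** Made-up recorded data used only for demonstration. */
    static List<RecordedTransaction> sample() {
        return Arrays.asList(
                new RecordedTransaction("tx-1",
                        new RecordedStep("Login.check", 30),
                        new RecordedStep("PriceService.quote", 50)),
                new RecordedTransaction("tx-2",
                        new RecordedStep("PriceService.quote", 900),
                        new RecordedStep("Cart.add", 20)),
                new RecordedTransaction("tx-3",
                        new RecordedStep("Cart.add", 25)));
    }

    public static void main(String[] args) {
        System.out.println(slowIn(sample(), "PriceService.quote", 200));
    }
}
```

An outside-in analysis would run the other way: begin with a single flagged transaction and walk down its steps until the slow one is found.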
When certain transactions do not meet service level or performance
requirements, dynaTrace Diagnostics’ intuitive console allows IT personnel to
drill down through the transaction’s PurePath to identify the root cause(s) of
performance problems (Figure 8) such as:
Execution steps (e.g., method calls and servlet executions) that consume
too many resources or run slowly,
Excessively called methods or servlets, even if in framework software,
Code that makes an excessive number of SQL calls or long running
Excessive wait for resources such as execution threads or connection,
Threads and locks causing synchronization delays,
Components that make excessive remote calls,
Remote calls doing excessive data transfer,
Remote method or web-services calls taking too much time,
Code where memory leaks occur, and
Code where large number of short-lived objects are created and
Figure 8: dynaTrace Diagnostics PurePath allows engineers to reconstruct the problem by viewing the exact call
sequence including performance metrics and detailed context information and unearth the root cause of the
When internal problems such as massive memory consumption or server crashes
are encountered, SLA violations are detected, or a comparative analysis of
historical data reveals potential performance issues (Figure 9), IT personnel use
dynaTrace Diagnostics to identify transactions with a high contribution to the
symptoms. By drilling down into these transactions’ PurePaths, they can better
understand the context that leads to such high contributions. By using this context
information and the ability to recreate transactions, they can quickly identify the
root cause.
Figure 9: The dynaTrace Diagnostics dashboard highlights problems and allows trend analysis and historical
comparisons, such as between different application versions.
For example, consider two situations that may not yet be visible to the
users: an exception message and a slow-running method. With dynaTrace
Diagnostics:
When an exception message is found in the logs, one can identify:
• The transaction, and its parameters, that led to the exception,
• The actual method call that generated the exception message,
• The parameters passed to this method call as well as to all of its
  predecessors, including the parameters input by the user, and
• The root cause, such as user error, insufficient error handling in
  code, other logic errors, or system conditions such as running out of disk
  space.
When one identifies a slow-running method, one can quickly determine:
• Whether the method runs slowly all the time or only from time to time,
• Transactions that execute this method,
• Transactions that execute and bog down in this method,
• The breakdown of such a transaction’s execution time in this
  method into the time taken by underlying method calls, and
• The core method that needs to be corrected to achieve higher
  performance.
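The breakdown of a slow method's time into its underlying calls can be sketched as a self-time computation over a call tree. This is an illustrative sketch, not the product's algorithm: the `Call` class and method names are hypothetical, and it assumes nested calls run sequentially, so that a method's own time is its total duration minus the durations of its direct callees.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: find the call that spends the most time in its own code. */
class SelfTimeSketch {

    /** One recorded method call with its total duration and nested calls. */
    static final class Call {
        final String method;
        final long durationMs;
        final List<Call> children;

        Call(String method, long durationMs, Call... children) {
            this.method = method;
            this.durationMs = durationMs;
            this.children = Arrays.asList(children);
        }
    }

    /** Time spent in the method itself: total duration minus time in direct callees. */
    static long selfTimeMs(Call call) {
        long inChildren = 0;
        for (Call child : call.children) {
            inChildren += child.durationMs;
        }
        return call.durationMs - inChildren;
    }

    /** Returns the call with the largest self time anywhere in the tree. */
    static Call hottest(Call root) {
        Call best = root;
        for (Call child : root.children) {
            Call candidate = hottest(child);
            if (selfTimeMs(candidate) > selfTimeMs(best)) {
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Call slowMethod = new Call("PriceService.quote", 800,
            new Call("PriceService.validate", 20),
            new Call("PriceDao.findRates", 700,
                new Call("Statement.executeQuery", 150)));

        Call core = hottest(slowMethod);
        System.out.println(core.method + " self time: " + selfTimeMs(core) + " ms");
    }
}
```

In this example the method first noticed as slow spends little time in its own code; most of the time is in a callee two levels down, which is the one worth correcting.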
Note that since dynaTrace Diagnostics maps the performance of individual
transactions, rather than just the aggregate performance of all transactions or a
class of transactions, it allows IT personnel to:
• Determine a transaction’s business value by looking at its parameters,
  allowing them to prioritize different incidents and focus their energies on the
  most valuable issues.
• Address performance issues in their infancy, when they show up for only a
  few transactions, before they affect a large number of users and have
  a negative impact on business.
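A short numeric sketch shows why per-transaction data matters. The figures and the 500 ms service level are invented for illustration: 99 transactions at 100 ms and one at 5,000 ms average 149 ms, comfortably within the service level, yet one user waited five seconds.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: an acceptable average can hide an unacceptable transaction. */
class OutlierSketch {

    /** Mean response time in milliseconds (NaN for an empty array). */
    static double averageMs(long[] responseTimesMs) {
        long total = 0;
        for (long t : responseTimesMs) {
            total += t;
        }
        return (double) total / responseTimesMs.length;
    }

    /** Indices of the transactions whose response time exceeds the service level. */
    static List<Integer> slaViolations(long[] responseTimesMs, long slaMs) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < responseTimesMs.length; i++) {
            if (responseTimesMs[i] > slaMs) {
                indices.add(i);
            }
        }
        return indices;
    }

    public static void main(String[] args) {
        long[] times = new long[100];
        for (int i = 0; i < 99; i++) {
            times[i] = 100;
        }
        times[99] = 5000; // the single outlier

        System.out.println("average ms: " + averageMs(times));
        System.out.println("violations of a 500 ms level: " + slaViolations(times, 500));
    }
}
```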
Out-of-the-box, Extensible Diagnostics
dynaTrace Diagnostics comes with an array of ready-to-use KnowledgeSensors
for a variety of commercial and open source:
• run-time virtual machine environments,
• database access layers,
• application platforms and servers,
• remoting libraries,
• web services stacks, and
• messaging libraries and frameworks,
covering the entire software stack.
[Layer diagram: pre-built KnowledgeSensors shipped with dynaTrace, shown for Java and .NET from the
custom application (a Java application, or a .NET application in C# or other languages) down to the
runtime environment (Sun JVM, IBM JVM; Microsoft CLR). Layers labeled in the diagram: Custom Application,
Messaging, WebService Stacks, Application Server, Application Platform, Runtime Environment.
Java-side technologies named: Spring, Toplink, Struts; IBM WebSphere MQ, IBM CICS Transaction Server,
BEA T3 (RMI, JMS); IBM WebSphere, BEA WebLogic, AXIS, Web Methods, Glue; RMI (IIOP, JRMP, HTTP(s), T3),
Visibroker, IIOP/ORBS; Sun Java AS, JBoss, Apache Tomcat, Oracle AS, SAP Netweaver; J2EE, JSEE;
SQL, JDBC, Hibernate. .NET-side technologies named: ADO.NET, ASP.NET, .NET/WCF, Microsoft Windows AS,
.NET, SQL.]
Figure 10: KnowledgeSensors capture transaction execution through all software layers
(For an up-to-date list, please visit www.dynaTrace.com)
These pre-built KnowledgeSensors encapsulate deep knowledge of the technologies
they cover, enabling IT personnel to manage performance in their environments
right out of the box, without any effort spent on customization.
For more detailed insight into custom applications, such as a policy quotation
system for an insurance company, developers can easily define and package
KnowledgeSensors for their own applications using dynaTrace Diagnostics’ point
and click interface, then ‘hot’ deploy them to the target environment.
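The source does not describe how a KnowledgeSensor is implemented, so the following is only a conceptual stand-in for what a sensor on a custom application component might capture: the method called, its arguments, and the elapsed time. It uses the JDK's `java.lang.reflect.Proxy` rather than any dynaTrace mechanism, and the `QuoteService` interface, the `instrument` helper, and the recorded line format are all hypothetical.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: wrap a component so each call is timed and recorded. */
class TimingSensorSketch {

    /** Hypothetical interface standing in for a custom application component. */
    interface QuoteService {
        double quote(String policyType, int age);
    }

    /** One line per call: method name, arguments, elapsed time (not thread-safe). */
    static final List<String> recorded = new ArrayList<>();

    /** Returns a proxy that forwards every call to target and records its timing. */
    @SuppressWarnings("unchecked")
    static <T> T instrument(Class<T> iface, T target) {
        InvocationHandler handler = (proxy, method, args) -> {
            long start = System.nanoTime();
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow the component's own exception
            } finally {
                long micros = (System.nanoTime() - start) / 1_000;
                recorded.add(method.getName() + Arrays.toString(args)
                    + " took " + micros + " us");
            }
        };
        return (T) Proxy.newProxyInstance(
            iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }

    public static void main(String[] args) {
        QuoteService real = (policyType, age) ->
            "life".equals(policyType) ? age * 12.5 : age * 8.0;
        QuoteService sensed = instrument(QuoteService.class, real);

        System.out.println(sensed.quote("life", 40));
        System.out.println(recorded);
    }
}
```

Unlike the hot deployment described above, this sketch requires the application to be written against an interface and wired through the proxy in code.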
For packaged applications such as SAP ERP, application developers can easily
define and package KnowledgeSensors for those applications. These packages
can be shipped either with the application or separately. Alternatively, IT personnel
at the licensed organizations, or third parties, can define KnowledgeSensors for
that application without needing access to the application’s source code,
package them, and then deploy them on their own.
Compare For Yourself
Earlier, we asserted that dynaTrace Diagnostics is the only solution available on
the market that meets the requirements for efficient performance problem
detection and analysis and that can be used throughout the application life-cycle by
all members of the IT team. We invite you to scan Table 2 and compare other
products that you may be familiar with to dynaTrace Diagnostics.
We are confident that you will agree that dynaTrace
Diagnostics fits the bill perfectly while other solutions
fall significantly short.
Key Diagnostics Capabilities
Diagnosis Depth Requirements
• Capture necessary data for each individual transaction, and not just average transaction measurements, in load testing
  and 24x7 production environments, enabling diagnosis of the business-critical outlier transactions.
• Capture all performance and contextual data that is required for reconstructing a performance problem, thus
  eliminating the need to reproduce it, and quickly identifying the code where the performance problem occurs. Such
  data should include method response times, remoting performance and payload metrics, synchronization metrics,
  method and Web request arguments, log messages and exceptions.
• Reveal the relationships among events such as exceptions, log messages, input metrics, SQL executions and
  performance threshold violations by associating them with transactions to identify the root cause.
• Analyze transaction metrics in the context of server resource metrics to determine whether the performance problem is
  caused by configuration issues or programming issues.
• Diagnose memory leaks, even in production environments.
• Precisely trace execution of each transaction across multiple servers (logical or physical) and clients to understand its
  impact on each server and application component as well as to understand implications of remoting to design high
  performance distributed applications using SOA, Web-Services, etc.
Application Life-Cycle Requirements
• Provide real-time data to Operations, down to the code level, for each and every individual transaction for high-level
  problem triage and to performance analysts and system architects for live root-cause analysis.
• Provide offline code-level diagnosis capabilities that enable developers and architects to interactively diagnose all
  individual transactions for reconstructing, isolating and resolving the performance problem, eliminating the need to
  reproduce the problem.
• Capture necessary performance data in QA and production environments and transfer the information to engineering
  for analysis, potentially on another system, eliminating the need for having developers on site to debug performance
  problems or for having to spend a significant amount of time on reproducing the problem.
• Provide automated performance comparison reports, down to individual transactions and code level, among
  subsequent diagnosis sessions for evaluating the success of performance tuning activities, comparing different
  application versions and configurations and understanding the root cause of the differences.
• Enable engineers, architects and performance analysts to define measurement granularity, so that they get from QA
  and operations exactly what they need.
• Store and maintain the performance data for long term historical and trend analysis.
• Integrate with IDEs, automated build and test systems, load testing tools, issue-tracking systems and enterprise
  management systems to enhance the productivity of IT personnel throughout the application life-cycle.
Deployment and Operational Requirements
• Configuration-free agents for automated, centralized deployment.
• Centralized management of agents with automated and real-time remote configuration updates to quickly and easily
  adapt the depth and granularity of captured diagnostics data on the fly, without having to restart the application.
• Auto-discover application components for out-of-the-box diagnosis results and intuitive customization.
• Continuous measurement and diagnosis in load testing and 24x7 production environments through lightweight agent
  technology at negligible CPU overhead and flat memory usage of a few megabytes.
• Monitor service levels at individual transaction level and alert on violations. Automatically capture history of all
  transactions including deep diagnostics data for off-line root-cause analysis to eliminate the need for problem
  reproduction.
• Map transactions to requests, users and application functionality to prioritize problem resolution based on business
  value.
User Interface and Usability Requirements
• Simple, intuitive yet comprehensive and responsive user interface that does not require detailed programming
  knowledge but still provides information that programming experts can use.
• Uses nomenclature and presents statistics that are relevant to and usable by all members of the IT team, whether they
  are developers, testers, system architects or server administrators.
• Serves as the common solution to be used by developers and non-developers for capturing, storing and analyzing
  performance data throughout the application life-cycle, reducing time to repair.
Table 2: Comparison tool for evaluating application performance management solutions
Conclusion
Consistently delivering high application performance in today’s complex multi-server heterogeneous
distributed environments is a daunting task. The ability to diagnose and resolve performance problems
rapidly is critical to achieving this goal. Therefore, architects and managers should think beyond traditional
performance management paradigms and establish effective systems and processes throughout the entire
application life-cycle. Three abilities are key to fixing application performance problems quickly and easily:
• Capturing, at production-safe overhead, the detailed execution information for each transaction
  during its execution,
• Reliably reconstructing problem scenarios from the captured information, and
• Quickly analyzing this information to determine the true root cause.
With its innovative PurePath instrumentation technology, low-overhead dynamic monitoring, intuitive user
interface featuring end-to-end visualization and analysis capabilities, and integration with IDEs and
enterprise management frameworks, dynaTrace Diagnostics truly represents the next generation of solutions
explicitly designed for use by all IT personnel throughout the application life-cycle.
dynaTrace Diagnostics goes far beyond monitoring and enables IT to take
productive and efficient action to fix performance problems.
dynaTrace Diagnostics enables IT to:
• Study applications’ dynamic behavior during development to eliminate redundant calls, inefficient
  objects and algorithms, and to tune caches and configurations,
• Fix the root cause of performance problems rather than mitigate or hide them by system tuning,
• Unearth poorly performing transactions even when overall averages are within an acceptable range and
  allow engineers to take corrective action before problems explode on a large scale,
• Focus their energies on addressing troublesome transactions or hotspots, rather than trying
  unnecessarily to reduce overall average response times at considerable expense,
• Identify and focus on business-critical applications or transactions rather than working harder to
  improve the performance of all transactions using the same server,
• Perform their investigation offline without having to pre-define what data needs to be saved and
  without having to spend a lot of time reproducing the problem,
• Give the recorded data to the engineers for analysis, allowing everyone to focus on their primary
  duties, and thus,
• Bridge the communication gap between system administrators, testers, performance analysts and
Consequently, dynaTrace Diagnostics proactively averts performance problems and reduces time-to-repair.
Its life-cycle-centric design enables IT personnel to work together efficiently and effectively to deliver high
performance consistently in complex heterogeneous multi-server clustered systems.
We invite you to learn more at www.dynaTrace.com.
Headquarter EMEA: dynaTrace software GmbH
Freistädter Str. 313, 4040 Linz, Austria/Europe, T +43 (732) 908208, F +43 (732) 210100.008
Headquarter North America: dynaTrace software Inc,
West Street 200, Waltham, MA 02451, USA, T +1 (339) 9330317 F +1 (781) 2075365
All rights reserved
dynaTrace software is a registered trademark of dynaTrace software GmbH. All other marks and
names mentioned herein may be trademarks of other respective companies. (070522)