Automatic Detection of
Performance Design and
Deployment Antipatterns in
Component Based Enterprise
Systems
by
Trevor Parsons
LIST OF FIGURES
1.1 Typical Enterprise Architecture . . . . . . . . . . . . . . . . . . . . . . 3
2.1 Typical JEE Architec...
5.4 Example Run-Time Path (a), Extract of the RunTimeContainerService Data Structure (b), and Information Extracted to P...
List of Acronyms
ADL: Architecture Description Language
AOP: Aspect Oriented Programming
API: Application Programming Inte...
Dedicated to my parents, Tommy and Kay.
ACKNOWLEDGEMENTS
Firstly, I would like to thank my supervisor, John Murphy, for giving me the opportunity to pursue this r...
LIST OF PUBLICATIONS
Trevor Parsons, John Murphy. Detecting Performance Antipatterns in Component Based
Enterprise Systems...
Trevor Parsons, John Murphy. A Framework for Automatically Detecting and Assessing
Performance Antipatterns in Component B...
CHAPTER
ONE
Introduction
Main Points
• Performance is a major issue during the development of enterprise applications.
• S...
1.1 Background and Motivation
In the past, software developers had to be extremely careful when developing their applicat...
Figure 1.1: Typical Enterprise Architecture
volume of data produced when profiling even a single user system can be extreme...
1.2 Thesis Overview
In light of the limitations of current performance tools and of the benefits of software
antipatterns, ...
1.3 Thesis Contributions and Statement
The first major contribution of this thesis is an approach for the automatic detecti...
and performance characteristics associated with servicing a request. Such information
is utilised to detect antipatterns b...
We also assume that a realistic testing scenario is available which reflects how the system will be used in production. W...
CHAPTER
TWO
Background
In this chapter we introduce related research areas and technologies. We begin by
discussing the re...
Main Points
• Current performance analysis techniques, e.g., modelling, are inaccurate and
time consuming when applied to ...
2.1 Performance of Software Systems
The performance of a software system has been described as an indicator of how well
th...
data from a running system (i.e. profiling) and presenting this data to the user. These
tools tend to focus on low level pr...
aim [157]. They support components conforming to certain standards (or component
models) and allow instances of these comp...
tion Server (MTS), EJB containers, and CCM containers. We introduce the EJB technology in the following sections. The r...
technology (JSPs). Servlets are Java programming language classes that dynamically
process requests and construct respon...
same specification, application developers can replace the driver that they are using
with another one without having to re...
• Application Assembler: The Application Assembler combines enterprise beans
into larger deployable application units.
• D...
descriptor details the different container services that are required by the EJB. For
example, a deployment descriptor can...
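Descriptors of this kind are XML files packaged alongside the bean classes. The fragment below is an illustrative sketch of an EJB 2.x style ejb-jar.xml entry; the bean and class names are invented placeholders rather than an example taken from this thesis.

```xml
<ejb-jar>
  <enterprise-beans>
    <session>
      <!-- Illustrative bean; all names are placeholders -->
      <ejb-name>AccountBean</ejb-name>
      <home>com.example.AccountHome</home>
      <remote>com.example.Account</remote>
      <ejb-class>com.example.AccountEJB</ejb-class>
      <session-type>Stateless</session-type>
      <!-- Request the container's transaction service declaratively -->
      <transaction-type>Container</transaction-type>
    </session>
  </enterprise-beans>
  <assembly-descriptor>
    <container-transaction>
      <method>
        <ejb-name>AccountBean</ejb-name>
        <method-name>*</method-name>
      </method>
      <!-- Every business method must run inside a transaction -->
      <trans-attribute>Required</trans-attribute>
    </container-transaction>
  </assembly-descriptor>
</ejb-jar>
```

Settings such as the transaction attribute above are exactly the kind of container service requirement the descriptor expresses without any change to the bean's code.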
mechanism (database). A Bean-managed persistent entity bean is an entity bean that
must be persisted by hand. The componen...
and by the container when a bean is invoked.
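The container's interposition between client and bean can be mimicked, outside any application server, with a JDK dynamic proxy: the proxy plays the role of the container-generated EJBObject and intercepts each invocation before delegating to the bean instance. All names here (Account, AccountBean) are invented for illustration; a real container performs security checks, transaction demarcation, and instance management at the interception point.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class ContainerInterpositionSketch {
    // A business interface, standing in for a bean's remote interface.
    interface Account {
        int balance();
    }

    // The bean implementation the container would manage.
    static class AccountBean implements Account {
        public int balance() { return 42; }
    }

    // Creates a proxy that, like a container-generated EJBObject,
    // runs container logic around every bean invocation.
    static Account wrap(Account bean) {
        InvocationHandler handler = (proxy, method, args) -> {
            // A real container would check security, start a transaction,
            // and fetch an instance from the pool here.
            System.out.println("container: entering " + method.getName());
            Object result = method.invoke(bean, args);
            System.out.println("container: leaving " + method.getName());
            return result;
        };
        return (Account) Proxy.newProxyInstance(
                Account.class.getClassLoader(),
                new Class<?>[] { Account.class },
                handler);
    }

    public static void main(String[] args) {
        // The client only ever sees the proxy, never the bean itself.
        Account account = wrap(new AccountBean());
        System.out.println(account.balance());
    }
}
```

This same interception point is what monitoring frameworks exploit to trace component invocations without modifying application code.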
To create an instance of an EJB, a client must first obtain an instance of an EJ...
Figure 2.3: Example EJB Deployment Descriptor
2.4.4 Deployment Settings
As shown in figure 2.2 the container generated EJBO...
Figure 2.4: Stateless Session Bean Lifecycle
The management of an EJB’s lifecycle is a complex process and differs from bean type
to bean type. Factors which influence ...
A software architecture is important for a number of reasons. Firstly, it becomes a
vehicle for communication among the sys...
2.6 Software Patterns
The current use of the term pattern in software engineering is derived from work by
Christopher Alex...
ments since they can provide a skeleton of functional behaviour while at the
same time they can explicitly address non-fun...
2.7 Software Antipatterns
Antipatterns, first suggested by Koenig [101], have been defined by Brown et al. [36]
as: "a liter...
Figure 2.5: Patterns, Antipatterns and their Relationship
They may not necessarily be a problem but often their presence i...
usually based on statistical methods [124]. There are advantages and disadvantages
associated with both approaches. For ex...
2.8.2.1 Recording Information
Regardless of the profiling approach however, information must be recorded by the
profiling to...
Figure 2.6: JVMPI Architecture
obtain the current stack trace at this point. The main advantage of standard interfaces
is ...
they are not portable across different platforms. The java.lang.instrument interface
is another standard interface (as ...
ABSTRACT
Enterprise applications are becoming increasingly complex. In recent times they have moved away from monolithic architectures to more distributed systems made up of a collection of heterogeneous servers. Such servers generally host numerous software components that interact to service client requests. Component based enterprise frameworks (e.g. JEE or CCM) have been extensively adopted for building such applications. Enterprise technologies provide a range of reusable services that can assist developers in building these systems. Consequently developers no longer need to spend time developing the underlying infrastructure of such applications, and can instead concentrate their efforts on functional requirements.

Poor performance design choices, however, are common in enterprise applications and have been well documented in the form of software antipatterns. Design mistakes generally result from the fact that these multi-tier, distributed systems are extremely complex and often developers do not have a complete understanding of the entire application. As a result developers can be oblivious to the performance implications of their design decisions. Current performance testing tools fail to address this lack of system understanding. Most merely profile the running system and present large volumes of data to the tool user. Consequently developers can find it extremely difficult to identify design issues in their applications. Fixing serious design level performance problems late in development is expensive and cannot be achieved through "code optimizations". In fact, often performance requirements can only be met by modifying the design of the application, which can lead to major project delays and increased costs.

This thesis presents an approach for the automatic detection of performance design and deployment antipatterns in enterprise applications built using component based frameworks. Our main aim is to take the onus away from developers having to sift through large volumes of data in search of performance bottlenecks in their applications. Instead we automate this process. Our approach works by automatically reconstructing the run-time design of the system using advanced monitoring and analysis techniques. Well known (predefined) performance design and deployment antipatterns that exist in the reconstructed design are automatically detected. Results of applying our technique to two enterprise applications are presented.

The main contributions of this thesis are (a) an approach for the automatic detection of performance design and deployment antipatterns in component based enterprise frameworks, (b) a non-intrusive, portable, end-to-end run-time path tracing approach for JEE, and (c) the advanced analysis of run-time paths using frequent sequence mining to automatically identify interesting communication patterns between components.
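The frequent sequence mining of contribution (c) rests on support counting: a candidate component-call sequence is interesting only if it occurs, in order, in sufficiently many run-time paths. The toy sketch below illustrates that counting step with invented names; the thesis builds path-specific support counting and closed-sequence pruning on top of this basic idea.

```java
import java.util.List;

// Toy support counting for frequent sequence mining over run-time
// paths, where each path is a list of invoked component names.
public class SequenceSupport {
    // True if 'candidate' occurs in 'path' as an ordered (not
    // necessarily contiguous) subsequence of component invocations.
    static boolean containsSubsequence(List<String> path, List<String> candidate) {
        int matched = 0;
        for (String component : path) {
            if (matched < candidate.size()
                    && component.equals(candidate.get(matched))) {
                matched++;
            }
        }
        return matched == candidate.size();
    }

    // Support = number of paths containing the candidate sequence.
    static int support(List<List<String>> paths, List<String> candidate) {
        int count = 0;
        for (List<String> path : paths) {
            if (containsSubsequence(path, candidate)) {
                count++;
            }
        }
        return count;
    }
}
```

A candidate whose support crosses a chosen threshold, i.e. one that recurs across many client requests, would then be reported as a candidate communication pattern worth inspecting.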

Automatic Detection of Performance Design and Deployment Antipatterns in Component Based Enterprise Systems

by Trevor Parsons

The thesis is submitted to University College Dublin for the degree of PhD in the College of Engineering, Mathematical and Physical Sciences.

November 2007

School of Computer Science and Informatics
Dr. J. Carthy (Head of Department)
Under the supervision of Dr. J. Murphy
CONTENTS

Abstract
Acknowledgements
List of Publications

1 Introduction
  1.1 Background and Motivation
  1.2 Thesis Overview
  1.3 Thesis Contributions and Statement
  1.4 Key Assumptions and Scope

2 Background
  2.1 Performance of Software Systems
  2.2 Component Based Software
    2.2.1 Software Components
    2.2.2 Component Frameworks
  2.3 The Java Enterprise Edition
    2.3.1 Web Tier
    2.3.2 Business Tier
    2.3.3 Enterprise Information System Tier
  2.4 The Enterprise JavaBean Technology
    2.4.1 The EJB Roles
    2.4.2 The EJB Component Model
    2.4.3 EJB Runtime
    2.4.4 Deployment Settings
  2.5 Software Architecture
  2.6 Software Patterns
  2.7 Software Antipatterns
  2.8 Performance Tools
    2.8.1 Workload Generation
    2.8.2 Profiling Tools
  2.9 Reverse Engineering
  2.10 Design Pattern and Antipattern Detection
  2.11 Knowledge Discovery in Databases and Data Mining
    2.11.1 Frequent Pattern Mining and Clustering
3 Overview of Approach
  3.1 Approach for the Automatic Detection of Performance Antipatterns
    3.1.1 Research Methodology and Validation Criteria
    3.1.2 Monitoring
    3.1.3 Analysis
    3.1.4 Detection
  3.2 Antipatterns Revisited
    3.2.1 Antipattern Hierarchy

4 Monitoring Required for Antipattern Detection
  4.1 Chapter Overview
  4.2 Run-Time Path Tracing
    4.2.1 Run-Time Paths Overview
    4.2.2 Run-Time Path Tracing Motivation
    4.2.3 Run-Time Path Tracing Considerations
    4.2.4 COMPAS Monitoring Framework
    4.2.5 COMPAS Extensions
  4.3 Monitoring Server Resource Usage and Extracting Component Meta-Data
    4.3.1 Using Java Management Extensions to Monitor Server Resource Usage
    4.3.2 Automatically Extracting Component Meta-Data
  4.4 Related Work
    4.4.1 Applications of Run-Time Paths
    4.4.2 Alternative Representations for Component Interactions
    4.4.3 Run-Time Interaction Tracing Approaches

5 Reconstructing the Systems Design for Antipattern Detection
  5.1 Chapter Overview
  5.2 Automatically Extracting Component Relationships and Object Usage Patterns
  5.3 Reconstructing Run-time Container Services
  5.4 Identifying Component Communication Patterns in Run-Time Paths using Frequent Sequence Mining
    5.4.1 Frequent Itemset Mining and Frequent Sequence Mining
    5.4.2 Support Counting for Run-Time Paths
    5.4.3 Further Criteria for Interestingness
    5.4.4 Preprocessing for FSM Performance Improvement
    5.4.5 Closed Sequences
    5.4.6 PostProcessing
    5.4.7 Component Communication Information for the Extracted Design Model
  5.5 Data Reduction
    5.5.1 Clustering Run-time Paths
    5.5.2 Statistical Analysis
  5.6 The Reconstructed Design Model
  5.7 Related Work
    5.7.1 Reverse Engineering
    5.7.2 Data Mining

6 Detecting Performance Design and Deployment Antipatterns
  6.1 Antipatterns Categories
  6.2 A Rule Engine Approach for Antipattern Detection
  6.3 Example Rules
    6.3.1 Antipattern Library
    6.3.2 Filtering Using Threshold Values
  6.4 PAD Tool User Modes
  6.5 Related Work
    6.5.1 Antipattern Categorisation
    6.5.2 Performance Testing
    6.5.3 Detection Techniques
    6.5.4 Antipattern Detection

7 Results and Evaluation
  7.1 Chapter Overview
  7.2 COMPAS JEEM Results
    7.2.1 Deducing System Structure
    7.2.2 Portability Assessment
    7.2.3 Performance Overhead
  7.3 Analysis Module Results
    7.3.1 FSM Performance Tests
    7.3.2 Applying FSM to Identify Design Flaws
    7.3.3 Data Reduction Results
  7.4 PAD Tool Results
    7.4.1 Antipatterns Detected in the Duke's Bank Application
    7.4.2 Antipatterns Detected in the IBM Workplace Application - Beta Version
  7.5 Validation
    7.5.1 Overview of Contributions and Evaluation Criteria
    7.5.2 Validation of Contributions

8 Conclusions
  8.1 Thesis Conclusions
  8.2 Limitations
  8.3 Future Work

References

A Antipattern Rule Library
  A.1 Rules
    A.1.1 Category1: Antipatterns Across or Within Run-Time Paths
    A.1.2 Category2: Inter-Component Relationship Antipatterns
    A.1.3 Category3: Antipatterns Related to Component Communication Patterns
    A.1.4 Category4: Data Tracking Antipatterns
    A.1.5 Category5: Pooling Antipatterns
    A.1.6 Category6: Intra-Component Antipatterns
    A.1.7 Adding Rules to The Rule Library
  A.2 Jess User Defined Functions provided by the PAD Tool
  A.3 Configuration Settings

B JEEM and FSM Implementation Source Code
LIST OF FIGURES

1.1 Typical Enterprise Architecture . . . 3
2.1 Typical JEE Architecture . . . 13
2.2 Client Invoking an EJB . . . 19
2.3 Example EJB Deployment Descriptor . . . 20
2.4 Stateless Session Bean Lifecycle . . . 21
2.5 Patterns, Antipatterns and their Relationship . . . 27
2.6 JVMPI Architecture . . . 30
2.7 The KDD Process . . . 40
3.1 PAD Tool Overview . . . 44
3.2 Run-time Design Meta Model . . . 46
3.3 Hierarchy of Antipatterns . . . 52
4.1 Dynamic Call Trace (a) and Corresponding Dynamic Call Tree (b) . . . 58
4.2 Example Run-Time Path . . . 59
4.3 Run-Time Path Data Structure . . . 60
4.4 A Run-Time Path's PathNode Data Structure . . . 60
4.5 COMPAS Probe Insertion Process . . . 65
4.6 COMPAS JEEM Architecture . . . 66
4.7 Intercepting Filter . . . 68
4.8 Remote Method Invocation . . . 74
4.9 The Sample Bean's Home Interface . . . 75
4.10 A Wrapper for the Sample Bean's Home Interface . . . 75
4.11 A Sample Bean Client . . . 76
4.12 Run-Time Path with Tracked Object, as a Sequence Diagram . . . 78
4.13 JEEManagedObject . . . 80
4.14 JDBCStats, JDBCConnectionStats, JDBCConnectionPoolStats . . . 81
5.1 Run-time Design Meta Model from Chapter 3 . . . 91
5.2 Example Run-Time Path (a), Example Deployment Descriptors (b), Extract of Component Data Structure (c) and Data Extracted to Populate Data Structure (d) . . . 93
5.3 Example Run-Time Path (a), Extract of the TrackedObject Data Structure (b) and Information Extracted to Populate the TrackedObject Data Structure (c) . . . 94
5.4 Example Run-Time Path (a), Extract of the RunTimeContainerService Data Structure (b), and Information Extracted to Populate the RunTimeContainerService Data Structure (c) . . . 95
5.5 Class Diagram Showing Components Relationships . . . 98
5.6 Example Transaction with Different Support Counting Approaches . . . 100
5.7 Hidden Elements in Transaction and Corresponding Support Counts . . . 101
6.1 Rule to Detect Simultaneous Interfaces Antipattern . . . 115
6.2 Rule to Detect Needless Session Antipattern . . . 116
6.3 Rule to Detect Bulky or Unusual Levels of Database Communication . . . 117
7.1 AccountList Run-Time Path and UML Sequence Diagram . . . 128
7.2 Diagram Showing Components in Duke's Bank . . . 129
7.3 Diagram Showing Components in PlantsByWebsphere . . . 131
7.4 Portability Test Results . . . 132
7.5 Performance Overhead Test Results . . . 133
7.6 Test Results on K2 10 2 Database . . . 136
7.7 Test Results on K2 100 2 Database . . . 138
7.8 Test Results on Sun Database . . . 139
7.9 Test Results on IBM Database . . . 140
7.10 Class Diagram of a Modified Version of Duke's Bank with Communication Patterns Highlighted . . . 143
A.1 The Transactions-A-Plenty Rule . . . 172
A.2 The Conversational-Baggage Rule . . . 173
A.3 The Sessions-A-Plenty Rule . . . 173
A.4 The Needless-Session Rule . . . 174
A.5 The Remote-Calls-Locally Rule . . . 175
A.6 The Accessing-Entities-Directly Rule . . . 175
A.7 The Bloated-Session Rule . . . 175
A.8 The Unusual-or-Bulky-Session-Entity-Communication Rule . . . 176
A.9 The Fine-Grained-Remote-Calls Rule . . . 176
A.10 The Unused-Data-Object Rule . . . 177
A.11 The Incorrect-Pool-Size Rule . . . 177
A.12 The Local-and-Remote-Interfaces-Simultaneously Rule . . . 178
List of Acronyms

ADL: Architecture Description Language
AOP: Aspect Oriented Programming
API: Application Programming Interface
AST: Abstract Syntax Tree
BCI: Byte Code Instrumentation
CCM: CORBA Component Model
CPI: COMPAS Probe Insertion
DTO: Data Transfer Object
EIS: Enterprise Information Systems
EJB: Enterprise Java Beans
ERP: Enterprise Resource Planning
FCA: Formal Concept Analysis
FIM: Frequent Itemset Mining
FSM: Frequent Sequence Mining
HTML: HyperText Markup Language
HTTP: HyperText Transfer Protocol
J2EE: Java 2 Enterprise Edition
J2SE: Java 2 Standard Edition
JDBC: Java Database Connectivity
JEE: Java Enterprise Edition
JMS: Java Message Service
JMX: Java Management Extensions
JNDI: Java Naming and Directory Interface
JSP: Java Server Pages
JSR: Java Specification Request
JVM: Java Virtual Machine
JVMPI: Java Virtual Machine Profiler Interface
JVMTI: Java Virtual Machine Tools Interface
KDD: Knowledge Discovery in Databases
LQN: Layered Queuing Networks
MTS: Microsoft Transaction Server
OCL: Object Constraint Language
OS: Operating System
PAD: Performance Antipattern Detection
PMI: Performance Monitoring Infrastructure
POJO: Plain Old Java Object
QN: Queuing Networks
RDBMS: Relational Database Management Systems
RMI: Remote Method Invocation
RML: Relational Manipulation Language
SOA: Service Oriented Architecture
SPA: Stochastic Process Algebras
SPN: Stochastic Petri Nets
SQL: Structured Query Language
UML: Unified Modelling Language
XML: Extensible Markup Language
ABSTRACT

Enterprise applications are becoming increasingly complex. In recent times they have moved away from monolithic architectures to more distributed systems made up of a collection of heterogeneous servers. Such servers generally host numerous software components that interact to service client requests. Component based enterprise frameworks (e.g. JEE or CCM) have been extensively adopted for building such applications. Enterprise technologies provide a range of reusable services that can assist developers in building these systems. Consequently, developers no longer need to spend time developing the underlying infrastructure of such applications, and can instead concentrate their efforts on functional requirements.

Poor performance design choices, however, are common in enterprise applications and have been well documented in the form of software antipatterns. Design mistakes generally result from the fact that these multi-tier, distributed systems are extremely complex, and often developers do not have a complete understanding of the entire application. As a result, developers can be oblivious to the performance implications of their design decisions. Current performance testing tools fail to address this lack of system understanding. Most merely profile the running system and present large volumes of data to the tool user. Consequently, developers can find it extremely difficult to identify design issues in their applications. Fixing serious design level performance problems late in development is expensive and cannot be achieved through "code optimizations". In fact, performance requirements can often only be met by modifying the design of the application, which can lead to major project delays and increased costs.

This thesis presents an approach for the automatic detection of performance design and deployment antipatterns in enterprise applications built using component based frameworks.
Our main aim is to take the onus away from developers having to sift through large volumes of data in search of performance bottlenecks in their applications. Instead we automate this process. Our approach works by automatically reconstructing the run-time design of the system using advanced monitoring and analysis techniques. Well known (predefined) performance design and deployment antipatterns that exist in the reconstructed design are then automatically detected. Results of applying our technique to two enterprise applications are presented.

The main contributions of this thesis are (a) an approach for the automatic detection of performance design and deployment antipatterns in component based enterprise frameworks, (b) a non-intrusive, portable, end-to-end run-time path tracing approach for JEE and (c) the advanced analysis of run-time paths using frequent sequence mining to automatically identify interesting communication patterns between components.
Dedicated to my parents, Tommy and Kay.
ACKNOWLEDGEMENTS

Firstly I would like to thank my supervisor, John Murphy, for giving me the opportunity to pursue this research, and also for all his help, encouragement and good humour along the way. I would also like to thank Liam Murphy, who was always available for dialog and who has effectively acted as a second supervisor over the years. I would like to thank Andrew Lee for initially suggesting the notion of "detecting antipatterns", back when I was searching for research ideas. Also thanks to Peter Hughes for his input and feedback during the early days of my work.

Next, I would like to thank my colleagues in Dublin City University, where this journey first began. In particular, I would like to thank Ada Diaconescu, Mircea Trofin and Adrian Mos, from whom I learned so much during my initial two years as a researcher, for being fun colleagues, for always being available to bounce ideas off (even now that you have all unfortunately left Dublin) and for teaching me the important basics of the Romanian language. Furthermore I would like to thank Adrian for assisting me in extending some of his research ideas and for inviting me to INRIA for interesting discussions on my work. I would also like to thank Colm Devine, Adrian Fernandes and Cathal Furey (three of the four horsemen) for their engaging lunch time discussions (in the early days) on the human anatomy and other such topics.

Thanks also to my DCU/UCD colleagues, Dave "the brickie" McGuinness, Jenny McManis, Gabriel Muntean, Christina Muntean, Christina Thorpe, Alex Ufimtsev, Octavian Ciuhandu, Lucian Patcas, Olga Ormand, Jimmy Noonan, Hamid Nafaa, Petr Hnetynka, Sean Murphy, John Fitzpatrick (for allowing me to introduce him to Las Vegas), John Bergin, Omar Ashagi (for teaching me basic Arabic) and Philip McGovern, for all being fun colleagues, and to all those who I have had the pleasure of working with over the years.
Also thanks again to Sean Murphy for taking my questions over the years and especially for his help with some of the mathematical aspects of my research. Furthermore, thanks to all those in IBM who helped me during my work and granted me access to their environments. Thanks especially to Pat O'Sullivan and Simon Pizzoli for their help, interest and invaluable feedback on my research.

A special thanks to Claire Breslin for her endless support and patience, and for reminding me about the more important things in life.

Finally I would like to thank my parents, Tommy and Kay, and brother, Tom, for their constant encouragement. I would especially like to thank my parents, to whom this work is dedicated. Without their unwavering love and support this work would not have been possible.
LIST OF PUBLICATIONS

Trevor Parsons, John Murphy. Detecting Performance Antipatterns in Component Based Enterprise Systems. Accepted for publication in the Journal of Object Technology.

Trevor Parsons, John Murphy, Patrick O'Sullivan. Applying Frequent Sequence Mining to Identify Design Flaws in Enterprise Software Systems. In Proceedings of the 5th International Conference on Machine Learning and Data Mining, Leipzig, Germany, July 18-20, 2007.

Trevor Parsons, John Murphy, Simon Pizzoli, Patrick O'Sullivan, Adrian Mos. Reverse Engineering Distributed Enterprise Applications to Identify Common Design Flaws. Presented at the Software Engineering Tools For Tomorrow (SWEFT) 2006 Conference, T.J. Watson, New York, Oct 17-19, 2006.

Liang Chen, Patrick O'Sullivan, Laurence P. Bergman, Vittorio Castelli, Eric Labadie, Peter Sohn, Trevor Parsons. Problem Determination in Large Enterprise Systems. Presented at the Software Engineering Tools For Tomorrow (SWEFT) 2006 Conference, T.J. Watson, New York, Oct 17-19, 2006. (Abstract only available)

Trevor Parsons, Adrian Mos, John Murphy. Non-Intrusive End to End Run-time Path Tracing for J2EE Systems. IEE Proceedings Software, August 2006.

Trevor Parsons, John Murphy. The 2nd International Middleware Doctoral Symposium: Detecting Performance Antipatterns in Component-Based Enterprise Systems. IEEE Distributed Systems Online, vol. 7, no. 3, March 2006.

Trevor Parsons. A Framework for Detecting Performance Design and Deployment Antipatterns in Component Based Enterprise Systems. In Proceedings of the 2nd International Middleware Doctoral Symposium, ACM Press, art. no. 7, Grenoble, France, 2005.

Trevor Parsons. A Framework for Detecting, Assessing and Visualizing Performance Antipatterns in Component Based Systems.
First Place at ACM SIGPLAN Student Research Competition, Graduate Division. In OOPSLA'04: Companion to the 19th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, pages 316-317, Vancouver, BC, Canada, 2004.
Trevor Parsons, John Murphy. A Framework for Automatically Detecting and Assessing Performance Antipatterns in Component Based Systems using Run-Time Analysis. The 9th International Workshop on Component Oriented Programming, part of the 18th European Conference on Object Oriented Programming, Oslo, Norway, June 2004.

Trevor Parsons, John Murphy. Data Mining for Performance Antipatterns in Component Based Systems Using Run-Time and Static Analysis. Transactions on Automatic Control and Control Science, Vol. 49 (63), No. 3, pp. 113-118, ISSN 1224-600X, May 2004.
CHAPTER ONE

Introduction

Main Points

• Performance is a major issue during the development of enterprise applications.
• System complexity leads to a lack of understanding, and consequently poor design decisions are commonly made by developers.
• Poor system design is often responsible for a badly performing system.
• Current performance testing tools do not address performance design issues and are limited.
• There are a large number of well known design issues for enterprise systems.
• Antipatterns document well known design issues and their corresponding solutions.
• Thesis Contributions:
  – An approach for the automatic detection of performance design and deployment antipatterns in systems built on component based enterprise frameworks.
  – A portable, low overhead, non-intrusive, end-to-end run-time path tracer for distributed JEE systems.
  – A technique for the identification of interesting communication patterns in a collection of run-time paths.
1.1 Background and Motivation

In the past software developers had to be extremely careful when developing their applications, as resources were often scarce and the management of such scarce resources was a complex issue. Modern advances in software technologies, however, have allowed developers to concentrate less on issues such as performance and resource management, and instead to spend more time developing the functionality of their applications. An example of this can be seen in modern languages (Java1, C#2) that provide garbage collection facilities, freeing developers from the task of having to manage memory, which had typically been a complex and time consuming exercise. Freeing developers from having to worry about what is happening "under the hood" allows them to concentrate more of their efforts on developing the functionality of a system. This is even more obvious with enterprise level component frameworks (e.g. JEE3 or CCM4) whereby the framework can be expected to handle complex underlying issues such as security, persistence, performance and concurrency, to name but a few. Again the idea is to allow developers to concentrate on the application functionality such that the time to market is reduced. A downside of this advance in software technologies is that developers become less familiar with the mechanics of the underlying system and, as a result, can make decisions during development that have an adverse effect on the system.

Performance is a major issue for developers building large scale, multi-user enterprise applications. In fact, recent surveys have shown that a high percentage of enterprise projects fail to meet their performance requirements on time or within budget5 6. This leads to project delays and higher development costs, and results from the fact that developers often do not have a complete understanding of the overall system behaviour.
Figure 1.1 shows a typical enterprise application made up of a number of different physically distributed servers. Each server can in turn be made up of a large number of software components that interact to service different client requests. Understanding the run-time behaviour of such systems can be a difficult task, and consequently it is common that developers are unaware of the performance implications of their design decisions.

Current development and testing tools fail to address this issue of understanding enterprise system behaviour. For example, most of today's performance tools merely profile the running system and present performance metrics to the tool user. The

1 The Java Technology, Sun Microsystems, http://java.sun.com/
2 The C# language, Microsoft, http://msdn2.microsoft.com/en-us/vcsharp/aa336809.aspx
3 Java Enterprise Edition, Sun Microsystems, http://java.sun.com/javaee/
4 The CORBA Component Model specification, The Object Management Group, http://www.omg.org/technology/documents/formal/components.htm
5 Ptak, Noel and Associates, "The State of J2EE Application Management: Analysis of 2005 Benchmark Survey", http://www.ptaknoelassociates.com/members/J2EEBenchmarkSurvey2005.pdf
6 Jasmine Noel, "J2EE Lessons Learned", SoftwareMag.com, The Software IT Journal, January 2006. http://www.softwaremag.com/L.cfm?doc=2006-01/2006-01j2ee
Figure 1.1: Typical Enterprise Architecture

volume of data produced when profiling even a single user system can be extremely large. When profiling multi-user enterprise applications, where a typical load may be in the order of thousands of users, the amount of data produced can be truly overwhelming. Often developers are required to sift through and correlate this information looking for bottlenecks in their systems. Furthermore, even when developers find issues in their applications using these tools, it is common that they are unsure as to how to go about rectifying the issue. There is a clear need for more advanced performance tools that not only profile the running system, but that also analyse the data produced to identify potential issues in the application. While there has been research in the area of debugging tools (e.g. [95] [145] [55] [14] [47]) which allow for automatic low-level bug detection, often it is the case that low-level optimizations or fixes will not be enough to enhance system efficiency such that performance requirements are met. In many situations an overhaul of the system design is required.

There are a large number of well known design mistakes that are consistently made by developers building these systems. Such issues have been documented in the form of software design antipatterns [36]. Similar to software design patterns, which document best practices in software development, software antipatterns document common mistakes made by developers when building software systems. However, as well as documenting the mistake, antipatterns also document the corresponding solution to the problem. Thus not only can they be used to identify issues in software systems, but they can also be used to rectify these issues by applying the solution provided. A more complete and detailed definition of software patterns and antipatterns is given in Sections 2.6 and 2.7 respectively.
1.2 Thesis Overview

In light of the limitations of current performance tools and of the benefits of software antipatterns, we have developed an approach to automatically identify performance design and deployment antipatterns in systems built on enterprise component-based frameworks. This approach takes the burden of having to sift through large volumes of monitoring data in search of design flaws away from developers, and instead automates this process. Well known performance design flaws can be identified automatically. Identified issues are presented with related contextual information and a corresponding solution, such that the problem can be easily addressed.

The approach works by reconstructing the run-time design of the application under test. The reconstructed design can subsequently be checked for well known predefined antipatterns. From a high level this is achieved (a) by monitoring the running system to collect the information required for antipattern detection, (b) by performing analysis on the monitoring data to summarise it and to identify relationships and patterns in the data that might suggest potential design flaws, (c) by representing the analysed data in a design model of the system and (d) by loading the design into a rule engine such that antipatterns (pre-defined as rules) can be detected. The approach has been realised in the Performance Antipattern Detection (PAD) tool. The tool has been designed for the Java Enterprise Edition (JEE) technology.

The remainder of the thesis is structured as follows: Chapter 2 gives background information on related technologies and related work. Chapter 3 gives a more detailed overview of our approach, discusses our research methodology and outlines a number of criteria that we use to validate our work. In this chapter we also give an overview of software design antipatterns, with particular focus on performance antipatterns.
Chapter 4 outlines the different monitoring approaches that are required for antipattern detection in a component based enterprise system, and how they can be performed in a portable manner. Chapter 5 details a number of advanced analysis techniques that are applied to identify interesting relationships and patterns in the run-time data. In particular it presents an approach for identifying frequent or resource intensive communication patterns between components using techniques from the field of data mining. In this chapter we also show how the data collected from enterprise systems under load can be reduced and summarised. Chapter 6 shows how a rule engine approach can be used to identify antipatterns in the reconstructed run-time design of the system. In this chapter we also categorise JEE performance design and deployment antipatterns into groups based on the data required to detect them. Chapter 7 presents different sets of results from a range of tests that we have performed to validate our research. Finally, Chapter 8 gives our conclusions and ideas on future work in this area.
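Step (d) of the approach, checking a reconstructed design model against antipattern rules, can be illustrated with a small sketch. The PAD tool itself encodes antipatterns as rules for a rule engine (Chapter 6); the following is only a simplified Python stand-in, and the component names, attributes and threshold are hypothetical, not taken from the tool's actual implementation.

```python
# Simplified, illustrative sketch of rule-based antipattern detection over a
# reconstructed design model. All names and thresholds here are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Component:
    name: str
    kind: str                                  # e.g. "stateful-session", "entity"
    calls: dict = field(default_factory=dict)  # callee name -> invocation count
    state_fields: int = 0                      # conversational state held

def needless_session(model):
    """Rule sketch: flag stateful session components holding no state."""
    return [c.name for c in model
            if c.kind == "stateful-session" and c.state_fields == 0]

def fine_grained_remote_calls(model, threshold=50):
    """Rule sketch: flag component pairs with unusually many calls."""
    hits = []
    for c in model:
        for callee, count in c.calls.items():
            if count > threshold:
                hits.append((c.name, callee, count))
    return hits

# A toy "reconstructed design" built from monitoring data (hypothetical).
model = [
    Component("CartBean", "stateful-session", {"OrderBean": 120}, state_fields=0),
    Component("OrderBean", "entity"),
]
print(needless_session(model))           # -> ['CartBean']
print(fine_grained_remote_calls(model))  # -> [('CartBean', 'OrderBean', 120)]
```

In the real tool the matching is performed by a rule engine over facts derived from the design model, which allows new antipattern rules to be added without changing detection code; the predicate functions above merely mimic that separation of rules from model.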
1.3 Thesis Contributions and Statement

The first major contribution of this thesis is an approach for the automatic detection of design and deployment antipatterns in systems built using component based enterprise frameworks [125] [129] [130] [131] [132] [133]. This approach builds on current performance tools by performing analysis on the data collected (i.e. run-time information and component meta-data). The analysis reconstructs the system design and identifies performance design flaws within it. The approach has been implemented for the JEE technology in the form of the PAD tool; however, it could potentially be applied to other component based enterprise frameworks (e.g. CCM). This solution has been successfully applied to both a sample and a real JEE application and has a number of key advantages.

Firstly, it reduces and makes sense of the data collected by many of today's performance profilers. This work makes use of statistical analysis and data mining techniques to summarise the data collected and to find patterns of interest that might suggest performance problems. Thus, it takes the onus away from developers who currently have to carry out this tedious task manually.

Secondly, while most of today's performance tools tend to focus on identifying low level hotspots and programming errors (e.g. memory leaks, deadlocks), this work focuses on analysing the system from a performance design perspective. Since design has such a significant effect on performance [43], it is essential that work is carried out in this area.

Thirdly, unlike with many of today's performance tools, problems identified are annotated with descriptions of the issue detected, as well as a solution that can be applied to alleviate the problem. This approach of identifying and presenting antipatterns to developers helps them understand the mistakes that have been made, and the underlying reason as to why performance was affected.
Developers can learn from using our tool, and thus it may be less likely that the same mistakes are made in the future. This approach also allows developers to easily rectify the situation by applying the solution provided. In fact, the antipatterns presented provide a high level language that developers and management alike can use to discuss such problems when they occur.

The second major contribution of this work is a portable, low overhead, non-intrusive, end-to-end run-time path tracer for JEE systems [128]. This is the first completely portable approach for collecting end-to-end run-time paths across all server side tiers of a distributed JEE application. It is non-intrusive insofar as it does not require any modifications to the application or middleware source code. The monitoring approach instead makes use of standard JEE mechanisms to intercept calls made to the instrumented components. A run-time path [44] contains the control flow (i.e. the ordered sequence of methods called to service a user request), resources
and performance characteristics associated with servicing a request. Such information is utilised by our PAD tool to detect antipatterns. By analysing run-time paths one can easily see how system resources are being used, how the different components in the system interact and how user requests traverse the different tiers that make up the system. In fact these paths can also be used for Object Tracking, i.e. to monitor particular objects' life cycles across the different user requests. In this work we show how run-time paths can be used to manually and automatically reverse engineer a JEE application. We also show how the reconstructed design can be used for either manual or automatic identification of performance design flaws. For example, the PAD tool makes use of run-time paths to identify the (run-time) component relationships, communication patterns and object usage patterns in a JEE system. Results are given for this monitoring approach which show that it produces a low overhead on the instrumented system and that it can be applied in a portable manner.

The third and final major contribution of this work is a technique for the identification of interesting communication patterns in a collection of run-time paths [126]. More precisely, we have applied a data mining technique, Frequent Sequence Mining (FSM), to identify sequences of interest (e.g. frequently repeating method sequences and resource intensive loops) across a transactional database of run-time paths by using alternative support counting techniques. In this work we also discuss scalability problems (in terms of both the algorithm runtime and the amount of data produced) related to applying FSM to run-time paths, and give solutions to these issues. We show how the identified sequences can be used to highlight design flaws in enterprise applications that lead to poor system performance.
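The motivation for alternative support counting can be illustrated with a small sketch: standard per-path (per-transaction) support cannot distinguish a sequence that occurs once per path from one repeated many times inside a single path, yet that repetition is exactly what betrays a resource intensive loop. The sketch below, with hypothetical method names and restricted to contiguous pairs, is only an illustration of this idea, not the algorithm developed in Chapter 5.

```python
# Illustrative sketch: two ways of counting "support" for a method sequence
# across run-time paths. Per-path support misses loops repeated within one
# path; occurrence counting exposes them. Method names are hypothetical.
from collections import Counter

def contiguous_pairs(path):
    """Length-2 contiguous subsequences of a path's method call sequence."""
    return list(zip(path, path[1:]))

def support_counts(paths):
    per_path = Counter()     # number of paths containing the pair at least once
    occurrences = Counter()  # total occurrences, repeats within a path included
    for path in paths:
        pairs = contiguous_pairs(path)
        occurrences.update(pairs)
        per_path.update(set(pairs))   # count each pair once per path
    return per_path, occurrences

paths = [
    ["Servlet.doGet", "Session.find", "Entity.load", "Entity.load", "Entity.load"],
    ["Servlet.doGet", "Session.find", "Entity.load"],
]
per_path, occ = support_counts(paths)
pair = ("Entity.load", "Entity.load")
# The repeated-load loop is invisible to per-path support (1 of 2 paths)
# but stands out in the occurrence count.
print(per_path[pair], occ[pair])  # -> 1 2
```

A real FSM algorithm mines subsequences of arbitrary length, not just contiguous pairs, and prunes the search space using the chosen support measure; the contrast between the two counters is the point being illustrated here.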
The PAD tool makes use of this analysis technique to identify interesting component communication patterns in a JEE system that may indicate the presence of particular antipatterns.

Following the above contributions, the fundamental thesis of this work can be stated as follows: Performance design and deployment antipatterns can be automatically detected in component based enterprise systems by analysing run-time data and component meta-data.

1.4 Key Assumptions and Scope

The work in this thesis is focused on component based systems as defined in Section 2.2. As such, it is highly likely that the source code of the application to be analysed is not available in its entirety, as components may have been developed by third parties. Thus we assume source code is not available for analysis of the system. For such systems bytecode analysis may also be problematic due to security restrictions or licensing constraints. Instead, we assume that a running implementation of the system to be analysed is available such that dynamic data can be collected and utilised for analysis.
We also assume that a realistic testing scenario is available which reflects how the system will be used in production. We do not address the issue of how such testing scenarios can be obtained in this work; however, research in this area already exists. For example, Weyuker and Vokolos have outlined an approach for the development of performance test cases [176]. In this literature five typical steps required to develop performance test cases are outlined. Alternatively, Ho et al. [92] propose an evolutionary approach to performance test design based on their Performance Requirements Evolution Model. The authors claim that more precise and realistic performance tests can be created incrementally during the development process through customer communication or performance model solving. In addition, agile development techniques such as test-driven development [25] promote the design of test cases before developers begin to code. Recently, work has been presented which discusses how performance tests can be incorporated into the test driven development process [96], allowing for early availability of performance testing scenarios.

Our approach is applicable to applications built on component based enterprise frameworks. However, our research has thus far only been applied to synchronous components, and has not, for example, been applied to message driven beans, which are asynchronous components in the JEE technology. Thus, our scope is limited to synchronous applications. Our plans for future work outline how this approach could potentially be applied to asynchronous components (see Section 8.3).
CHAPTER TWO

Background

In this chapter we introduce related research areas and technologies. We begin by discussing the research area of performance engineering. Next we give an overview of component based software, giving a definition for a software component and discussing component frameworks. We also give background information on the Java Enterprise Edition technology, which is the enterprise framework that our work has been applied to. We focus specifically on the Enterprise Java Bean component technology and give details in this area related to our research. We present an overview of the state of the art in performance tools, discussing techniques for load generation and performance profiling. We particularly focus on performance profiling tools for the Java technology. Furthermore, we give an overview of software architecture, software patterns, and software antipatterns. An overview of research in the area of reverse engineering is also presented. In this section we outline why previous approaches are less suitable for distributed component based applications. The current state of the art of research in the area of software pattern/antipattern detection is also discussed. Finally, we introduce the area of knowledge discovery in databases, and data mining techniques relevant to this work.
Main Points

• Current performance analysis techniques, e.g., modelling, are inaccurate and time consuming when applied to component based enterprise systems. Thus, in industry, performance analysis is usually deferred until performance testing begins using the currently available performance testing tools.

• Component technologies such as EJB are increasingly being adopted to provide flexible, manageable and reusable solutions for complex software systems. However, poor system performance is common in these systems.

• System architecture focuses on issues related to the overall system structure and is said to be non-local, whereas software design focuses on local issues.

• Enterprise design plays a significant role in a system's overall performance. Best practices in design have been well documented in the form of design patterns.

• Well known design issues consistently occur in enterprise applications and have been well documented, along with their corresponding solutions, in the form of design antipatterns.

• Performance testing tools for complex multi-user enterprise applications are limited and merely profile the running system, presenting vast amounts of data to the tool user. There is a clear need for more advanced tools that take the onus away from the developer of having to sift through this data, and that automatically analyse the data produced.

• Detailed documentation is generally not available for enterprise applications. Thus, it can be difficult for developers to comprehend the overall application design.

• Current reverse engineering/design pattern detection/antipattern detection techniques are heavily based on static analysis and are unsuitable for component based systems.

• Data mining techniques can be applied to extract knowledge from vast volumes of data.
2.1 Performance of Software Systems

The performance of a software system has been described as an indicator of how well the system meets its requirements for timeliness [154]. Smith and Williams [154] describe timeliness as being measured in either response time or throughput, where response time is defined as the time required to respond to a request and throughput is defined as the number of requests that can be processed in some specific time interval. Furthermore, they define two important dimensions of software performance timeliness: responsiveness and scalability. Responsiveness is defined as the ability of a system to meet its objectives for response time or throughput. The ability to continue to meet these objectives, as the demand on the system increases, is defined as the system's scalability. The aim of performance engineering is to build systems that are both responsive and scalable.

To date a vast amount of performance engineering research has focused on system analysis through performance models. Performance models are created based on system artifacts and various relevant estimates. Some of the most common performance model classes are Queuing Networks (QN) (or extensions, such as Extended QN and Layered QN), Stochastic Petri Nets (SPN), and Stochastic Process Algebras (SPA). Performance models can be evaluated using simulation techniques or analytical methods, in order to predict performance indices such as throughput, response times, or resource utilization. A comprehensive survey of modelling approaches for performance prediction is presented in [15].

However, modelling today's enterprise applications with a high degree of accuracy can be a difficult and time consuming task. This results from the fact that these systems are often very large and complex and made up of black box components, the internals of which are generally unknown (e.g. application servers).
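As a small illustration of analytical evaluation (using the simplest textbook queueing model rather than the extended model classes above), a single M/M/1 queue with arrival rate lambda and service rate mu predicts a utilisation of lambda/mu and a mean response time of 1/(mu - lambda):

```java
public class MM1 {
    /** Mean response time of an M/M/1 queue: R = 1/(mu - lambda).
        Only valid while the queue is stable, i.e. lambda < mu. */
    static double responseTime(double lambda, double mu) {
        if (lambda >= mu) throw new IllegalArgumentException("queue is unstable");
        return 1.0 / (mu - lambda);
    }

    /** Server utilisation: rho = lambda / mu. */
    static double utilisation(double lambda, double mu) {
        return lambda / mu;
    }

    public static void main(String[] args) {
        // 80 requests/s arriving at a server that can handle 100 requests/s:
        System.out.println(utilisation(80, 100));   // 0.8
        System.out.println(responseTime(80, 100));  // 0.05 s, i.e. 50 ms
    }
}
```

Even this toy model shows why such analysis is attractive: it predicts, for instance, how response time degrades as load approaches capacity. The difficulty discussed above lies in obtaining accurate parameters (the service rates) for black box enterprise components.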
Performance metrics required to populate performance models can thus not be easily obtained. For enterprise applications, accurate performance metrics can often only be obtained through performance testing of a running system [58]. Recently, Liu et al. [76] have used a combination of performance modelling and benchmarking techniques to allow for the population of performance models of enterprise applications. Their initial results give accurate performance prediction for small sample systems. A drawback of this approach is the lack of tool support which would allow this technique to be easily reused. From our experiences with large software houses, it seems that performance modelling of enterprise applications is rarely performed. Most opt for performance testing using available performance testing tools. Work in the area of performance testing, however, has been very much lacking [176] and thus performance testing, especially in the case of large enterprise applications, can be a difficult task [76] [161]. This comes from the fact that today's performance testing tools are quite limited, insofar as they generally focus on simply collecting
data from a running system (i.e. profiling) and presenting this data to the user. These tools tend to focus on low level programming bugs and do not address many of the issues that lead to poor system performance (e.g. design issues).

2.2 Component Based Software

2.2.1 Software Components

There are numerous definitions of what software components are or should be (see, for example, the Beyond Objects column of Software Development magazine, articles by Bertrand Meyer and Clemens Szyperski, archived at http://www.ddj.com/). To be specific, for the purpose of this thesis, we use Szyperski's definition of a software component: "A software component is a unit of composition with contractually specified interfaces and explicit context dependencies only. A software component can be deployed independently and is subject to composition by third parties" [157]. Defining a software component as a unit of composition simply means that the purpose of a component is to be composed with other components. A component based application is assembled from a set of collaborating components. To be able to compose components into applications, each component must provide one or more interfaces which provide a contract between the component and its environment. The interface clearly defines which services the component provides and therefore defines its responsibility. Usually, software depends on a specific context, such as database connections or other system resources being available, or other components that must be available for a specific component to collaborate with. In order to support composability of components, component dependencies need to be explicitly specified. A component can be independently deployed, i.e. it is self-contained and changes to the implementation of a component do not require changes to other components. Of course, this is only true as long as the component interface remains compatible. Finally, assemblers of component based applications are not necessarily the developers of the different components. That is, components can be deployed by third parties and are intended to be reused.

This definition of a software component leaves many details open, for example, how components interact, what language(s) can be used for their development, and for what platform. Component frameworks further define the notion of a component by detailing these issues.

2.2.2 Component Frameworks

The key goal of component technology is independent deployment and assembly of components. Component frameworks are the most important step for achieving this
aim [157]. They support components conforming to certain standards (or component models) and allow instances of these components to be plugged into the component framework. The component framework establishes environmental conditions for the component instances and regulates the interaction between component instances. A key contribution of component frameworks is partial enforcement of architectural principles. By forcing component instances to perform certain tasks via mechanisms under the control of a component framework, the component framework can enforce its policies on the component instances. This approach helps prevent a number of classes of subtle errors that can otherwise occur.

There are numerous component frameworks that exist today. Examples include EJB, CCM, SOFA [140] and Fractal [37]. Each framework contains its own component model, i.e. a set of features that components satisfy. Component models generally conform to either flat component models or hierarchical models. Flat component models (e.g. EJB, CCM) define only primitive components, whereby indivisible entities are directly implemented in a programming language. Hierarchical models (SOFA, Fractal) also define composite components which are created via nesting of other components. The research in this thesis aims at solving issues related to flat component models. In particular we focus on EJB. EJB is part of a wider enterprise framework (Java Enterprise Edition) for building enterprise level applications (see section 2.3). EJB has been selected since it is a well established technology that is currently used in industry to develop enterprise applications. There is also a body of work detailing best practices and bad practices for this technology (see sections 2.6 and 2.7). On the other hand, the hierarchical component models have mainly been used by the research community and best practices in these areas are less well defined. EJB is considered a contextual composition framework.
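A plain-Java sketch may help make the idea of contextual composition concrete: a hypothetical context object enacts a declared transactional attribute at the context boundary before delegating to the component instance. The names and the toy transaction log below are illustrative only, not EJB APIs:

```java
import java.util.ArrayList;
import java.util.List;

public class ContextSketch {
    enum TxAttribute { REQUIRED, REQUIRES_NEW }   // a declared boundary condition

    interface Component { void doWork(List<String> log); }

    /** A context intercepts calls crossing its boundary and enacts the
        declared policy before and after delegating to the instance. */
    static class TxContext {
        private final TxAttribute attribute;
        private final Component instance;

        TxContext(TxAttribute attribute, Component instance) {
            this.attribute = attribute;
            this.instance = instance;
        }

        void invoke(List<String> log) {
            log.add("begin " + attribute);   // enactment at the context boundary
            instance.doWork(log);            // delegate to the component instance
            log.add("commit");
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        TxContext ctx = new TxContext(TxAttribute.REQUIRES_NEW,
                                      l -> l.add("business logic"));
        ctx.invoke(log);
        System.out.println(log);  // [begin REQUIRES_NEW, business logic, commit]
    }
}
```

The component instance itself contains no transaction code; the policy is declared as an attribute and enforced entirely at the boundary, which is the essential property that lets a framework add services such as security or load balancing in the same way.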
Contextual composition frameworks allow components to specify boundary conditions describing properties that the runtime context must meet [166]. Composition performed by such frameworks is based on the creation of contexts and the placement of component instances in the appropriate contexts. For example, a component framework for transactional computation can be formed by supporting transactional attribution of components (for example, "this component's instances need to be used in a new transaction") and transactional enactment at the boundary of contexts. This approach can be used to create frameworks for other properties such as security, load balancing, management, etc.

Any component instance in a context can potentially be accessed from outside its context. This context, however, gets an opportunity to intercept all messages crossing the context boundaries. The interception remains invisible to instances (e.g. objects) both external and internal to the context. Current technology support for contextual composition includes Microsoft Transaction Server (MTS) (see http://www.microsoft.com/com/wpaper/revguide.asp), EJB containers, and CCM containers. We introduce the EJB technology in the following sections. The run-time services we discuss in section 5.3 are created as a result of contextual composition.

2.3 The Java Enterprise Edition

The Java Enterprise Edition (JEE) is a component technology which defines a standard (or a set of standards) for developing multi-tier enterprise applications. JEE, formerly the Java 2 Enterprise Edition (J2EE), is an enterprise component framework for the Java technology. The specification promotes a multi-tiered distributed architecture for enterprise applications. Figure 2.1 shows a typical JEE architecture consisting of 4 main tiers: a client tier, a presentation or web tier, a business tier and an enterprise information systems tier. JEE specifies different component types for implementing the various enterprise application tiers. Naturally, clients reside in the client tier and can be in the form of stand-alone Java applications or web browsers. In the following subsections we detail each of the server-side tiers and give details on the components that they can consist of.

Figure 2.1: Typical JEE Architecture

2.3.1 Web Tier

The JEE web tier provides a run-time environment (or container) for web components. JEE web components are either servlets or pages created using the JavaServer Pages
technology (JSP) (see http://java.sun.com/products/servlet/). Servlets are Java programming language classes that dynamically process requests and construct responses. They allow for a combination of static and dynamic content within web pages. JSP pages are text-based documents that execute as servlets but allow a more natural approach to creating static content, as they integrate seamlessly into HTML pages. JSPs and servlets execute in a web container and can be accessed by clients over HTTP (e.g. from a web browser). The servlet filter technology is a standard JEE mechanism that can be applied to components in the web tier to implement common pre- and post-processing logic. It is discussed in detail in section 4.2.5.1.

2.3.2 Business Tier

Enterprise Java Beans (EJBs) (see http://java.sun.com/products/ejb/docs.html) are the business tier components and are used to handle business logic. Business logic is logic that solves or meets the needs of a particular business domain, such as banking, retail, or finance. EJBs run in an EJB container and often interact with a database in the EIS tier in order to process requests. Clients of the EJBs can be either web components or stand-alone applications. EJB is the core of the JEE platform and provides a number of complex services such as messaging, security, transactionality and persistence. These services are provided by the EJB container to any EJB component that requests them. More details on the EJB component model are given in section 2.4.

2.3.3 Enterprise Information System Tier

Enterprise information systems provide the information infrastructure critical to the business processes of an enterprise. Examples of EISs include relational databases, enterprise resource planning (ERP) systems, mainframe transaction processing systems, and legacy database systems. The JEE Connector architecture (see http://java.sun.com/j2ee/connector/) defines a standard architecture for connecting the JEE platform to heterogeneous EIS systems. For example, a Java Database Connectivity (JDBC) connector is a JEE Connector Architecture compliant connector that facilitates the integration of databases with JEE application servers. JDBC (see http://java.sun.com/products/jdbc/) is an API and specification to which application developers and database driver vendors must adhere. Relational Database Management Systems (RDBMS) vendors or third party vendors develop drivers which adhere to the JDBC specification. Application developers make use of such drivers to communicate with the vendors' databases using the JDBC API. The main advantage of JDBC is that it allows for portability and avoids vendor lock-in. Since all drivers must adhere to the
same specification, application developers can replace the driver that they are using with another one without having to rewrite their application.

2.4 The Enterprise JavaBean Technology

The Enterprise Java Beans architecture is a component architecture for the development and deployment of component based distributed applications. It is designed to simplify and reduce the costs of the development and management processes of large-scale, distributed applications. Applications built using this technology are capable of being scalable, transactional, and multi-user secure. EJB provides the distributed platform support and common services such as transactions, security, persistence and lifecycle management. EJB also defines a flexible component model which allows for components of different types that are suitable for specific tasks. Developers make use of the different component types to implement the application business logic. Subsequently, EJBs are deployed and managed by EJB containers, as part of a JEE application server. EJB containers provide middleware services and manage the EJB lifecycle during runtime. These processes can be configured via XML documents, referred to as EJB deployment descriptors.

Physically, EJB consists of two things [148]:

The specification (version 2.0, http://java.sun.com/products/ejb/docs.html), which defines:

• The distinct "EJB Roles" that are assumed by the component architecture.
• A component model.
• A set of contracts: component-platform and component-client.

A set of Java interfaces:

• Components and application servers must conform to these interfaces. This allows all conforming components to inter-operate. Also, the application server can manage any components that conform to the interfaces.

2.4.1 The EJB Roles

The EJB specification defines the following roles, which are assumed by the component architecture:

• Enterprise Bean Provider: The enterprise bean provider is typically an application domain expert. The bean provider develops the reusable enterprise beans that typically implement business tasks or business entities.
• Application Assembler: The application assembler combines enterprise beans into larger deployable application units.

• Deployer: The deployer takes the ejb-jar files produced by either the bean provider or the application assembler and deploys the enterprise beans contained in the ejb-jar files in a specific operational environment. The operational environment includes the EJB server and container.

• EJB Server Provider and EJB Container Provider: The container provider supplies an EJB container (the application server). This is the runtime environment in which the beans live. The container provides middleware services to the beans and manages them. The server provider is the same as the container provider; Sun has not yet differentiated between them.

• System Administrator: The system administrator is responsible for the upkeep and monitoring of the deployed system and may make use of runtime monitoring and management tools provided by the EJB server provider.

2.4.2 The EJB Component Model

EJB is built on top of object technology (Java). An EJB component consists of a business interface, an implementation class, a home interface and configuration settings (defined in an XML deployment descriptor). All of these, except for the deployment descriptor, are Java artifacts (i.e. classes or interfaces).

The EJB implementation class contains the bean business logic written by the enterprise bean provider. The EJB implementation class is a Java object that conforms to a well defined interface and obeys certain rules. The interface it conforms to depends on the bean type. The rules are necessary in order for the bean to be able to run in a container. Access to the implementation class can be obtained using the EJB home interface. The home interface defines methods for creating, destroying and finding EJBs (i.e. lifecycle methods). The home interface can either be local or remote.
Local interfaces allow access from clients within the same JVM, whereas remote interfaces allow access from remote clients (e.g. on another JVM running on the same machine or on a JVM running on a physically distributed machine). In fact, an EJB component can have both local and remote interfaces; however, this is not recommended [161]. The bean implementation business methods are exposed through the business interface. Similar to the home interface, the business interface can be exposed locally or remotely (or both).

An EJB component also requires configuration settings for deployment. These settings are defined in an XML deployment descriptor. The information in the deployment
descriptor details the different container services that are required by the EJB. For example, a deployment descriptor can be used to declare how the container should perform lifecycle management, persistence, transaction control, and security services. EJB 2.0 defines three different kinds of enterprise beans, namely session beans, entity beans and message-driven beans.

Session Beans: A session bean is an action bean that performs work for its client, shielding the client from complexity by executing business tasks inside the server. A session bean has only one client. When the client terminates, the session appears to terminate and is no longer associated with the client. The life of a session bean spans the length of the session (or conversation) between the bean and the client. Session beans are not persistent and typically they do not survive application server crashes or machine crashes. They are in-memory objects that live and die with their surrounding environments. Session beans hold conversations with clients. A conversation is an interaction between a client and the bean. The two subtypes of session beans are stateful session beans and stateless session beans. Each is used to model different types of conversations.

Stateful Session Beans: A stateful session bean is a bean that is designed to service business processes that span multiple method requests or transactions. Stateful session beans retain state on behalf of an individual client. If a stateful session bean's state is changed during a method invocation, that same state will be available to that same client upon the following invocation.

Stateless Session Beans: A stateless session bean is a bean that holds conversations that span a single method call. They are stateless because they do not hold multi-method conversations with their clients.
Except during method invocation, all instances of a stateless bean are equivalent, allowing the EJB container to assign an instance to any client. Because stateless session beans can support multiple clients, they can offer better scalability for applications that require a large number of clients. Typically, an application requires fewer stateless session beans than stateful session beans to support the same number of clients.

Entity Beans: Entity beans are persistent data components. Entity beans are enterprise beans that know how to persist themselves permanently to durable storage (e.g. a database). They are physical, storable parts of an enterprise. Entity beans differ from session beans in a number of ways. They are persistent, and allow shared access. They have a unique identifier, enabling a client to identify a particular entity bean. Entity beans can also persist in relationships with other entity beans. Entity beans can be persisted in two ways: either using bean-managed persistence or container-managed persistence. Container-managed persistent beans are the simplest for the bean developer to create. All logic for synchronizing the bean's state with the database is handled automatically by the container. Thus, the beans do not contain any database access calls, and as a result the bean's code is not tied to a specific persistent storage
mechanism (database). A bean-managed persistent entity bean is an entity bean that must be persisted by hand. The component developer must write code to translate the in-memory fields into an underlying data store.

Message-driven Beans: A message-driven bean is an enterprise bean that allows EJB applications to process messages asynchronously. They rely on the Java Message Service (JMS) technology (http://java.sun.com/products/jms/). Message-driven beans act as JMS message listeners. The messages may be sent by any JEE component (an application client, another enterprise bean, or a web component) or by a JMS application or system that does not use JEE technology. A message-driven bean does not have component interfaces. The component interfaces are absent because the message-driven bean is not accessible via the Java RMI API; it responds only to asynchronous messages. One of the most important aspects of message-driven beans is that they can consume and process messages concurrently, because numerous instances of the MDB can execute concurrently in the container. This capability provides a significant advantage over traditional JMS clients. As discussed in section 1.4, we have not applied our research to asynchronous components. This is a direct result of the fact that our run-time path tracing approach (see section 4.2) cannot currently be used to monitor message-driven beans. Our plans for future work suggest how this problem may be addressed (see section 8.3).

2.4.3 EJB Runtime

An EJB component contains a bean implementation class, a business interface, a home interface and an XML deployment descriptor, all of which are supplied by the bean provider. To integrate the component into the container environment, the container automatically generates "glue code" that allows the component to implicitly make use of the container services. In fact, enterprise beans are not fully-fledged remote objects. When a client accesses an EJB, the client never invokes methods directly on the actual bean instance. Instead, the invocation is intercepted by the EJB container and delegated to the bean instance. The interception is performed by the EJBObject. The EJBObject is generated by the container (either during deployment or at run-time) and provides the enterprise bean with networking capabilities and container services such as transactions and security. The EJBObject replicates and exposes every business method that the bean exposes. It is generated from the business interface supplied by the bean provider. Similarly, an EJBHome object is generated from the home interface. The EJBHome object exposes the same methods as this interface and acts as a factory object for EJBObjects. That is, the EJBHome object is responsible for creating and destroying EJBObjects. In order to understand how the various component constituents work together, we give an example of the various steps that are performed by a client
and by the container when a bean is invoked. To create an instance of an EJB, a client must first obtain an instance of an EJBHome object, generated by the container. The EJBHome object is bound to the component name and available at run-time through the system's naming directory, accessed through the Java Naming and Directory Interface (JNDI) (http://java.sun.com/products/jndi/). Thus, to invoke an EJB, a client performs the following steps (see figure 2.2):

(1) It first obtains a reference to the EJBHome object that the container has generated. The reference is looked up in the system naming directory via JNDI. The client calls the required construction method on the home object.

(2) The EJBHome object instructs the container to create a new instance or retrieve an existing instance of the component, and returns it to the client. The actual Java object returned is an instance of the container-generated EJBObject class that corresponds to the bean's component interface.

(3) The client invokes the business method on the returned EJBObject, transparently, through the component interface. The EJBObject performs the required container services (specified in the XML deployment descriptor) and calls the corresponding business method on the bean's implementation object, an instance of the bean provider's bean class.

Figure 2.2: Client Invoking an EJB
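The interception-and-delegation step can be mimicked in plain Java with a dynamic proxy standing in for the container-generated EJBObject. This is a simplified sketch of the idea only, not the actual container machinery, and the Account names are hypothetical:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

public class EjbObjectSketch {
    interface Account { int balance(); }           // the business interface

    static class AccountBean implements Account {  // the bean provider's class
        public int balance() { return 42; }
    }

    /** Returns a proxy that intercepts every call, performs "container
        services", then delegates to the bean instance, which is the role
        played by the container-generated EJBObject. */
    static Account wrap(Account bean, List<String> serviceLog) {
        InvocationHandler handler = (proxy, method, args) -> {
            serviceLog.add("container service before " + method.getName());
            Object result = method.invoke(bean, args);  // delegate to the bean
            serviceLog.add("container service after " + method.getName());
            return result;
        };
        return (Account) Proxy.newProxyInstance(
                Account.class.getClassLoader(),
                new Class<?>[] { Account.class }, handler);
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Account account = wrap(new AccountBean(), log);
        System.out.println(account.balance());  // 42
        System.out.println(log);
    }
}
```

The client holds only the proxy and never touches the bean instance directly, which is precisely why the container can interpose services such as transactions and security transparently.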
Figure 2.3: Example EJB Deployment Descriptor

2.4.4 Deployment Settings

As shown in figure 2.2, the container generated EJBObject intercepts and delegates all calls to the bean implementation. The EJBObject supplies the bean implementation with any required services, as specified in the deployment descriptor. Figure 2.3 shows an extract from a deployment descriptor which specifies transactional attributes for a bean's methods. Such settings can have a major impact on system performance (see, for example, "Performance Tuning EJB Applications - Part I" by Mihir Kulkarni, February 2005, http://dev2dev.bea.com/pub/a/2005/02/perf tune session beans.html) and should be carefully considered. The container is also responsible for the management of the bean's life cycle events.
Figure 2.4: Stateless Session Bean Lifecycle
The management of an EJB's lifecycle is a complex process and differs from bean type to bean type. Factors which influence bean lifecycle management include the load on the system and the container configuration settings (see, e.g., http://www.precisejava.com/). Figure 2.4 illustrates the lifecycle of a stateless session bean.

When the container starts the application, it creates a pool of bean instances. The pool size can be determined by setting the value in the configuration settings. If a bean instance is required by a client, an instance is assigned from the bean pool. If no instances are available, the container can create more instances until the pool has reached its maximum size (which is also configurable). If the bean pool has already reached its maximum size and there are still no instances available, the client will be put in a queue until an instance becomes available. The pool configuration settings can have a major impact on the system performance and should be tuned according to the expected load on the system. The lifecycle of a stateful session bean and of an entity bean are similar but slightly more complicated than that of the stateless session bean (since they can both be passivated). More details on these lifecycles can be found in the literature [148]. It is sufficient to say, for the purposes of this thesis, that the configuration settings in relation to EJB lifecycles can have a major impact on system performance and need to be carefully considered.

2.5 Software Architecture

A large number of definitions exist for the term software architecture (the Software Engineering Institute at Carnegie Mellon maintains a list of definitions at http://www.sei.cmu.edu/architecture/definitions.html). One of the most cited definitions is by Bass et al. [21] and states that:

"The software architecture of a program or computing system is the structure or structures of the system, which comprise software elements, the externally visible properties of those elements, and the relationships among them."

Bass et al.
[21] also outline a number of implications of this definition. Firstly, a software architecture is essentially an abstraction, since it embodies information about the relationships between elements and the externally visible properties that are exposed to other elements, but it omits internal element information or information that does not pertain to the elements' interactions. Secondly, the definition makes it evident that systems can and do consist of more than one structure. Thirdly, it is implied by the definition that every software system has an architecture, since every system can be shown to be composed of elements and relationships between them. Fourthly, the external behaviour of each element is part of the architecture. Finally, the definition is indifferent as to whether the architecture for a system is a good or a bad one.
A software architecture is important for a number of reasons. Firstly, it becomes a vehicle for communication among the system's stakeholders [21]. System stakeholders are those concerned with the system (e.g. users, customers, software developers, management etc.). A software architecture is a common abstraction of the system and can serve as a lingua franca, i.e. an intermediate language that all stakeholders can use to discuss various aspects of the system. The different stakeholders of the system are often concerned with different system characteristics. An architecture provides a common language in which these different concerns can be expressed. Since stakeholders can be interested in different system characteristics it is important for the architecture to provide different views [51] that let them consider the architecture from different perspectives. For example, a functional view might contain an abstraction of the different system functions and their relations, whereas a code view may give an abstraction of the code in terms of objects or classes (or higher level subsystems or modules) and their relationships. Different stakeholders make use of different views to analyse the architecture according to their needs. Typical views include a functional view, a concurrency view, a code view, a physical view etc. [52]. Kruchten [104] introduced the 4+1 view model to describe a software architecture using five concurrent views. Views are essentially a mechanism that allows for the separation of concerns within the architecture, allowing for the analysis of the architecture from different perspectives. Architecture description languages (ADLs) [50] can be utilised to describe a software architecture. There have been many attempts to design such languages. However, while some have been employed in real world projects, none have been widely adopted [21]. The literature [115] provides a comparison of ADLs.
Another important reason for a system architecture is that it creates a realisation of early design decisions and allows system architects to analyse the suitability of these decisions in relation to the system requirements (e.g. performance, security, flexibility) [21]. These early design decisions manifested in the system architecture can not only impact the quality attributes of the system but can also place constraints on the actual system implementation, i.e. some technologies may be more suitable for particular architectures. The initial architecture can even have an impact on the organisational structure of the team (or teams) building the application [21]. One of the earliest design decisions is often to choose a suitable architectural style. An architectural style defines a vocabulary of components (e.g. clients, servers, databases) and connector types (e.g. procedure calls, database protocols), and a set of constraints on how they can be combined [152]. Architectural styles are found repeatedly in practice to address similar sets of demands. Finally, software architectures are also reusable assets that can be applied repeatedly to other systems exhibiting similar requirements [21].
2.6 Software Patterns

The current use of the term pattern in software engineering is derived from work by Christopher Alexander [168] in the field of contemporary architecture. Alexander's notion of a pattern was adopted by a number of software engineering researchers [26] [71] and became popular in this field mainly after work published by Gamma et al. [72]. Gabriel 13 gives the following definition of a pattern: "Each pattern is a three-part rule, which expresses a relation between a certain context, a certain system of forces which occurs repeatedly in that context, and a certain software configuration which allows these forces to resolve themselves." This definition is consistent with Alexander's original definition [168] which states that a "pattern is a three part rule which expresses a relation between a certain context, a problem and a solution." Alexander expands his definition to say that a problem relates to a certain system of forces which occurs repeatedly in a context and that the problem solution can be considered as a certain configuration which allows these forces to resolve themselves. While patterns have been documented for a number of different domains (such as patterns for contemporary architecture [168] or organisational patterns [54]) we are mainly interested in software patterns. Software patterns are usually documented according to a pattern template. Common templates for describing patterns include the Alexandrian form [168] and the GoF form [72]. A given template contains a number of elements that describe the pattern, e.g. name, problem, context, forces, solution, examples, resulting context, rationale, related patterns and known uses 14. Buschmann et al. [42] document a number of properties or benefits of patterns.
While they focus on patterns for software architecture, many of the properties hold for software patterns in general, e.g.:

• A pattern addresses a recurring problem that arises in specific situations, and presents a solution to it [42].
• Patterns document existing, well proven experience. That is, they document solutions learned through experience and avoid the need for less experienced developers to "reinvent the wheel" time and time again [72].
• Patterns provide a common vocabulary and understanding for design principles [72]. Similar to the way a software architecture can serve as a vehicle for communication (see section 2.5 above), pattern names can become part of a design language and can act as a lingua franca facilitating discussion of design issues and their solutions [42].
• Patterns support the construction of software with defined properties [42]. Patterns assist developers in meeting both functional and non-functional requirements since they can provide a skeleton of functional behaviour while at the same time they can explicitly address non-functional requirements, e.g. reusability, maintainability etc.

13 The Hillside Group, Pattern Definitions, http://www.hillside.net/patterns/definition.html
14 Patterns and Software: Essential Concepts and Terminology, by Brad Appleton, http://www.cmcrossroads.com/bradapp/docs/patterns-intro.html

Software patterns can be documented at various levels of abstraction. For example, Buschmann et al. [42] discuss patterns at three different levels of abstraction, i.e., architectural patterns, design patterns and coding patterns or idioms. Architectural level patterns are concerned with system structure. They describe predefined sets of subsystems, specify their responsibilities and include rules and guidelines for organising the relationships between them. Design patterns on the other hand tend to be at the level of objects and classes (or micro-architectures) and are used for refining subsystems or components of a software system. In the literature [72] they are defined as "descriptions of communicating objects and classes that are customized to solve a general design problem in a particular context." Eden and Kazman have also distinguished between architecture and design, stating that architecture is concerned with non-local issues whereas design is concerned with local issues [62]. Coding patterns or idioms are lower level patterns specific to a programming language [53]. Since their introduction in the area of object oriented software development [72], patterns have been documented for a range of systems and technologies 15. For example, pattern catalogs exist in areas such as enterprise systems [67] [93], embedded systems [142] and telecommunication systems [171], to name but a few. Many technology specific patterns (or idioms) also exist (e.g. for Java [80], Ajax [110] and Microsoft technologies 16). In fact, pattern catalogs even exist with particular quality attributes in mind (e.g. security [150], performance [154]). Alur et al.
[7] provide a catalog of patterns for the JEE technology which documents best practices for the design and implementation of JEE applications. Other literature in this area also exists [113] 17. The design of a JEE application plays a major role in the overall system performance. For example, it has previously been shown how the system design can influence a JEE system's scalability [43]. In fact it is well known, and recent reports 18 19 also indicate, that poor system design is a major reason as to why JEE systems often fail to meet performance requirements. Another reason why poor software design is particularly undesirable is that, unlike lower level software bugs, for example, poor software design can be particularly difficult to rectify late in development and as such can lead to major project delays. Software design best practices documented in the form of patterns can be used to help avoid design issues when developing JEE applications.

15 Handbook of Software Architecture, http://www.booch.com/architecture/index.jsp
16 Enterprise Solution Patterns Using Microsoft .NET, http://msdn2.microsoft.com/en-us/library/ms998469.aspx
17 The Server Side Pattern Repository, http://www.theserverside.com/patterns/index.tss
18 Ptak, Noel and Associates, "The State of J2EE Application Management: Analysis of 2005 Benchmark Survey", http://www.ptaknoelassociates.com/members/J2EEBenchmarkSurvey2005.pdf
19 Jasmine Noel, "J2EE Lessons Learned", SoftwareMag.com, The Software IT Journal, January 2006, http://www.softwaremag.com/L.cfm?doc=2006-01/2006-01j2ee
2.7 Software Antipatterns

Antipatterns, first suggested by Koenig [101], have been defined by Brown et al. [36] as: "a literary form that describes a commonly occurring solution to a problem that generates decidedly negative consequences." The authors [36] also state that when documented properly an "antipattern describes a general form, the primary causes which led to the general form; symptoms describing how to recognize the general form; the consequences of the general form; and a refactored solution describing how to change the antipattern into a healthier situation." Software design antipatterns thus provide the opportunity for developers to learn from past experiences. They document software design mistakes that tend to consistently reoccur. However, as well as documenting the mistake, antipatterns also document the corresponding solution. Thus they allow developers to identify design issues in their system, and to rectify these issues with the corresponding solution provided in the antipattern description. Antipatterns are complementary to software patterns and often show situations where patterns are misused. In fact, as technologies evolve, patterns can often become stale [60], i.e., what was once a best practice can in some instances become a bad practice. Examples from the JEE technology include caching with the Service Locator pattern [7], which was recommended for J2EE 1.2 but is not recommended for J2EE 1.3 20. Another example is the Composite Entity pattern [7], which has become obsolete since EJB version 2.x [61]. Figure 2.5 [36] shows the relationship between patterns and antipatterns. Software design antipatterns, like software design patterns, have been documented at a number of different levels. For example, Brown et al. [36] introduced a number of technology independent object oriented development antipatterns (as well as higher level architectural and management antipatterns).
Technology specific antipatterns have also been documented (e.g. Java [160], J2EE [161] [61]). Antipatterns for systems built on service oriented architectures (SOA) have also recently been documented 21. As with software design patterns, some antipattern catalogs focus on particular software quality attributes only. For example, Smith and Williams have presented a number of performance related antipatterns [154], while Kis has presented antipatterns focusing on security [99]. For the purposes of this thesis we focus mainly on performance related antipatterns for enterprise systems. In particular we focus on performance antipatterns related to design and deployment for JEE applications. Similar to software design antipatterns, Fowler and Beck introduced the notion of code smells [68]. Code smells are lower level symptoms of problems at the code level.

20 B. Woolf, IBM WebSphere Developer Technical Journal: Eliminate caching in service locator implementations in J2EE 1.3, http://www-128.ibm.com/developerworks/websphere/techjournal/0410_woolf/0410_woolf.html, October 2004
21 SOA antipatterns, Jenny Ang, Luba Cherbakov and Mamdouh Ibrahim, November 2005, http://www-128.ibm.com/developerworks/webservices/library/ws-antipatterns/
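The staleness example above (caching with the Service Locator) can be made concrete with a small sketch. This is an illustration only, and the ServiceLocator class, its registry and the Supplier-based lookup are hypothetical stand-ins: a real JEE implementation would perform a JNDI lookup where the Supplier sits here. The caching shown is exactly the behaviour that was a best practice under J2EE 1.2 but is no longer recommended under J2EE 1.3.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Minimal sketch of the caching Service Locator pattern discussed above.
// In a real JEE application the lookup would go through JNDI; here a
// hypothetical Supplier stands in for the (expensive) naming lookup.
public class ServiceLocator {
    private final Map<String, Object> cache = new HashMap<>();
    private final Map<String, Supplier<Object>> factories = new HashMap<>();
    private int lookups = 0; // counts how often the "expensive" lookup runs

    public void register(String name, Supplier<Object> factory) {
        factories.put(name, factory);
    }

    // Return the cached reference if present; otherwise perform the lookup
    // once and cache the result. Under J2EE 1.2 this caching was a best
    // practice; under J2EE 1.3 container-managed lookups made it stale.
    public Object lookup(String name) {
        return cache.computeIfAbsent(name, n -> {
            lookups++;
            return factories.get(n).get();
        });
    }

    public int lookupCount() { return lookups; }

    public static void main(String[] args) {
        ServiceLocator locator = new ServiceLocator();
        locator.register("OrderService", () -> new Object());
        Object first = locator.lookup("OrderService");
        Object second = locator.lookup("OrderService");
        System.out.println(first == second);        // true: cached reference
        System.out.println(locator.lookupCount());  // 1: lookup ran only once
    }
}
```

The sketch shows why the pattern dates: once the container itself caches naming lookups, the hand-rolled cache adds complexity without benefit.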
Figure 2.5: Patterns, Antipatterns and their Relationship

They may not necessarily be a problem but often their presence in the code indicates that problems exist. Catalogs of code smells are available in the literature 22 23.

22 A Taxonomy of Code Smells, http://www.soberit.hut.fi/mmantyla/BadCodeSmellsTaxonomy.htm
23 Smells within Classes, http://wiki.java.net/bin/view/People/SmellsToRefactorings

2.8 Performance Tools

When design issues lead to poor performance, developers require testing tools to identify why the system is performing poorly. Developers use performance testing tools to try to understand system behaviour and to discover how their system makes use of the system resources. Application level performance testing tools fall into two main categories, i.e. workload generation tools and performance profilers.

2.8.1 Workload Generation

In order to evaluate the performance characteristics of an application under development, a realistic workload is required to mimic how the system would be utilised by clients in a production environment. To achieve this, a synthetic workload can be automatically generated using a workload generator. Workload generators fall into two main categories, trace-based approaches and analytical approaches [17]. Trace-based approaches make use of server log files to characterise the workload of an application, whereas analytical approaches are based on mathematical models which are
usually based on statistical methods [124]. There are advantages and disadvantages associated with both approaches. For example, trace-based approaches are considered relatively easy to implement and are based on activity from a known system. However, disadvantages relate to the fact that this approach treats the workload as a black box and as such insight into the workload characteristics can be difficult to obtain. Also, it can be difficult to modify the workload to simulate future or alternative conditions. Furthermore, during development realistic logs may not be available on which to base the trace-based workload generation. Analytical approaches on the other hand can be used to create synthetic workloads and do not suffer from the drawbacks outlined above. However, they can be more difficult to construct, as an understanding of the characteristics of the expected workload is required. The most commonly used workload generators are the analytically based tools, e.g. Apache's JMeter 24 or Mercury LoadRunner 25. The literature [136] gives a representative subset of the workload generators currently available in the open literature.

2.8.2 Profiling Tools

Next we explain what we mean by the term profiling and discuss the different ways and levels of granularity in which profiling information can be collected (see section 2.8.2.1). We also give an overview of the different categories of profilers that are available for the Java technology (see section 2.8.2.2). Profiling [169] is the ability to monitor and trace events that occur during run time. This includes the ability to track the cost of these events, as well as the ability to attribute the cost of the events to specific parts of the program. A profiler, for example, may obtain information about which part of the program consumes the most CPU time, or about the parts of the program which allocate the largest amount of memory.
Performance profilers can be used in conjunction with load generators to monitor a running system and obtain the required information for performance analysis. Profilers are often described as either exact profilers or sampling based profilers [135] [30]. Exact profiling, also referred to as full profiling [64] or full instrumentation 26, captures all events of a given type that are produced during program execution (e.g. method invocations). Sampling based profilers on the other hand select a part of the entire event population with the aim of determining the characteristics of the whole program. Sampling usually involves selecting a subset of events for profiling based on certain criteria (e.g. hot paths) or time intervals [64]. Exact profiling has the advantage of being more precise than sampling but carries a higher performance overhead.

24 Jakarta Apache JMeter, http://jakarta.apache.org/jmeter/index.html
25 Mercury Loadrunner, http://mercury.com
26 http://profiler.netbeans.org/docs/help/5.5/custom instrumetation
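The statistical idea behind sampling based profiling can be sketched with the standard JDK API alone. This is not how production profilers are implemented (they typically hook into the JVM itself); the SamplingSketch class name, the sampling parameters, and the choice of attributing each sample to the topmost stack frame are all illustrative assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of sampling-based profiling: periodically sample the
// stacks of all live threads and count which method is on top. Real
// sampling profilers work through JVM-level interfaces; this sketch
// only illustrates the statistical idea using the standard JDK API.
public class SamplingSketch {

    public static Map<String, Integer> sample(int samples, long intervalMillis) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i < samples; i++) {
            for (StackTraceElement[] stack : Thread.getAllStackTraces().values()) {
                if (stack.length > 0) {
                    // Attribute the sample to the currently executing method.
                    String top = stack[0].getClassName() + "." + stack[0].getMethodName();
                    counts.merge(top, 1, Integer::sum);
                }
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // A busy worker thread for the sampler to observe.
        Thread worker = new Thread(() -> {
            double x = 0;
            while (!Thread.currentThread().isInterrupted()) {
                x += Math.sqrt(x + 1); // keep the CPU busy
            }
        });
        worker.setDaemon(true);
        worker.start();

        Map<String, Integer> counts = sample(20, 5);
        worker.interrupt();
        // Methods that run longest accumulate the most samples; the counts
        // approximate where time is spent without recording every event.
        System.out.println("distinct top frames sampled: " + counts.size());
    }
}
```

The trade-off discussed above is visible here: the sampler never sees events between samples (imprecision), but it also never pays the cost of intercepting every method invocation (low overhead).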
2.8.2.1 Recording Information

Regardless of the profiling approach, however, information must be recorded by the profiling tool. Performance metrics can be recorded at different levels of granularity. At the lowest level, hardware counters can be utilised to obtain performance metrics from the underlying hardware on which the program executes [8] [10] [156] [88]. Hardware counters can be utilised to record events such as instructions executed, cycles executed, pipeline stalls, cache misses, etc. One of the main advantages of using hardware counters is that an application can be profiled without the need to modify or instrument it. Also, the overhead associated with using hardware counters for profiling is generally quite low [10]. A disadvantage of hardware counters is that they rely on platform specific features and thus they are generally not portable across different hardware. Another issue with this type of profiling is that the information may be too low level for higher level program analysis. Information such as virtual memory management requests, and signals caused by segmentation violations, can be obtained at the operating system (OS) level [88]. OS level information can be recorded by system level tools 27 or libraries 28. In situations where hardware counters or OS level information is unavailable, or the information they produce is undesirable, information can be obtained at a higher level. For today's enterprise Java applications such information can be recorded at a number of different levels, i.e. at the JVM level, the middleware level or the application level. JVM level information is generally recorded by either instrumenting the JVM or by using an agent-based approach that requests notification of events from the virtual machine. The former very often requires access to the JVM source code such that it can be modified to record the information required [12] [33].
A disadvantage of this approach is that it requires an understanding of the complex JVM internals. Also, this approach generally ties the user to a particular JVM and is thus not portable. One of the main advantages of instrumenting the JVM is that access to JVM level information is not restricted, as it is with the agent-based approaches. Agent based approaches have been made popular through standard interfaces that allow a profiler agent to request performance related information from a running JVM. The Java Virtual Machine Profiler Interface (JVMPI) [169] is an example of such an interface (see figure 2.6). The JVMPI is a two-way function call interface between the JVM and an in-process profiler agent. The profiler agent is responsible for communication between the JVM and the profiler front end. The profiler agent can register with the JVM to be notified when particular events occur and upon notification can call back into the JVM to obtain additional information.

27 Performance Monitoring Tools for Linux, David Gavin, January 1998, http://www.linuxjournal.com/article/2396
28 Windows Management Instrumentation, http://www.microsoft.com/whdc/system/pnppwr/wmi/default.mspx

For example, a notification may be received when a method is entered (or exited) and a call back may be made to
obtain the current stack trace at this point.

Figure 2.6: JVMPI Architecture

The main advantage of standard interfaces is that they are implemented by different JVM vendors. While the JVMPI was an experimental interface for Java 1.2, it was implemented by most JVM vendors and effectively became standard. Thus profilers built using the JVMPI are portable across different JVM implementations. A disadvantage of standard interfaces is that they are fixed interfaces and, as such, can only enable predefined types of profiling [135] or event notifications. Another major issue with the JVMPI in particular was that when using a JVMPI agent the JVM could not run at full speed and was required to run in a debugging mode. As such this profiling approach was not generally suitable for production systems. Another major drawback of the JVMPI approach was that notifications could not be tailored to profile selectively. If, for example, the profiler agent requested to be notified on method entry events, all method entry events would be reported to the agent. This led to performance tools with high overhead. The Java Virtual Machine Tools Interface (JVMTI) 29 will replace the JVMPI in Java 1.6 (the JVMPI is currently available in Java 1.5). While at an architectural level the JVMTI looks similar to the JVMPI (i.e. it also consists of call back functions and a profiler agent) it is quite different and improves upon many of the limitations of the JVMPI. Firstly, it allows the JVM to run at full speed and does not require it to run in debug mode. It also promotes the use of bytecode instrumentation for many of the event based capabilities of the JVMPI. Using bytecode instrumentation one can be more selective when profiling the application and can instrument only the parts of the application that require analysis. This avoids the "all or nothing" approach of the JVMPI and thus reduces the profiler overhead.
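To illustrate the kind of data selective instrumentation yields, the following hand-written sketch records per-method timing. A bytecode instrumentation tool would inject the enter()/exit() calls into the selected methods automatically; here they are written out manually, and the class name, method names and the single "instrumented" method are purely illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of what selective instrumentation records. A bytecode
// instrumentation tool would inject the enter()/exit() calls into the
// chosen methods automatically; here they are written by hand to show
// the per-method timing data such instrumentation produces.
public class InstrumentationSketch {
    private static final Map<String, Long> totalNanos = new HashMap<>();

    static long enter() { return System.nanoTime(); }

    static void exit(String method, long start) {
        totalNanos.merge(method, System.nanoTime() - start, Long::sum);
    }

    static long recordedNanos(String method) {
        return totalNanos.getOrDefault(method, 0L);
    }

    // Only this method is "instrumented"; all others run untouched, which
    // is the selectivity that the all-or-nothing JVMPI notifications lacked.
    static long instrumentedWork(int n) {
        long start = enter();
        long sum = 0;
        for (int i = 0; i < n; i++) sum += i;
        exit("instrumentedWork", start);
        return sum;
    }

    public static void main(String[] args) {
        instrumentedWork(1_000_000);
        System.out.println("nanos recorded for instrumentedWork: "
                + recordedNanos("instrumentedWork"));
    }
}
```

The overhead argument follows directly: only the methods that carry enter()/exit() pay a cost, so the fewer methods selected for instrumentation, the lower the perturbation of the measured system.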
In fact, the JVMTI allows for dynamic bytecode instrumentation 30, which means that the application can be instrumented as it runs.

29 The Java Virtual Machine Tools Interface, http://java.sun.com/j2se/1.5.0/docs/guide/jvmti/jvmti.html
30 Ian Formanek and Gregg Sporar, Dynamic Bytecode Instrumentation: A New Way to Profile Java Applications, December 15, 2005, http://www.ddj.com/dept/java/184406433

An issue that remains, however, with both the JVMPI and the JVMTI is that they are native interfaces and, while the profiling agents (which must be written in native code) are portable across different JVMs,
they are not portable across different platforms. The java.lang.instrument interface 31 is another standard interface (as of Java 1.5) which allows for the interception of the JVM classloading process through a non-native agent. Since the agent is non-native it is portable across different platforms. The java.lang.instrument interface allows the agent to monitor the classloading process and to instrument the classes such that they can call back into the agent libraries. Recording performance information for Java applications at the middleware level can also be achieved through standard mechanisms. Java Management Extensions (JMX) technology 32 is a standard technology that allows for the management of Java resources through so called MBeans. MBeans, also known as managed beans, are Java objects that are used to represent and manage JMX resources. A JMX resource can be any application, device or Java object. In order to manage an MBean it must be registered with a JMX agent. JMX agents directly control registered MBeans and make them available to remote management applications. While the JMX technology can potentially be used to manage a wide range of different resources it has been heavily used, in particular, to manage the resources of JEE application servers. In fact, according to the JEE Management specification (Java Specification Request 77 33), application servers are required to expose this data through the JMX technology. Profilers built using JMX can collect data on the state of the different system resources (e.g. object pool sizes, thread queues, database connectivity information) and, because JMX is standard, they are portable across the different application server implementations. Non-standard hooks or mechanisms have also been used to collect information at the middleware level. Often middleware vendors provide these mechanisms to enhance the capabilities of their products.
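The query pattern used by JMX-based profilers can be sketched against the JVM's own platform MBean server, which is available in any running JVM. This is an assumption-level illustration: a real JEE monitoring tool would connect to the application server's MBean server (typically over a remote JMX connector) and read server-specific MBeans, rather than the standard java.lang platform MBeans used here.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Sketch of the access pattern a JMX-based monitoring tool uses: query an
// MBean server for the attributes of registered MBeans. A JEE profiler
// would read the application server's MBeans (pool sizes, thread queues,
// connection statistics); this self-contained version reads the JVM's own
// platform MBeans, which every running JVM registers.
public class JmxQuerySketch {

    // Look up a single MBean attribute, returning null on any failure.
    static Object readAttribute(String objectName, String attribute) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            return server.getAttribute(new ObjectName(objectName), attribute);
        } catch (Exception e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // The java.lang domain holds the standard platform MBeans.
        System.out.println("VM: "
                + readAttribute("java.lang:type=Runtime", "VmName"));
        System.out.println("Heap: "
                + readAttribute("java.lang:type=Memory", "HeapMemoryUsage"));
        // A remote management client would perform the same calls over a
        // JMX connector instead of the in-process platform MBean server.
    }
}
```

Because the attribute names and object names are part of the standard, a tool written this way works unchanged against any compliant MBean server, which is the portability argument made above.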
For example, IBM provides non-standard features for the Websphere application server in the form of the Performance Monitoring Infrastructure (PMI). PMI is available for the Websphere application server and allows for the collection of performance information on the server resources. The information can be exposed to performance profiling tools through a number of different interfaces 34. The main issue with non-standard features that allow for the collection of performance information is that they are not portable across different vendors' implementations of the middleware, and thus can result in vendor lock-in. Where the information required is not available through standard or non-standard features, the middleware itself can be the subject of instrumentation. This can be achieved by manually modifying the source code if it is available [45] [105]. However

31 J2SE 5.0 in a Nutshell, Calvin Austin, May 2004, http://java.sun.com/developer/technicalArticles/releases/j2se15
32 Java Management Extensions Technology, http://java.sun.com/javase/technologies/core/mntr-mgmt/javamanagement/
33 Java Specification Request 77, J2EE Management, http://www.jcp.org/en/jsr/detail?id=77
34 Srini Rangaswamy, Ruth Willenborg and Wenjian Qiao, IBM WebSphere Developer Technical Journal: Writing a Performance Monitoring Tool Using WebSphere Application Server's Performance Monitoring Infrastructure API, 13 Feb 2002, http://www.ibm.com/developerworks/websphere/techjournal/0202_rangaswamy/rangaswamy.html
