The Component Balancer:
Optimization of Component-Based Applications
Jim Fontana, Viraj Byakod
Unisys Corporation
Jim.Fontana@unisys.com, Viraj.Byakod@unisys.com
Abstract
The Component Balancer establishes and maintains
response time goals for selected business logic contained
in methods in component-based applications without the
need to modify application code. The premise for this
optimization is that meaningful business logic in
component methods can be driven to response time goals
by controlling the calling rate of other methods, based on
analysis of the application workload. For example, if two
methods are accessing the same database, one method’s
response time can be improved if the other method is
selected to withstand longer delays. Other benefits of the
Component Balancer include business-level control and
optimization and, for heavily loaded systems, smoother
resource utilization and increased scalability.
1. Introduction
The Component Balancer establishes and maintains
response time goals on business logic contained in
component methods. As the load on the system varies, the
Component Balancer attempts to keep methods with
response time goals at a specified target performance level
at the expense of lower priority methods.
Additional benefits of the Component Balancer are
business-level control and optimization and, for
overloaded systems, smoother resource utilization and
increased scalability. The case studies below expand on
these benefits.
The Component Balancer has two phases of operation: the
analysis phase provides optimization recommendations
and the optimization phase, using these recommendations,
self-tunes under variable load. Additionally, the
Component Balancer works across machines, applications
and application servers and requires no changes to the
application.
A key capability is the ability to wrap component methods
with “conditioning” code. The inserted conditioning code
is called a conditioner. Supported components, such as
COM+, are called managed types. Conditioners and
managed types are plug-ins to a framework.
Conditioners can be added and removed while an
application is running. No component source code is
needed and no recompilation or reconfiguration of the
components is required. Multiple conditioners can be
applied to a method. In addition to executing pre- and post-processing code, conditioners have access to the
parameters of a method, can capture and throw
exceptions, can bypass the method execution and can
abort lower priority conditioners. The overhead is minimal: 35 microseconds per method call, plus the cost of the conditioners themselves, on a 700 MHz system.
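The paper gives no conditioner code; as a rough illustration of the wrapping idea only, here is a minimal Python sketch in which the class names, hook signatures and the get_delay_ms callback are all invented, not part of the product:

```python
import time

class Conditioner:
    """Hypothetical base class: pre/post hooks around a wrapped method call."""
    def before(self, method_name, args, kwargs):
        """Return False to bypass the wrapped method entirely."""
        return True

    def after(self, method_name, result, elapsed):
        return result

class DelayConditioner(Conditioner):
    """Sleeps for the currently configured delay before the call runs."""
    def __init__(self, get_delay_ms):
        self.get_delay_ms = get_delay_ms   # callable returning the pushed-out delay

    def before(self, method_name, args, kwargs):
        time.sleep(self.get_delay_ms() / 1000.0)
        return True

def condition(method, conditioners):
    """Wrap `method` so each conditioner's hooks run around every call."""
    def wrapped(*args, **kwargs):
        for c in conditioners:
            if not c.before(method.__name__, args, kwargs):
                return None                 # method execution bypassed
        start = time.perf_counter()
        result = method(*args, **kwargs)
        elapsed = time.perf_counter() - start
        for c in conditioners:
            result = c.after(method.__name__, result, elapsed)
        return result
    return wrapped
```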
2. Analysis and Optimization Details
The details of the analysis and optimization algorithms are
given in this section. One of the design goals was to
expose as few “moving parts” as possible to the end user –
keeping the parameters that control the optimization
process hidden and letting the self-tuning do the work. As
we gain a better understanding of the optimization
process, some of these parameters may be exposed for an
advanced user.
2.1 Analysis Detail
Analysis occurs over a time interval that is specified by
the user. The overall analysis time is divided into a
number of analysis periods. Each period is treated as an
independent set of statistics representing the system as
observed during that period. Analysis results during each
period are accumulated into an analysis report.
During each analysis period, pairwise calculations are
done on methods to determine which methods are affected
by other methods. For example, deploying analysis
conditioning for methods A, B and C results in pairs AB,
BA, AC, CA, BC and CB being analyzed.
A statistical Analysis of Variance is then applied for each
of the pairs to determine if the average response times
differ depending on the coincident execution of the
methods. For example, when pair AB is analyzed, the
average response time for all calls to method A during the
sample period when it ran “by itself” is calculated, that is,
without method B running at the same time. Then the
average response time is calculated for all calls to method
A during the sample period when it ran overlapped with
method B. Method A must start after method B to be
considered as overlapped. This is based on the
assumption that the first method “in” will get any
contended resource.
The individual measurements of the response times are
used to calculate the F statistic. It is used to determine if
the mean values of two random variables are statistically
different, assuming a level of significance. The statistic is
useful because it applies a proven technique to determine
equality in an area of uncertainty. For example, if the two
random variables each have a high level of variance, the
hypothesis that their means are equal will be a function of
the difference between their statistical means and also the
amount of variance. We can accept or reject the
hypothesis that the random variables have equal means at
a given level of “significance.”
Therefore, given a level of significance we can then
determine if the average response time for method A
during the sampling interval is significantly changed when
running coincidentally with method B.
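As a concrete illustration (not the authors' implementation), the F statistic for the two groups of response-time samples for method A, split into calls that ran alone and calls that started while a call to B was in flight, is a one-way ANOVA with two groups; the sample data below is invented:

```python
def f_statistic(group_a, group_b):
    """One-way ANOVA F statistic for two samples of response times (ms)."""
    n1, n2 = len(group_a), len(group_b)
    m1 = sum(group_a) / n1
    m2 = sum(group_b) / n2
    grand = (sum(group_a) + sum(group_b)) / (n1 + n2)
    ss_between = n1 * (m1 - grand) ** 2 + n2 * (m2 - grand) ** 2     # df = 1
    ss_within = (sum((x - m1) ** 2 for x in group_a)
                 + sum((x - m2) ** 2 for x in group_b))              # df = n1 + n2 - 2
    return (ss_between / 1.0) / (ss_within / (n1 + n2 - 2))

# Response times for method A when it ran alone vs. overlapped with B.
a_alone      = [48, 52, 50, 47, 55, 51, 49, 53, 50, 52, 48]
a_overlapped = [70, 66, 73, 69, 75, 68, 72, 71, 74, 67, 70]
f = f_statistic(a_alone, a_overlapped)
# Compare f against the critical F value for the chosen level of significance
# (67% in the paper) with (1, n1 + n2 - 2) degrees of freedom.
print(f)
```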
For COM+, methods that invoke other methods in a
calling sequence are identified and excluded from pairing. If A calls B (or B is anywhere in A's calling chain), B affects the performance of A, but delaying B will not improve the performance of A.
Since the method overlap algorithm is not symmetrical,
the F-value is also calculated for method B to see if it is
being affected by method A. The total number of calls to A, the number of calls to A alone, the number of calls to A while overlapped with B, and the F-value for each method pair are stored in the raw analysis report during each period.
The final analysis report is created from the raw analysis
report as a result of two steps. The first step generates an
intermediate F statistic for each method pair and
accumulates the number of statistically significant method
pairs. From experimentation, it was found that a 67%
level of significance could yield improvements in the
system. Sample sizes for non-overlapped and overlapped
operations must each be greater than 10. Therefore, if the
average response times are statistically different, given a
significance level of 67%, we have found that there is a
possibility that the performance of one method can be
improved by delaying the other.
The level of significance has the single biggest effect on
the analysis results. Setting the value too high will
eliminate potentially valid results while giving more
credence to the results that remain significant. A value that is too
low will most likely admit results that are not
valid or that yield little benefit.
The second step involves processing the method pairs.
Significant differences for methods affected by other
methods are considered positive and are assigned a score
of +1. Significant differences for methods affecting other
methods are considered negative and assigned a score of
-1. A number of independent time samples are used in
the analysis process. During the second step, the scores
from each of these time samples are accumulated. If the
resulting value is positive, the method is considered a
candidate for optimization. If the resulting value is
negative then the method is considered a candidate for
delay.
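A minimal sketch of this scoring step; the tuple layout used for the per-period pair results is invented purely for illustration:

```python
from collections import defaultdict

def score_methods(period_results):
    """period_results: one list per analysis period, each containing
    (affected_method, affecting_method, significant) tuples from the
    pairwise F tests.  Returns the accumulated score per method."""
    scores = defaultdict(int)
    for period in period_results:
        for affected, affecting, significant in period:
            if significant:
                scores[affected] += 1      # being affected scores +1
                scores[affecting] -= 1     # affecting another method scores -1
    return scores

scores = score_methods([
    [("checkBalance", "deposit", True), ("withdraw", "deposit", True)],
    [("checkBalance", "deposit", True)],
])
optimize_candidates = [m for m, s in scores.items() if s > 0]
delay_candidates    = [m for m, s in scores.items() if s < 0]
```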
A simple COM+ application was developed to test and
verify analysis results. The application has one
component with four methods: getSpecialOffers, deposit,
withdraw and checkBalance. The application internals
were as follows: deposit directly affects checkBalance; withdraw calls checkBalance; and getSpecialOffers runs without interfering with, or being interfered with by, other methods.
A 25-minute run was made on a 700 MHz laptop, with eight users per method and each method called at a random interval between 0 and 900 milliseconds.
The results are given in Figure 1 below.
Figure 1. Analysis Report
The top three lines in the report show the final scores for
each method, while the bottom four lines show
intermediate scores for the various method pairs. As
expected, deposit (-16 = -12 –6 +1 +1) is shown as the
only potential delay, with checkBalance (11 = 12 –1) a
good potential for optimization, followed by withdraw (5
= 6 –1). Relative numbers like this are expected as
withdraw is indirectly affected by deposit.
The deployment algorithm starts with the largest positive
and largest negative numbers and moves inward,
deploying optimization and delay conditioning, until the
list is exhausted or until an internal limit is reached on the
number of deployed conditioners.
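A rough sketch of that pairing order, assuming the final scores are available as a method-to-score mapping and using a hypothetical max_conditioners for the internal limit:

```python
def plan_deployment(scores, max_conditioners):
    """Pair the highest positive (optimize) and lowest negative (delay)
    scores first, moving inward until either list is exhausted or the
    conditioner limit is reached."""
    optimize = sorted((m for m, s in scores.items() if s > 0),
                      key=lambda m: scores[m], reverse=True)
    delay = sorted((m for m, s in scores.items() if s < 0),
                   key=lambda m: scores[m])
    plan = []
    for opt, dly in zip(optimize, delay):
        if len(plan) + 2 > max_conditioners:
            break
        plan.append(("optimize", opt))
        plan.append(("delay", dly))
    return plan

plan = plan_deployment({"checkBalance": 11, "withdraw": 5, "deposit": -16}, 4)
# [("optimize", "checkBalance"), ("delay", "deposit")] plus the next pair inward
```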
A planned enhancement is to consider stopping if there is
a large relative gap in the totals of methods in the final
report – positive or negative. This would mean that the
highest impact conditioning is deployed. Another
improvement that is being considered is to take into
account methods that affect each other – that is, A affects
B and B affects A. In this situation, depending on the
distribution of the workload, their final numbers may
cancel each other out and make them indistinguishable
from methods that just have low final numbers.
Another point to note is that a method is not analyzed
against itself in the Component Balancer analysis. For
other purposes, this analysis could be useful.
2.2 Optimization Detail
Response time data from methods being optimized is
processed in real time as it comes in from the target
machines and placed in a working set. Optimization
calculations are done every few seconds for each group in
the working set.
A delay increment is calculated using fuzzy logic for each
method targeted for optimization in the group. The delay
increment can be positive or negative. The largest delay
increment for a method in the group is chosen and added
to the total delay. The total delay is not allowed to exceed
a maximum of 300 milliseconds or go below zero. The
delay conditioner is given a user selectable maximum
delay factor between 1 and 3 that it uses to calculate the
delay it actually uses on the method – so the maximum
any method can be delayed is 900 milliseconds.
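A small sketch of the clamping arithmetic described above; treating the user-selectable factor as a simple multiplier on the pushed-out delay is our reading (it is consistent with the 300 ms x 3 = 900 ms maximum), not a detail the paper spells out:

```python
MAX_BASE_DELAY_MS = 300.0        # total delay is clamped to [0, 300] ms

def update_total_delay(total_delay_ms, increments, delay_factor):
    """Apply the largest fuzzy-logic increment, clamp the running total,
    then scale by the user-selectable factor (1..3, so at most 900 ms)."""
    total_delay_ms += max(increments)
    total_delay_ms = min(max(total_delay_ms, 0.0), MAX_BASE_DELAY_MS)
    applied_delay_ms = total_delay_ms * delay_factor
    return total_delay_ms, applied_delay_ms

total, applied = update_total_delay(120.0, [-5.0, 18.0, 7.5], delay_factor=3)
# total == 138.0 ms of base delay; applied == 414.0 ms actually imposed
```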
The calculated delay is pushed out to the machines where
the delayed methods are located by writing a registry key
entry on those machines.
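The paper does not name the registry location or value format; purely as a hypothetical sketch of such a push on Windows (the key path and value name below are invented, and pushing to a remote machine would additionally use winreg.ConnectRegistry):

```python
import winreg

def push_delay(method_name, delay_ms):
    """Write the computed delay where a delay conditioner could read it.
    The key path and value name are hypothetical, not the product's."""
    key_path = r"SOFTWARE\ExampleVendor\ComponentBalancer\Delays"
    with winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, key_path, 0,
                            winreg.KEY_SET_VALUE) as key:
        winreg.SetValueEx(key, method_name, 0, winreg.REG_DWORD, int(delay_ms))
```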
If no activity has occurred in a working set for 10 seconds (that is, no conditioned methods were called), the delay is set to zero. If one hour elapses with no entries for a method, it is removed from the working set.
The fuzzy logic takes three parameters – the current
average response time, the current average calls per
second (cps) and the previous average calls per second.
The two average calls per second values are used to
determine load trending. All parameters are normalized
against the current response time and cps ranges before
they are passed to the fuzzy logic.
The parameters are “fuzzified” – response time is
converted to fast, ok and slow and the calls per second are
converted to low, medium and high. The fuzzy logic
executes a set of rules given these parameters. For
example, if the response time is ok and the current cps are
high and the previous cps are low, then the fuzzy logic
outputs a positive delay increment. This means that, since
the load has increased, a response time increase is
anticipated and a bigger delay most likely is needed.
The delay value is then “defuzzified”, that is, converted
back to a number in the accepted range and returned to the
optimizer. The delay increment moves over its allowed
range based on how “ok” the response time is, how “high”
the current cps are and how “low” the previous cps were.
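The paper does not list the membership functions or the full rule set. As a toy illustration of the fuzzify/apply-rules/defuzzify cycle, the sketch below uses made-up triangular memberships, one rule taken from the example above, an invented easing rule, and a defuzzification scheme of our own choosing:

```python
def tri(x, lo, mid, hi):
    """Triangular membership over normalized [0, 1] inputs."""
    if x <= lo or x >= hi:
        return 0.0
    return (x - lo) / (mid - lo) if x < mid else (hi - x) / (hi - mid)

def fuzzify_rt(rt):    # response time -> fast / ok / slow memberships
    return {"fast": tri(rt, -0.5, 0.0, 0.5),
            "ok":   tri(rt,  0.0, 0.5, 1.0),
            "slow": tri(rt,  0.5, 1.0, 1.5)}

def fuzzify_cps(cps):  # calls per second -> low / medium / high memberships
    return {"low":    tri(cps, -0.5, 0.0, 0.5),
            "medium": tri(cps,  0.0, 0.5, 1.0),
            "high":   tri(cps,  0.5, 1.0, 1.5)}

def delay_increment(rt, cur_cps, prev_cps, max_step_ms=20.0):
    """rt, cur_cps and prev_cps are already normalized to [0, 1]."""
    r, c, p = fuzzify_rt(rt), fuzzify_cps(cur_cps), fuzzify_cps(prev_cps)
    # Rule from the text: response time ok, load rising -> push the delay up.
    push = min(r["ok"], c["high"], p["low"])
    # Invented easing rule: response time slow, or load falling -> back off.
    ease = max(r["slow"], min(c["low"], p["high"]))
    if push + ease == 0.0:
        return 0.0
    # Defuzzify onto [-max_step_ms, +max_step_ms].
    return max_step_ms * (push - ease) / (push + ease)
```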
The target response time is the current minimum response
time if the maximum goal is chosen. No optimization is
done if no goal is chosen. For any goal in-between, the
target response time is in a relative range between the
current lower and upper response times. For example, if
the goal is 75% and the response time range is between 50
to 150 milliseconds, then the response time goal will be
75 milliseconds.
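Our reading of that interpolation, which reproduces the worked example (goal 75%, range 50 to 150 milliseconds, target 75 milliseconds), in a short sketch:

```python
def target_response_time(goal, rt_min, rt_max):
    """goal = 1.0 aims for the current minimum response time; smaller goals
    move the target toward the current maximum."""
    return rt_max - goal * (rt_max - rt_min)

assert target_response_time(0.75, 50.0, 150.0) == 75.0
```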
The upper response time value and lower cps value are
reset every minute. Doing this forces the optimization to
squeeze towards an optimum target response time. The
lower response time value ratchets down and indicates the
current best value that the method can run at. The upper
cps value ratchets up and indicates the current best
throughput for the method.
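A sketch of that bookkeeping with invented field names; only the one-minute reset cadence and the ratcheting directions come from the text:

```python
import time

class MethodWindow:
    """Tracks the ratcheting response-time and throughput bounds per method."""
    def __init__(self):
        self.rt_min = float("inf")   # ratchets down: best observed response time
        self.rt_max = 0.0            # reset every minute
        self.cps_min = float("inf")  # reset every minute
        self.cps_max = 0.0           # ratchets up: best observed throughput
        self._last_reset = time.time()

    def observe(self, rt_ms, cps):
        if time.time() - self._last_reset >= 60.0:
            self.rt_max, self.cps_min = 0.0, float("inf")
            self._last_reset = time.time()
        self.rt_min = min(self.rt_min, rt_ms)
        self.rt_max = max(self.rt_max, rt_ms)
        self.cps_min = min(self.cps_min, cps)
        self.cps_max = max(self.cps_max, cps)
```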
The cost of Component Balancer conditioning in COM+
is approximately 900 microseconds for each analyzed
method, 200 microseconds for each delayed method and
300 microseconds for each optimized method.
3. Case Studies
In all case studies, optimization was based on analysis
results.
3.1 ISD Benchmark
The ISD (Inventory, Sales and Deliver) benchmark,
developed at Unisys, is loosely based on TPC-C and is
used for MS SQL Server and Oracle database tuning.
This benchmark uses an in-process COM+ component
with 7 methods to drive the database and comes with a
driver script. We ran the benchmark with a 2-processor 500 MHz database server running SQL Server and a 2-processor 1 GHz middle-tier machine.
We did an analysis run and deployed the recommended
optimizations with a maximum optimization goal. Only one method, ItemReport, was running at a high response time; all other methods were well below 500 milliseconds.
With one method delayed, the ItemReport average response time dropped from 5.38 to 2.06 seconds (a 161% improvement) with a 5% overall loss in throughput. With two methods delayed, the ItemReport response time dropped to 1.06 seconds (a 408% improvement) with an overall loss in throughput of 12%. The CPU utilization on the database server dropped from 100% to 93% in the one-method-delayed case and to 85% in the two-methods-delayed case. This optimization
trades throughput for improved response time. The other,
non-delayed methods were also improved by the
optimization.
By allocating one more processor to the database server,
the ItemReport response time went back to its “normal”
sub-second value. The business benefit is that the extra processor on a consolidated database server can be reassigned to some other purpose, with the Component Balancer managing ISD performance during peak loads.
3.2 Nile Bookstore
A case study was done using the Doculabs-developed Nile
benchmark, which simulates an online bookstore web
application.
Measurements were done using Microsoft’s Web Application Stress tool to drive a 4-processor, 700 MHz system running IIS 5 with the Nile C++ COM+ components (2 components with 14 total methods) and a back-end MS SQL Server database on a 4-processor, 548 MHz system. We drove the Nile benchmark with scripts
that had a 4:1 ratio of browse versus browse and buy
transactions. The browsing workload was equally split
between short and long browses and the buying workload
was equally split between users that buy 1 book and buy 5
books. Results from a 200 user run are given in Figure 2
below.
Our optimization, being downstream from IIS, which is the bottleneck in this configuration, does not significantly change the performance profile: a 16% improvement in logon response time and 6% more throughput. However, buyers now have priority over browsers. This shows that we can control where time is spent in the application in a meaningful fashion.
Figure 2. Nile Response Times By Script (average time per script in seconds for Short Browse, Long Browse, Buy 1 Book and Buy 5 Books, unoptimized versus optimized)
The normal configuration for the Nile would be a load-
balanced middle-tier server farm that puts a larger load
onto MS SQL Server. To simulate this, we moved the
backend database to an underpowered server (a one-processor 400 MHz system) and injected extra load to simulate a “medium” and “heavy” MS SQL Server load.
Now, response times are affected by MS SQL Server
performance and optimization can have a larger effect.
Results in Figure 3 below are for 100 users.
Figure 3. Optimization Results with Database Bottleneck (welcome screen average response time in milliseconds under medium and heavy database load, for unoptimized, optimized and optimized+ runs)
The “Optimized+” runs added a custom conditioner written to cache the results of the GetSpecials method, which returns the list of bookstore specials and is always displayed on the welcome screen after logon. Both
optimizations improve the response times of the first two
screens the user sees without affecting the overall
throughput (logon screen response times are not shown).
Deploying just the GetSpecials caching conditioner by itself resulted in a drop in the number of transactions the database had to perform, as seen in Figure 4 below.
Figure 4. Caching GetSpecials Custom Conditioner (the annotation “Cache GetSpecials conditioner deployed here” marks when the conditioner was deployed during the run)
We also did testing that overloaded the database. This caused the database to thrash and behave erratically. The Purchase method randomly spiked up to one
second and its call rate became quite variable. With
optimization, we were able to keep the Purchase response
time at 60 milliseconds with a steady call rate, drop the
CPU utilization on the database server from 100% to 93%
and increase the number of database transactions per
second by 26%. In this scenario, the Component Balancer
smoothed response times and scaled database
performance during an unexpected surge in load.
4. Conclusion
The Component Balancer improves response times of
high priority methods when under load. Additionally, it
can change and improve the business-level profile of an
application. The Component Balancer is available as part
of the Application Sentinel for Resource Management
product from Unisys Corporation.
Acknowledgments
We would like to acknowledge the following people:
Russ Cole and Praveen Nagarajan – initial prototyping
and feasibility
Ron Neubauer and Paul Koerber – automation
Mark Tadman and Peter Partch – COM+ conditioning
Jack Chang and Joyce Liu – Weblogic conditioning
Bob Walker – Nile investigation and set up
Doug Tolbert – statistical consultation
Michael Salsburg – detailed review and editorial advice