Pinpointing the Subsystems Responsible for
Performance Deviations In a Load Test
1
Haroon Malik, Bram Adams & Ahmed E. Hassan
Software Analysis and Intelligence Lab (SAIL)
Queen’s University, Kingston, Canada
Large scale systems need to satisfy
performance constraints
2
TIME SPENT…..
3
ANALYSTS SPEND CONSIDERABLE TIME
DEALING WITH PERFORMANCE BUGS
4
CURRENT PRACTICE
1. Environment Setup   2. Load test execution   3. Load test analysis   4. Report generation
5
CURRENT PRACTICE
1. Environment Setup   2. Load test execution   3. Load test analysis   4. Report generation
6
2. LOAD TEST EXECUTION
MONITORING TOOL
LOAD GENERATOR-1
LOAD GENERATOR-2
SYSTEM PERFORMANCE REPOSITORY
7
CURRENT PRACTICE
1. Environment Setup   2. Load test execution   3. Load test analysis   4. Report generation
8
CURRENT PRACTICE
1. Environment Setup   2. Load test execution   3. Load test analysis   4. Report generation
9
LOAD TEST
 PASS
✗ FAIL
10
3. LOAD TEST ANALYSIS
LARGE NUMBER OF PERFORMANCE
COUNTERS
12
LIMITED TIME
13
LIMITED KNOWLEDGE
14
WE CAN HELP ANALYSTS:
Decide whether a performance test passed or failed (CSMR 2010)
Identify the subsystems that violated the performance objectives (COMPSAC 2010)
Pinpoint the subsystems that are the likely cause of a performance violation (ISSRE 2010)
15
AUTOMATED METHODOLOGY TO PINPOINT THE
LIKELY CAUSE OF PERFORMANCE DEVIATIONS
16
METHODOLOGY STEPS
1. Data Preparation
2. Crafting Performance Signatures
3. Identifying Deviations
4. Pinpointing
17
PERFORMANCE COUNTERS
ARE HIGHLY CORRELATED
CPU
DISK (IOPS)
NETWORK
MEMORY
TRANSACTIONS/SEC
18
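The redundancy claimed above is easy to verify on a counter log. A minimal sketch, assuming the counters of one test run have been exported to a CSV file (the file name is hypothetical) with one column per counter:

```python
import pandas as pd

# Hypothetical counter log: one row per sampling interval, one column per counter
counters = pd.read_csv("load_test_counters.csv")

# Pairwise Pearson correlation between all counters
corr = counters.corr()

# Strongly correlated counter pairs (|r| > 0.8), excluding self-correlations;
# each pair appears twice, once per direction
strong = (corr.abs()
              .where(lambda c: c < 1.0)
              .stack()
              .loc[lambda s: s > 0.8]
              .sort_values(ascending=False))
print(strong.head(10))
```

Pairs such as CPU utilization and transactions/sec tend to appear near the top of such a list, which is what motivates the PCA-based reduction in the next step.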
2. CRAFTING PERFORMANCE SIGNATURES
 Principal Component Analysis (PCA)
 Explains most of the counter data with minimal information loss
 Removes the noise in the counter data
 Influential Counters
 Counter Elimination: Norman cut-off criterion
 Counter Ranking
19
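A sketch of how such a signature could be crafted with PCA, assuming scikit-learn is available and using an illustrative loading cut-off in place of the Norman criterion named on the slide:

```python
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def craft_signature(counters: pd.DataFrame, variance_to_keep=0.9, loading_cutoff=0.7):
    """Return the signature counters of one subsystem, ranked by importance.

    counters: one column per performance counter, one row per sampling interval.
    loading_cutoff: illustrative threshold standing in for the Norman cut-off.
    """
    # Standardize so that high-variance counters do not dominate the components
    scaled = StandardScaler().fit_transform(counters)

    # Keep enough components to explain the requested share of the variance
    pca = PCA(n_components=variance_to_keep).fit(scaled)
    loadings = pd.DataFrame(pca.components_.T, index=counters.columns)

    # Importance of a counter: its largest absolute loading on any retained component
    importance = loadings.abs().max(axis=1)

    # Counter elimination + ranking: keep counters above the cut-off, highest first
    return importance[importance >= loading_cutoff].sort_values(ascending=False)
```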
3. IDENTIFYING DEVIATIONS
[Table: per-subsystem signature counters (Commits/Sec, Writes/Sec, CPU Utilization, Database Cache % Hit) compared between the Base-Line and Load Test-1, with a Deviation/Match value per counter (e.g., 0.41, 0, 0.01)]
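One way to realize the comparison sketched in this table, assuming a signature is the counter-importance vector produced by the previous sketch (the paper's exact deviation/match scoring is not reproduced here):

```python
import pandas as pd

def signature_deviation(baseline_sig: pd.Series, test_sig: pd.Series, tolerance: float = 0.1) -> pd.DataFrame:
    """Compare a subsystem's baseline signature against the one from a new load test.

    Both arguments map counter name -> importance (e.g., the output of the
    hypothetical craft_signature above). Returns, per counter, the change in
    importance and whether it still counts as a match under the tolerance.
    """
    counters = baseline_sig.index.union(test_sig.index)
    base = baseline_sig.reindex(counters, fill_value=0.0)
    test = test_sig.reindex(counters, fill_value=0.0)
    deviation = (base - test).abs()
    return pd.DataFrame({
        "baseline": base,
        "load_test": test,
        "deviation": deviation,
        "match": deviation <= tolerance,
    })
```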
PINPOINTING
[Figure: four subsystems SUB-A, SUB-B, SUB-C and SUB-D connected by their average pair-wise counter correlations (0.8, 0.7, 0.9, 0.7)]
4. PINPOINTING
[Same subsystem diagram as above]
Avg. pair-wise correlation during the load test:
SUB   Load
A     0.75
B     0.77
C     0.80
D     0.77
22
4. PINPOINTING
[Same subsystem diagram as above]
Avg. pair-wise correlation, load test vs. baseline:
SUB   Load   Baseline   Dev    %
A     0.75   0.87       0.12   13.0
B     0.77   0.82       0.05   6.01
C     0.80   0.94       0.14   14.8
D     0.77   0.88       0.11   12.5
23
4. PINPOINTING
[Same subsystem diagram as above]
Avg. pair-wise correlation, ranked by deviation (SUB-C is pinpointed):
SUB   Load   Baseline   Dev    %
C     0.80   0.94       0.14   14.8
A     0.75   0.87       0.12   13.0
D     0.77   0.88       0.11   12.5
B     0.77   0.82       0.05   6.01
24
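The three slides above boil down to: compute the average pair-wise correlation among each subsystem's signature counters in the load test, compare it to the baseline value, and rank subsystems by the relative drop. A minimal sketch, assuming the counter logs are already grouped per subsystem; the percentage is taken relative to the baseline correlation:

```python
import pandas as pd

def avg_pairwise_correlation(counters: pd.DataFrame) -> float:
    """Average absolute pair-wise correlation among a subsystem's signature counters."""
    corr = counters.corr().abs()
    n = len(corr)
    # Mean of the off-diagonal entries only (the diagonal contributes n ones)
    return (corr.values.sum() - n) / (n * (n - 1))

def pinpoint(baseline_logs: dict, test_logs: dict) -> pd.DataFrame:
    """Rank subsystems by how much their counter correlation dropped in the load test.

    baseline_logs / test_logs: subsystem name -> DataFrame of its signature counters.
    """
    rows = []
    for sub in baseline_logs:
        base = avg_pairwise_correlation(baseline_logs[sub])
        load = avg_pairwise_correlation(test_logs[sub])
        dev = base - load
        rows.append({"subsystem": sub, "load": load, "baseline": base,
                     "dev": dev, "percent": 100 * dev / base})
    # The subsystem with the largest relative deviation is the pinpointed one
    return pd.DataFrame(rows).sort_values("percent", ascending=False)
```

In the Dell DVD Store experiments that follow, the subsystem put under load or stress is the one with the largest relative drop.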
25
DELL DVD STORE
26
Components of Test Environment:
Load Generators
Database Server
Web Server (A)
Web Server (B)
Web Server (C)
Performance Monitoring Tool
Performance Logs
EXPERIMENT-1: 4X LOAD ON WEB-1
        Base   4-X    Dev    %
Web-1   0.87   0.72   0.15   17.1
Web-2   0.88   0.82   0.05   6.46
Web-3   0.89   0.83   0.06   7.03
DB      0.78   0.73   0.05   6.92

EXPERIMENT-2: CPU STRESS ON WEB-1
        Base   CPU    Dev    %
Web-1   0.87   0.69   0.18   21.1
Web-2   0.88   0.80   0.08   9.29
Web-3   0.89   0.80   0.09   10.6
DB      0.78   0.73   0.05   7.24

EXPERIMENT-3: CPU STRESS ON DB
        Base   CPU    Dev    %
Web-1   0.87   0.83   0.03   4.28
Web-2   0.88   0.83   0.04   5.04
Web-3   0.89   0.84   0.05   6.14
DB      0.78   0.78   0.08   10.4

EXPERIMENT-4: MEMORY STRESS ON WEB-2
        Base   MEM    Dev    %
Web-1   0.87   0.81   0.06   7.42
Web-2   0.88   0.75   0.13   14.9
Web-3   0.89   0.81   0.08   9.49
DB      0.78   0.7    0.087  11.0
27
ENTERPRISE APPLICATION
[Figure: counter importance (0.4 to 1.0) across roughly 162 performance counters for Test-A (base-line) and Tests B, C, D and E]
28
WHY ARE TEST-D & E
DIFFERENT?
29
WHY ARE TEST-D & E DIFFERENT?
 PINPOINTED – Among the 6 subsystems of each test, the web server is the likely cause of the performance deviation in both tests
 Signature counters of the web server notably deviated from the baseline:
Packets Outbound Discarded
Packets Sent/Sec
Message Queue Length
Network problem – no connectivity to the database
Web server under stress
30
LIMITATIONS
 Our methodology can only pinpoint *a single* subsystem as the likely cause of a performance deviation in a load test.
 Our methodology cannot be generalized to other domains, such as network traffic and security monitoring.
31
32
Editor's Notes
  1. At the SAIL lab, we thrive on investigating approaches and creating techniques to support practitioners who are producing, maintaining and evolving large-scale, complex software systems. We aim to develop tools that make the lives of these practitioners more productive, more cheerful and more predictable.
  2. Today's LSS, such as Google, eBay, Facebook and Amazon, are composed of many underlying components and subsystems. These LSS are growing tremendously in size to handle growing traffic, complex services and business-critical functionality. It is very important to periodically measure the performance of such LSS to satisfy the stakeholders' and customers' high demands on system quality, availability and responsiveness.
  3. Load testing is an important weapon in LSS development to uncover functional and performance problems of a system under load. The performance of the LSS is calibrated using a load test before a problem becomes a field or post-deployment problem. Performance problems include an application not responding fast enough, crashing or hanging under heavy load, or not meeting the desired service level agreements (SLAs).
  4. Environment Setup: The first and most important phase of load testing, since the most common load test failures occur due to an improperly set up test environment. The environment setup includes installing the applications and load testing tools on different machines and possibly on different operating systems. Load generators, which emulate the users' interaction with the system, need to be carefully configured to match the real workload in the field. Load Test Execution: This involves starting the components of the system under test, i.e., the required services, hardware resources and tools (load generators and performance monitors). Performance counters are recorded in this step too. Load Test Analysis: This step involves comparing the results of a load test against other load test results or against predefined thresholds as baselines. Unlike functional and unit testing, which result in a pass or failure classification for each test, load testing requires additional quantitative metrics like response time, throughput and hardware resource utilization to summarize results. The performance analyst selects a few of the important performance counters among the thousands collected. Based on experience and domain knowledge, the analyst manually compares the selected counters with those of past runs to look for evidence of performance deviations, for example using plots and correlation tests. Report Generation: Includes filing the performance deviations, if found, based on the personal judgment of an analyst. Mostly the results are verified by an experienced analyst and, depending on the extent of the performance deviation, routed to the team responsible for the affected subsystem (database, application, web server, etc.).
  8. Unfortunately, the current practice of analyzing load tests is costly, time-consuming and error-prone. This is because load test analysis practices have not kept pace with the rapid growth in size and complexity of large enterprise systems. In practice, the dominant tools and techniques to analyze large distributed systems have remained unchanged for over twenty years. Most research has focused on the automatic generation of load testing suites rather than on load test analysis. There are many challenges and limitations associated with the current practice of load test analysis that remain unsolved.
  9. Load tests last from a couple of hours to several days. They generate performance logs that can be terabytes in size. Even logging all counters on a typical machine at 1 Hz generates about 8.6 million values in a single week; a cluster of 12 machines over a week yields 13 TB of performance counter data, assuming a 64-bit representation for each counter value. Analyzing such a large counter log is still a big challenge in load tests.
  10. Performance analysts in LSS have only limited time to reach a diagnosis on performance counter logs and to make the necessary configuration changes. Load testing is usually the last step in an already tight and usually delayed release schedule. Hence, managers are always eager to reduce the time allocated for performance testing.
  11. Error-prone because of the manual process involved in analyzing performance counter data in current practice. It is impossible for an analyst to skim through such a large volume of log data; instead, analysts use a few key performance counters known to them from past practice, performance experts and domain trends as "rules of thumb". With large-scale systems that are continuously evolved by adding new functionality, applying the same rules of thumb can cause performance issues to be missed or misdiagnosed.
  12. Due to these challenges, we believe the current practice of load test analysis is neither effective nor sufficient to uncover performance deviations accurately and within the limited time available.
  13. The performance logs obtained from a load test do not suffice for direct analysis by our methodology. These logs need to be prepared to make them suitable for the statistical techniques our methodology employs. This step takes care of data sanitization (missing and incomplete counter variables) and pre-treatment of the data, such as standardization and scaling, to remove the bias of variance-dependent techniques.
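A minimal sketch of such data preparation, assuming the counter log is available as a pandas DataFrame; the specific sanitization and scaling choices here are illustrative rather than the paper's exact procedure:

```python
import pandas as pd

def prepare_counters(raw: pd.DataFrame) -> pd.DataFrame:
    """Sanitize and pre-treat a performance counter log for statistical analysis.

    raw: one row per sampling interval, one column per performance counter.
    """
    prepared = raw.copy()

    # Sanitization: drop counters that were never collected, fill short gaps
    prepared = prepared.dropna(axis=1, how="all")
    prepared = prepared.interpolate(limit=3).ffill().bfill()

    # Drop constant counters: they carry no variance and break standardization
    prepared = prepared.loc[:, prepared.std() > 0]

    # Pre-treatment: z-score standardization so variance-dependent techniques
    # (such as PCA) are not biased towards counters with large absolute values
    return (prepared - prepared.mean()) / prepared.std()
```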