Pinpointing the Subsystems Responsible for
Performance Deviations In a Load Test
1
Haroon Malik, Bram Adams & Ahmed E. Hassan
Software Analysis and Intelligence Lab (SAIL)
Queen’s University, Kingston, Canada
Large scale systems need to satisfy
performance constraints
2
TIME SPENT…..
3
ANALYSTS SPEND CONSIDERABLE TIME
DEALING WITH PERFORMANCE BUGS
4
CURRENT PRACTICE
1. Environment Setup   2. Load test execution   3. Load test analysis   4. Report generation
5
CURRENT PRACTICE
1. Environment Setup   2. Load test execution   3. Load test analysis   4. Report generation
6
2. LOAD TEST EXECUTION
MONITORING TOOL
LOAD GENERATOR-1
LOAD GENERATOR-2
SYSTEM PERFORMANCE REPOSITORY
7
CURRENT PRACTICE
1. Environment Setup   2. Load test execution   3. Load test analysis   4. Report generation
8
CURRENT PRACTICE
1. Environment Setup   2. Load test execution   3. Load test analysis   4. Report generation
9
LOAD TEST
 PASS
✗ FAIL
10
3. LOAD TEST ANALYSIS
LARGE NUMBER OF PERFORMANCE
COUNTERS
12
LIMITED TIME
13
LIMITED KNOWLEDGE
14
WE CAN HELP ANALYSTS:
Decide whether a performance test passed or failed (CSMR 2010)
Identify the subsystems that violated the performance objectives (COMPSAC 2010)
Pinpoint the subsystems that are the likely cause of a performance violation (ISSRE 2010)
15
AUTOMATED METHODOLOGY TO PINPOINT THE
LIKELY CAUSE OF PERFORMANCE DEVIATIONS
16
METHODOLOGY STEPS
1. Data Preparation
2. Crafting Performance Signatures
3. Identifying Deviations
4. Pinpointing
17
PERFORMANCE COUNTERS
ARE HIGHLY CORRELATED
CPU
DISK (IOPS)
NETWORK
MEMORY
TRANSACTIONS/SEC
18
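The redundancy claimed above is easy to verify on a counter log. A minimal sketch, assuming the counters of one test run have been exported to a CSV file (the file name is hypothetical) with one column per counter:

```python
import pandas as pd

# Hypothetical counter log: one row per sampling interval, one column per counter
counters = pd.read_csv("load_test_counters.csv")

# Pairwise Pearson correlation between all counters
corr = counters.corr()

# Strongly correlated counter pairs (|r| > 0.8), excluding self-correlations;
# each pair appears twice, once per direction
strong = (corr.abs()
              .where(lambda c: c < 1.0)
              .stack()
              .loc[lambda s: s > 0.8]
              .sort_values(ascending=False))
print(strong.head(10))
```

Pairs such as CPU utilization and transactions/sec tend to appear near the top of such a list, which is what motivates the PCA-based reduction in the next step.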
2. CRAFTING PERFORMANCE SIGNATURES
 Principal Component Analysis (PCA)
 Explains most of the counter data with minimal information loss
 Removes the noise in the counter data
 Influential Counters
 Counter Elimination: Norman cut-off criterion
 Counter Ranking
19
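A sketch of how such a signature could be crafted with PCA, assuming scikit-learn is available and using an illustrative loading cut-off in place of the Norman criterion named on the slide:

```python
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def craft_signature(counters: pd.DataFrame, variance_to_keep=0.9, loading_cutoff=0.7):
    """Return the signature counters of one subsystem, ranked by importance.

    counters: one column per performance counter, one row per sampling interval.
    loading_cutoff: illustrative threshold standing in for the Norman cut-off.
    """
    # Standardize so that high-variance counters do not dominate the components
    scaled = StandardScaler().fit_transform(counters)

    # Keep enough components to explain the requested share of the variance
    pca = PCA(n_components=variance_to_keep).fit(scaled)
    loadings = pd.DataFrame(pca.components_.T, index=counters.columns)

    # Importance of a counter: its largest absolute loading on any retained component
    importance = loadings.abs().max(axis=1)

    # Counter elimination + ranking: keep counters above the cut-off, highest first
    return importance[importance >= loading_cutoff].sort_values(ascending=False)
```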
3. IDENTIFYING DEVIATIONS
[Table: per-subsystem signature counters (Commits/Sec, Writes/Sec, CPU Utilization, Database Cache % Hit) compared between the Base-Line and Load Test-1, with a Deviation/Match value per counter (e.g., 0.41, 0, 0.01)]
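One way to realize the comparison sketched in this table, assuming a signature is the counter-importance vector produced by the previous sketch (the paper's exact deviation/match scoring is not reproduced here):

```python
import pandas as pd

def signature_deviation(baseline_sig: pd.Series, test_sig: pd.Series, tolerance: float = 0.1) -> pd.DataFrame:
    """Compare a subsystem's baseline signature against the one from a new load test.

    Both arguments map counter name -> importance (e.g., the output of the
    hypothetical craft_signature above). Returns, per counter, the change in
    importance and whether it still counts as a match under the tolerance.
    """
    counters = baseline_sig.index.union(test_sig.index)
    base = baseline_sig.reindex(counters, fill_value=0.0)
    test = test_sig.reindex(counters, fill_value=0.0)
    deviation = (base - test).abs()
    return pd.DataFrame({
        "baseline": base,
        "load_test": test,
        "deviation": deviation,
        "match": deviation <= tolerance,
    })
```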
PINPOINTING
[Figure: four subsystems SUB-A, SUB-B, SUB-C and SUB-D connected by their average pair-wise counter correlations (0.8, 0.7, 0.9, 0.7)]
4. PINPOINTING
[Same subsystem diagram as above]
Avg. pair-wise correlation during the load test:
SUB   Load
A     0.75
B     0.77
C     0.80
D     0.77
22
4. PINPOINTING
[Same subsystem diagram as above]
Avg. pair-wise correlation, load test vs. baseline:
SUB   Load   Baseline   Dev    %
A     0.75   0.87       0.12   13.0
B     0.77   0.82       0.05   6.01
C     0.80   0.94       0.14   14.8
D     0.77   0.88       0.11   12.5
23
4. PINPOINTING
[Same subsystem diagram as above]
Avg. pair-wise correlation, ranked by deviation (SUB-C is pinpointed):
SUB   Load   Baseline   Dev    %
C     0.80   0.94       0.14   14.8
A     0.75   0.87       0.12   13.0
D     0.77   0.88       0.11   12.5
B     0.77   0.82       0.05   6.01
24
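The three slides above boil down to: compute the average pair-wise correlation among each subsystem's signature counters in the load test, compare it to the baseline value, and rank subsystems by the relative drop. A minimal sketch, assuming the counter logs are already grouped per subsystem; the percentage is taken relative to the baseline correlation:

```python
import pandas as pd

def avg_pairwise_correlation(counters: pd.DataFrame) -> float:
    """Average absolute pair-wise correlation among a subsystem's signature counters."""
    corr = counters.corr().abs()
    n = len(corr)
    # Mean of the off-diagonal entries only (the diagonal contributes n ones)
    return (corr.values.sum() - n) / (n * (n - 1))

def pinpoint(baseline_logs: dict, test_logs: dict) -> pd.DataFrame:
    """Rank subsystems by how much their counter correlation dropped in the load test.

    baseline_logs / test_logs: subsystem name -> DataFrame of its signature counters.
    """
    rows = []
    for sub in baseline_logs:
        base = avg_pairwise_correlation(baseline_logs[sub])
        load = avg_pairwise_correlation(test_logs[sub])
        dev = base - load
        rows.append({"subsystem": sub, "load": load, "baseline": base,
                     "dev": dev, "percent": 100 * dev / base})
    # The subsystem with the largest relative deviation is the pinpointed one
    return pd.DataFrame(rows).sort_values("percent", ascending=False)
```

In the Dell DVD Store experiments that follow, the subsystem put under load or stress is the one with the largest relative drop.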
25
DELL DVD STORE
26
Components of Test Environment:
Load Generators
Database Server
Web Server (A)
Web Server (B)
Web Server (C)
Performance Monitoring Tool
Performance Logs
EXPERIMENT-1: 4X LOAD ON WEB-1
        Base   4-X    Dev    %
Web-1   0.87   0.72   0.15   17.1
Web-2   0.88   0.82   0.05   6.46
Web-3   0.89   0.83   0.06   7.03
DB      0.78   0.73   0.05   6.92

EXPERIMENT-2: CPU STRESS ON WEB-1
        Base   CPU    Dev    %
Web-1   0.87   0.69   0.18   21.1
Web-2   0.88   0.80   0.08   9.29
Web-3   0.89   0.80   0.09   10.6
DB      0.78   0.73   0.05   7.24

EXPERIMENT-3: CPU STRESS ON DB
        Base   CPU    Dev    %
Web-1   0.87   0.83   0.03   4.28
Web-2   0.88   0.83   0.04   5.04
Web-3   0.89   0.84   0.05   6.14
DB      0.78   0.78   0.08   10.4

EXPERIMENT-4: MEMORY STRESS ON WEB-2
        Base   MEM    Dev    %
Web-1   0.87   0.81   0.06   7.42
Web-2   0.88   0.75   0.13   14.9
Web-3   0.89   0.81   0.08   9.49
DB      0.78   0.7    0.087  11.0
27
ENTERPRISE APPLICATION
[Figure: counter importance (0.4 to 1.0) across roughly 162 performance counters for Test-A (base-line) and Tests B, C, D and E]
28
WHY ARE TEST-D & E
DIFFERENT?
29
WHY ARE TEST-D & E DIFFERENT?
 PINPOINTED – Among the 6 subsystems of each test, the web server is the likely cause of the performance deviation in both tests
 Signature counters of the web server notably deviated from the baseline:
Packets Outbound Discarded
Packets Sent/Sec
Message Queue Length
Network problem – no connectivity to the database
Web server under stress
30
LIMITATIONS
 Our methodology can only pinpoint *a single* subsystem as the likely cause of a performance deviation in a load test.
 Our methodology cannot be generalized to other domains, such as network traffic and security monitoring.
31
32
Editor's Notes
  1. At the SAIL lab, we thrive on investigating approaches and creating techniques to support practitioners who are producing, maintaining and evolving large-scale, complex software systems. We aim to develop tools that make the lives of these practitioners more productive, more cheerful and more predictable.
  2. Today's LSS, such as Google, eBay, Facebook and Amazon, are composed of many underlying components and subsystems. These LSS are growing tremendously in size to handle growing traffic, complex services and business-critical functionality. It is very important to periodically measure the performance of such LSS to satisfy the stakeholders' and customers' high demands on system quality, availability and responsiveness.
  3. Load testing is an important weapon in LSS development to uncover functional and performance problems of a system under load. The performance of the LSS is calibrated using a load test before a problem becomes a field or post-deployment problem. Performance problems include an application not responding fast enough, crashing or hanging under heavy load, or not meeting the desired service level agreements (SLAs).
  4. Environment Setup: The first and most important phase of load testing, since the most common load test failures occur due to an improperly set up test environment. The environment setup includes installing the applications and load testing tools on different machines and possibly on different operating systems. Load generators, which emulate the users' interaction with the system, need to be carefully configured to match the real workload in the field. Load Test Execution: This involves starting the components of the system under test, i.e., the required services, hardware resources and tools (load generators and performance monitors). Performance counters are recorded in this step too. Load Test Analysis: This step involves comparing the results of a load test against other load test results or against predefined thresholds as baselines. Unlike functional and unit testing, which result in a pass or failure classification for each test, load testing requires additional quantitative metrics like response time, throughput and hardware resource utilization to summarize results. The performance analyst selects a few of the important performance counters among the thousands collected. Based on experience and domain knowledge, the analyst manually compares the selected counters with those of past runs to look for evidence of performance deviations, for example using plots and correlation tests. Report Generation: Includes filing the performance deviations, if found, based on the personal judgment of an analyst. Mostly the results are verified by an experienced analyst and, depending on the extent of the performance deviation, routed to the team responsible for the affected subsystem (database, application, web server, etc.).
  8. Unfortunately, the current practice of analyzing load tests is costly, time-consuming and error-prone. This is because load test analysis practices have not kept pace with the rapid growth in size and complexity of large enterprise systems. In practice, the dominant tools and techniques to analyze large distributed systems have remained unchanged for over twenty years. Most research has focused on the automatic generation of load testing suites rather than on load test analysis. There are many challenges and limitations associated with the current practice of load test analysis that remain unsolved.
  9. Load tests last from a couple of hours to several days. They generate performance logs that can be terabytes in size. Even logging all counters on a typical machine at 1 Hz generates about 8.6 million values in a single week; a cluster of 12 machines over a week yields 13 TB of performance counter data, assuming a 64-bit representation for each counter value. Analyzing such a large counter log is still a big challenge in load tests.
  10. Performance analysts in LSS have only limited time to reach a diagnosis on performance counter logs and to make the necessary configuration changes. Load testing is usually the last step in an already tight and usually delayed release schedule. Hence, managers are always eager to reduce the time allocated for performance testing.
  11. Error-prone because of the manual process involved in analyzing performance counter data in current practice. It is impossible for an analyst to skim through such a large volume of log data; instead, analysts use a few key performance counters known to them from past practice, performance experts and domain trends as "rules of thumb". With large-scale systems that are continuously evolved by adding new functionality, applying the same rules of thumb can cause performance issues to be missed or misdiagnosed.
  12. Due to these challenges, we believe the current practice of load test analysis is neither effective nor sufficient to uncover performance deviations accurately and within the limited time available.
  13. The performance logs obtained from a load test do not suffice for direct analysis by our methodology. These logs need to be prepared to make them suitable for the statistical techniques our methodology employs. This step takes care of data sanitization (missing and incomplete counter variables) and pre-treatment of the data, such as standardization and scaling, to remove the bias of variance-dependent techniques.
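A minimal sketch of such data preparation, assuming the counter log is available as a pandas DataFrame; the specific sanitization and scaling choices here are illustrative rather than the paper's exact procedure:

```python
import pandas as pd

def prepare_counters(raw: pd.DataFrame) -> pd.DataFrame:
    """Sanitize and pre-treat a performance counter log for statistical analysis.

    raw: one row per sampling interval, one column per performance counter.
    """
    prepared = raw.copy()

    # Sanitization: drop counters that were never collected, fill short gaps
    prepared = prepared.dropna(axis=1, how="all")
    prepared = prepared.interpolate(limit=3).ffill().bfill()

    # Drop constant counters: they carry no variance and break standardization
    prepared = prepared.loc[:, prepared.std() > 0]

    # Pre-treatment: z-score standardization so variance-dependent techniques
    # (such as PCA) are not biased towards counters with large absolute values
    return (prepared - prepared.mean()) / prepared.std()
```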