ICSM 2008
1. Automated Identification of Load Testing Problems
Zhen Ming (Jack) Jiang, Ahmed E. Hassan
Software Analysis and Intelligence Lab (SAIL), Queen’s University, Canada
Gilbert Hamann, Parminder Flora
Enterprise Performance Engineering, Research In Motion (RIM), Canada
2. What Is a Load Test?
■ A load test
– Mimics multiple users performing tasks at the same time (field simulation)
– Lasts for several hours or a few days
– For example, load test an online bookstore to see if the site can handle 1,000 users
3. Load Testing Challenges
■ No documented behavior
■ Time pressure
– Lasts for several hours or longer
– Final step in the development cycle
■ Monitoring overhead
– Profiling or extra instrumentation is not recommended
■ Large volume of data
4. Current Practice
■ Crash check
– Restart, crash, hung?
■ Performance check
– Memory, disk, CPU, network usage
– Is there a memory leak?
■ Basic error check
– Grep for keywords like “failure” or “error”, etc.
Not sufficient!
5. Problems with Current Practice
■ Labour intensive and time consuming
– Large volumes of generated data
■ Not all “error” or “fail” is important
– “Failure to locate item in the cache”
■ Not all errors contain the term “error” or “fail”
– “Message buffer limit is reached”
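Both failure modes of keyword grepping can be seen in a minimal sketch. The log lines below are the two examples from the slides plus one hypothetical benign line; the filter is the naive current practice.

```python
import re

# Two log lines from the slides plus one hypothetical benign line.
logs = [
    "Failure to locate item in the cache",   # benign, but contains "fail"
    "Message buffer limit is reached",       # real problem, but no keyword
    "Request handled in 12 ms",
]

# Current practice: grep for keywords like "error" or "fail".
keyword = re.compile(r"error|fail", re.IGNORECASE)
flagged = [line for line in logs if keyword.search(line)]

print(flagged)
# Only the benign cache message is flagged; the real
# buffer problem is missed entirely.
```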
6. Our Approach
■ Intuition: Load testing involves repeated execution of the same operations a large number of times
■ Most large enterprise applications have logging enabled for:
– Remote issue resolution
– Compliance with legal acts like the Sarbanes-Oxley Act
■ Our approach: Automatically discover runtime anomalies by mining the execution logs
7. Anomaly Detection for a Load Test
■ (E2, E3) always follow each other:
– (acquire_lock, release_lock)
– (open_inbox, close_inbox)
■ If we see (E2, E6) this might be a problem
E1 E2 E3 E4
E1 E2 E3 E4
E1 E2 E3 E4
E1 E2 E6 E4   (deviating run: E6 follows E2 instead of E3)
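The intuition above can be mined mechanically. A minimal Python sketch (event names and sequences taken from the slide; the real system works on abstracted log lines) counts, for each event, which event directly follows it, and flags successors that deviate from the dominant one:

```python
from collections import Counter, defaultdict

# The four event sequences from the slide: three normal runs
# and one deviating run.
sequences = [
    ["E1", "E2", "E3", "E4"],
    ["E1", "E2", "E3", "E4"],
    ["E1", "E2", "E3", "E4"],
    ["E1", "E2", "E6", "E4"],  # E6 follows E2 instead of E3
]

# Count, for each event, which event directly follows it.
followers = defaultdict(Counter)
for seq in sequences:
    for a, b in zip(seq, seq[1:]):
        followers[a][b] += 1

# Pairs that deviate from the dominant successor are anomaly candidates.
for event, counts in followers.items():
    dominant, dom_n = counts.most_common(1)[0]
    for succ, n in counts.items():
        if succ != dominant:
            print(f"possible anomaly: ({event}, {succ}) seen {n}x "
                  f"vs dominant ({event}, {dominant}) seen {dom_n}x")
```

Running this flags only (E2, E6), matching the slide's example.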
12. Step 3. Dominant Behavior Identification
■ Execute-After relation, (E1, *)
– (E1, E2) holds when E1 and E2 belong to the same group, and
– E2 is the next event that directly follows E1
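Assuming a hypothetical raw-log format of `<group-id> <event>` (the actual log-abstraction and grouping steps are more involved than this), the execute-after pairs can be extracted like so:

```python
from collections import defaultdict

# Hypothetical raw log lines: "<group-id> <event>" (format assumed
# for illustration only).
log_lines = [
    "sess-1 E1", "sess-2 E1", "sess-1 E2",
    "sess-2 E2", "sess-1 E3", "sess-2 E6",
]

# Collect each group's events in log order.
groups = defaultdict(list)
for line in log_lines:
    gid, event = line.split()
    groups[gid].append(event)

# Execute-After relation: (A, B) where A and B belong to the same
# group and B directly follows A.
pairs = [(a, b)
         for events in groups.values()
         for a, b in zip(events, events[1:])]

print(sorted(pairs))
```

Note that pairs are only formed within a group, never across groups, so interleaved sessions in the raw log do not produce spurious pairs.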
13. Step 4. Anomaly Detection
■ The z-statistic highlights the difference between the dominant behavior and the deviated behavior
■ The higher the z-statistic, the greater the contrast, and the more likely the deviation is statistically significant
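As an illustration of the idea only (this is not the paper's exact formulation), a one-proportion z-statistic can quantify how strongly a dominant pair stands out; the baseline proportion `p0 = 0.5` here is an assumption chosen for the sketch:

```python
import math

def z_stat(dominant, total, p0=0.5):
    """One-proportion z-statistic: how far the observed proportion of the
    dominant pair lies from an assumed baseline p0. The choice of p0 is
    an illustrative assumption, not the paper's exact formulation."""
    p_hat = dominant / total
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / total)

# E.g. the dominant pair (E2, E3) occurred 3 times out of 4 (E2, *) pairs:
z = z_stat(dominant=3, total=4)
print(z)  # a larger z means a stronger contrast with the deviated behavior
```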
15. Load Testing Problems
■ Bugs in the application under test
■ Problems with the load environment
– Mis-configuration
– Hardware failures
– Software Interactions
■ Problems with the load generation
– Incorrect use of load generation tools
– Buggy load generators
18. App 1 Anomaly Example
■ Application Problems:
– 54 out of 33,000 (<0.2%) event pairs indicate an application problem with items being dropped from the queue; the error message did not contain the word “error”
■ Environment Problems:
– 4 out of 33,000 (≈0.01%) event pairs indicate an environment problem
19. Discussions and Limitations
■ Is the dominant behavior really the correct behavior?
– E.g., during a hardware failure
■ Logs for the whole load test are processed all at once
■ False positives
– Due to the nature of the load
– Thread switches
20. Conclusions
Challenges of Load Testing
■ No documented behavior
■ Time pressure
– Lasts for several hours or longer
– Last step in the development cycle
■ Monitoring overhead
– Profiling or extra instrumentation is not recommended
■ Large volumes of data
Problems with Current Practice
■ Labour intensive and time consuming
– Large volumes of generated data
■ Not all “error” or “fail” is important
– “Failure to locate item in the cache”
■ Not all errors contain the term “error” or “fail”
– “Message buffer limit is reached”
Case Studies