Scam2011 syer
1. Identifying Performance Deviations in Thread Pools
Mark D. Syer, Bram Adams and Ahmed E. Hassan
mdsyer@cs.queensu.ca
Software Analysis and Intelligence Lab
Queen’s University, Canada
4. Potential solutions must account for…
• Huge amounts of data
• Limited system knowledge
Potential solutions should…
• Limit manual review/analysis
5. How to detect performance deviations?
Hundreds of machines
Thousands of threads
Needle in a haystack!
Complex architectures
Simulate user actions
Monitor thread behaviour by collecting resource usage metrics
6. Simulate user actions
Monitor thread behaviour by collecting resource usage metrics
Group threads depending on their machine ID
0. We increasingly rely on ultra-large-scale software systems for e-commerce, personal and professional networking, and telecommunications.
1. Complex system
2. How to detect performance deviations? Needle in a haystack
Collect metrics for processes
1. Load test the system by simulating user actions
2. Collect resource usage metrics (e.g., CPU usage and memory allocation) for each thread
Collect metrics for processes
0. Abstract the metrics: group all the processes by the machine they are running on
1. Plot resource usage metrics
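The grouping step in these notes can be sketched in a few lines. This is a minimal illustration, not the authors' tooling: the sample tuples, the field layout, and the helper names `group_by_machine` and `machine_profile` are all assumptions made for the example, and the mean is just one possible way to abstract a group.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical samples: (machine_id, thread_id, cpu_percent, memory_mb)
samples = [
    ("m1", "t1", 10.0, 100.0),
    ("m1", "t2", 12.0, 110.0),
    ("m2", "t3", 11.0, 105.0),
    ("m2", "t4", 50.0, 400.0),
]

def group_by_machine(samples):
    """Map each machine ID to the (cpu, memory) samples of its threads."""
    groups = defaultdict(list)
    for machine_id, _thread_id, cpu, mem in samples:
        groups[machine_id].append((cpu, mem))
    return dict(groups)

def machine_profile(group):
    """Abstract a group into a single (mean cpu, mean memory) point."""
    cpus, mems = zip(*group)
    return (mean(cpus), mean(mems))

profiles = {m: machine_profile(g) for m, g in group_by_machine(samples).items()}
```

Each machine is now one point that can be plotted or compared against the others.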
Identify and group similar behaviour
1. Group 1 and 2
2. Group 1,2 and 3
3. Group 1,2,3 and 4
4. Rank
But this still leaves us comparing machines; we want to find the processes that are deviating. We can use our ranking to look at just the processes running on machine 4.
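One way to picture the "group 1 and 2, then 3, then 4, then rank" sequence is a greedy agglomeration over per-machine profiles. The slides do not name the grouping technique, so the sketch below is an assumption: it uses Euclidean distance with single linkage, and the profile values and the function name `grow_groups` are hypothetical.

```python
import math

# Hypothetical per-machine profiles: (mean cpu_percent, mean memory_mb)
profiles = {
    "machine1": (10.0, 100.0),
    "machine2": (11.0, 101.0),
    "machine3": (13.0, 104.0),
    "machine4": (40.0, 300.0),
}

def grow_groups(profiles):
    """Seed a group with the closest pair, then repeatedly add the machine
    nearest to any current member. Needs at least two machines.
    Returns (machine, distance at which it joined) in join order."""
    def dist(a, b):
        return math.dist(profiles[a], profiles[b])

    names = sorted(profiles)
    pairs = [(x, y) for i, x in enumerate(names) for y in names[i + 1:]]
    first, second = min(pairs, key=lambda p: dist(*p))
    group = [first, second]
    joins = [(first, 0.0), (second, dist(first, second))]
    remaining = [n for n in names if n not in group]
    while remaining:
        gap = {n: min(dist(n, g) for g in group) for n in remaining}
        nxt = min(gap, key=gap.get)
        joins.append((nxt, gap[nxt]))
        group.append(nxt)
        remaining.remove(nxt)
    return joins

joins = grow_groups(profiles)
# Rank by join distance, largest first: the machine that joins from
# farthest away is the strongest candidate for a deviation.
ranking = [name for name, _ in sorted(joins, key=lambda j: j[1], reverse=True)]
```

With these made-up values machine 4 joins last and ranks first. In practice the metrics would likely need normalising first, since memory in MB would otherwise dominate CPU percentages.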
Methodology
1. In general, you must decide how to group processes. In our previous example we grouped by machine, but the choice of grouping is left to the analyst.
Top Down
1. In the previous example we analyzed at the machine level, but still don’t know which threads are deviating
2. Use the machine level ranking to inform our selection of threads, and analyze the threads of the deviating machine
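The top-down idea (rank machines first, then look only at the threads of the top-ranked machine) could be sketched as follows. The data, the use of a component-wise median as the reference point, and the helper `rank_by_deviation` are assumptions for illustration; the slides do not specify how the ranking is computed.

```python
import math
from statistics import mean, median

def rank_by_deviation(points):
    """Rank keys by distance from the component-wise median, most deviating first."""
    center = tuple(median(axis) for axis in zip(*points.values()))
    return sorted(points, key=lambda k: math.dist(points[k], center), reverse=True)

# Hypothetical data: machine -> thread -> (cpu_percent, memory_mb)
threads = {
    "machine1": {"t1": (10.0, 100.0), "t2": (11.0, 102.0), "t3": (10.0, 101.0)},
    "machine2": {"t4": (10.0, 101.0), "t5": (12.0, 100.0), "t6": (11.0, 99.0)},
    "machine3": {"t7": (11.0, 100.0), "t8": (10.0, 103.0), "t9": (60.0, 500.0)},
}

# Level 1: summarise each machine by the mean of its threads, then rank machines.
machine_points = {
    m: tuple(mean(axis) for axis in zip(*ts.values())) for m, ts in threads.items()
}
worst_machine = rank_by_deviation(machine_points)[0]

# Level 2: rank only the threads running on the top-ranked machine.
worst_thread = rank_by_deviation(threads[worst_machine])[0]
```

The second level only touches the threads of one machine, which is what keeps manual review small.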
We have about 100 hours of load test data
Instead of grouping by space (i.e., machines), we group by time
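Grouping by time rather than by machine only changes the key used to bucket the samples. A minimal sketch, assuming samples carry a timestamp in hours since the start of the load test; the 10-hour window, the sample values, and the name `group_by_time` are hypothetical.

```python
from collections import defaultdict

def group_by_time(samples, window_hours):
    """Bucket (timestamp_hours, thread_id, cpu_percent) samples into fixed windows."""
    windows = defaultdict(list)
    for t, thread_id, cpu in samples:
        windows[int(t // window_hours)].append((thread_id, cpu))
    return dict(windows)

# Hypothetical samples: (hours since test start, thread_id, cpu_percent)
samples = [
    (0.5, "t1", 10.0),
    (3.0, "t1", 11.0),
    (10.2, "t1", 12.0),
    (19.9, "t2", 55.0),
]
windows = group_by_time(samples, window_hours=10)
```

Each window then plays the role that a machine played in the earlier example.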
1. Group by machines
2. Group by time
2a. The top level becomes time (in our previous example we compared the behaviour of machines; now we compare the behaviour of threads from different times)
3.
4. RQ1
5. RQ2
Quantitative Evaluation – Defect Injection
The types of deviations that were injected were based on previous research and consultations with system experts
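A defect-injection check can be illustrated on synthetic data. The slides do not list the injected deviation types, so the steadily growing memory footprint below is purely a stand-in, and the scoring function, thread names, and values are all hypothetical.

```python
from statistics import median

def inject_leak(series, mb_per_step):
    """Return a copy of a memory series with a linearly growing offset added."""
    return [v + i * mb_per_step for i, v in enumerate(series)]

def deviation_scores(series_by_thread):
    """Score each thread by its mean absolute gap from the per-step median."""
    baseline = [median(step) for step in zip(*series_by_thread.values())]
    return {
        name: sum(abs(v - b) for v, b in zip(series, baseline)) / len(series)
        for name, series in series_by_thread.items()
    }

# Hypothetical healthy memory series (MB) for four threads.
healthy = {f"t{i}": [100.0 + i, 101.0 + i, 100.0 + i, 102.0 + i] for i in range(4)}

# Inject the stand-in deviation into one known thread.
injected = dict(healthy, t2=inject_leak(healthy["t2"], mb_per_step=20.0))

scores = deviation_scores(injected)
flagged = max(scores, key=scores.get)
```

Because the injected thread is known in advance, the evaluation reduces to checking whether it is the one that gets flagged.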