Analyzing Performance Test Data
Carles Roch-Cunill, Test Lead for System Performance at McKesson Medical Imaging Group, shared his expertise in Analyzing Performance Test Data.

Analyzing Performance Test Data Presentation Notes
  • Remember that performance requirements usually fall into the category of non-functional requirements. More often than not, they are tested at the integration level, where multiple steps are required. This talk covers only the performance parameters of your system; we will not focus on complementary values such as OS performance indicators.
  • Here we are working under the assumption that we have some performance problems, or at least the possibility of performance degradation. However, even if the test case does not fail, you may be curious to know where the improvement (if any) in performance occurs. The “But there may be others…” is by itself another hypothesis!
  • We can introduce here the concept of percentile. In a normal distribution, “Event A should not take more than X seconds 50% of the time” is equivalent to “The average of Event A should be X”, because the mean and the median (50th percentile) coincide.
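    A minimal sketch of that equivalence (the distribution parameters are illustrative, not from the talk):

        import numpy as np

        # For a normal distribution the mean equals the median (50th
        # percentile), so the two requirement phrasings coincide
        rng = np.random.default_rng(0)
        samples = rng.normal(loc=3.0, scale=0.5, size=100_000)
        print(np.mean(samples), np.percentile(samples, 50))  # both ~3.0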
  • There are sophisticated mathematical formulas to evaluate the statistical significance of a batch of data, but in everyday life there is rarely enough time to gather the appropriate amount of data, analyze it, and apply these equations.
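    If you do have the time, libraries make those formulas cheap to apply. A minimal sketch using a two-sample t-test (the sample values are hypothetical, not from the talk):

        from scipy import stats

        # Hypothetical response times (seconds) from two test runs
        run_a = [2.10, 1.90, 2.23, 2.19, 2.21, 2.23, 1.99]
        run_b = [2.42, 2.08, 2.27, 2.21, 2.45, 2.39, 2.01]

        # A small p-value suggests the runs are not statistically equivalent
        t_stat, p_value = stats.ttest_ind(run_a, run_b)
        print(f"t = {t_stat:.3f}, p = {p_value:.3f}")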
  • Based on the same criteria, we can argue that the results for test 2 are also not statistically equivalent.
  • Now that we have some statistical concepts, we can analyze the data.
  • This is a non-uniform distribution.
  • Usually, when you are measuring data, you have some model you are explicitly or implicitly using. The most common one, both explicit and implicit, is the normal distribution.
  • Models are useful to idealize reality and to make sense of your data. But your model can be wrong or incomplete. For example, you may implicitly assume you are working with a real-time system, but software applications running on Windows or Unix are not real-time systems.
  • “The analysis of the performance data…”: you can observe trends. For example, in the “Time to query” graph, if we increase the number of queries that join a large number of tables, then our performance will degrade. With this in mind we can optimize the database or change the queries. “The analyzed results…”: this is because computer systems are not linear. For example, we can predict the behaviour of a system versus network utilization, but there will be a point at which the network becomes unusable.

Analyzing Performance Test Data Presentation Transcript

  • 2. Analyzing Performance test data (or how to convert your numbers to information) Carles Roch-Cunill Test Lead for System Performance McKesson Medical Imaging Group [email_address]
  • 3. Agenda
    • Performance testing as an experimental activity
    • Very fast review of Scientific Method
    • Errors, forget them at your own risk
    • About the meaning of data
    • Some statistical concepts
    • Analyzing data
    • Adjusting your data to a model
    • Summary
  • 4. Performance testing as an experimental activity
    • There are two approaches to testing:
      • a) Without added value
        • This feature does not work
        • This requirement is not met
      • b) With added value
        • This feature does not work, and this module/component/software artifact is the culprit
        • This requirement is not met, and it fails for this reason.
    • Usually, things are not so clear, and testers' statements fall somewhere in the middle.
    • Because performance testing gathers data that can be analyzed, the performance tester is well positioned to provide added-value information to the team.
  • 5. Performance testing as an experimental activity
    • If you want to provide added value and explain why the requirement is not met, you will:
    • Formulate a hypothesis: “My performance degrades due to component X”
    • Test the hypothesis by developing an appropriate test environment
    • Gather results
    • Analyze the results to see if they confirm or reject your hypothesis
    • If you are lucky and your guess (the hypothesis) was good, you will have explained at least a part of the performance behaviour.
    • However, there will usually be other factors that also influence your performance, so you have only caught one low-hanging fruit.
  • 6. Performance testing as an experimental activity
    • You can create different tests that put more emphasis on one of the components of the system.
    • For example, you may want to specifically measure the performance of the data repository tier, or the network, or only the UI.
    • Depending on where your focus is, your methodology and your tools will change.
    • In all cases, you need to fix all the parameters but one (see the sketch after this list). For example, if you want to study the influence of the network on your system, you need to do the following:
      • Determine the parameters that characterize the network (latency, bandwidth, utilization…)
      • Identify whether they are independent or not (utilization and latency may not be independent)
      • Modify one parameter at a time while keeping the others constant
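    A minimal sketch of such a one-factor-at-a-time sweep; the parameter names and the run_performance_test hook are hypothetical stand-ins for your test harness:

        # Baseline values for the (hypothetical) network parameters
        BASELINE = {"latency_ms": 20, "bandwidth_mbps": 100, "utilization_pct": 30}

        def run_performance_test(params):
            """Hypothetical hook: configure the test environment with
            these parameters and return a measured response time."""
            ...

        # Values to try for each parameter, one at a time
        sweeps = {
            "latency_ms": [10, 20, 50, 100],
            "bandwidth_mbps": [10, 50, 100],
            "utilization_pct": [10, 30, 60, 90],
        }

        results = {}
        for name, values in sweeps.items():
            for value in values:
                # Vary a single parameter, keep the others at baseline
                params = {**BASELINE, name: value}
                results[(name, value)] = run_performance_test(params)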
  • 7. Very fast review of Scientific Method
    • An effect has been observed. Example: performance degradation on your application
    • You try to reproduce it and learn the conditions to reproduce it at will
    • You may gather some data through testing
    • To explain the data you formulate a model (hypothesis)
    • You refine your testing and tailor it around your model
    • You analyze the new data and check if your model fits the data
    • If the model fits it, you are on a good footing
    • If the model partially fits it, you either refine your model or discard it.
    • If the model does not fit it, you formulate another model
    • In any case, new data obtained from other tests may force you to modify, rethink, or even discard your model.
    • Once your data fits the model, you draw conclusions based on the framework provided by the model.
  • 8. Very fast review of Scientific Method
    • Unstated principles:
    • Simpler is better
    • With the same procedure and system, you should get the same results.
    • A model should not introduce more questions than it answers
    • Usually, newer models include the older models as particular cases
    • Models are dynamic.
  • 9. Errors, forget them at your own risk
    • Errors happen… so take them into account
    • There are two main kinds of errors:
      • Human errors: stopping the watch at the wrong moment, confusing digits…
      • Instrument errors: your watch is not precise, or has a mechanical defect…
  • 10. Errors, forget them at your own risk. In the adjacent graph: if your error bar is ±1, we can say the trend is toward a larger value. However, if the error bar is ±3, then we cannot say anything about the trend of this data.
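    A minimal sketch of that idea as a simple heuristic (the 2x rule and the measurement values are assumptions for illustration, not from the slides): a trend is only meaningful when the total change clears the measurement uncertainty.

        # Successive average measurements (hypothetical values)
        measurements = [10.0, 11.0, 12.5, 13.0]

        def trend_is_significant(values, error_bar):
            # Assumed heuristic: the overall change must exceed the
            # combined uncertainty of the first and last points
            return abs(values[-1] - values[0]) > 2 * error_bar

        print(trend_is_significant(measurements, 1.0))  # True with +/-1 bars
        print(trend_is_significant(measurements, 3.0))  # False with +/-3 bars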
  • 11. About the meaning of data
    • Performance testing generates a lot of data, but what does all the data mean? To explain this data you need to take into account:
      • Hardware
      • Network characteristics
      • Network topology
      • Physical support for the data tier (storage, database…)
      • The architecture of your application
      • How your application is coded
    • … .
  • 12. About the meaning of data
    • In addition, you need to analyze the results in the context of the requirement or the question you are trying to answer.
    • For example:
        • “Event A should not take more than x seconds”
    • In most circumstances involving computer systems, you will have a stochastic component in your distribution. Assuming a normal one, you will have something like the bell-shaped curve shown on the slide.
  • 13. About the meaning of data. But what exactly does the requirement mean? Strictly, it means the event must never exceed x seconds (the slide shows this as a formula; see the reconstruction after slide 14).
  • 14. About the meaning of data
    • However, the requirement is usually interpreted as a bound on the average.
    From a formal point of view, the requirement “Event A should not take more than x seconds” would have failed with the above distribution. However, the statement “The average of Event A should not take more than x seconds” would pass.
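    A reconstruction of the two readings (the original slides showed them as formula images; here t_A denotes the measured duration of Event A), in LaTeX:

        \Pr(t_A > x) = 0        % strict reading: Event A never exceeds x seconds
        \mathbb{E}[t_A] \le x   % usual interpretation: only the average is bounded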
  • 15. About the meaning of data
    • The requirement can also be expressed as a percentile.
    In this case the requirement would be stated as “Event A should not take more than X seconds 50% of the time”.
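    A minimal sketch of checking such a percentile requirement (the timings and threshold are hypothetical):

        import numpy as np

        # Hypothetical measured durations of Event A, in seconds
        times = [1.8, 2.1, 2.4, 2.0, 3.5, 2.2, 1.9, 2.6, 2.3, 4.1]
        x = 2.5  # requirement threshold

        # “...no more than X seconds 50% of the time”: the 50th
        # percentile (median) must not exceed X
        print(np.percentile(times, 50) <= x)  # True for this sample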
  • 16. Some statistical concepts
    • Once we have defined the question, we can provide the answer. The answer will be obtained through measurements (either manual or automated).
    • The more measurements you take, the better your statistics and your answers will be.
    • However, the measurements need to be statistically significant. What this means is that a measurement is good enough to be included in your statistics.
    • All the measurements that are included in your statistics need to be statistically equivalent.
  • 17. Some statistical concepts
    • How do you determine whether your data is statistically equivalent?
    • You can apply some complex mathematical analysis, or you can apply common sense.
    • Some rules of thumb (see the sketch after this list):
      • If, in a single set of measurements, 20% of your data is very different, you either have a problem in your test system or you are observing different phenomena.
      • If you have done several runs, and the 90th percentile of a new test is bigger (smaller) than the maximum (minimum) of the previous tests, then the new data is not statistically similar and has no statistical significance for your results.
      • If you are expecting a specific distribution and you are not getting it, the current set cannot be compared (it is not statistically equivalent) to the data you were expecting.
      • Outliers are not statistically equivalent to the rest of the set.
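    A minimal sketch of the 90th-percentile rule of thumb; the run values are hypothetical:

        import numpy as np

        def is_statistically_similar(new_run, previous_runs):
            # Reject a new run whose 90th percentile lies above the
            # maximum (or below the minimum) of all previous runs
            previous = np.concatenate([np.asarray(r) for r in previous_runs])
            p90 = np.percentile(new_run, 90)
            return previous.min() <= p90 <= previous.max()

        test1 = [2.1, 2.3, 2.2, 2.4, 2.0]
        test2 = [2.2, 2.1, 2.5, 2.3, 2.2]
        test3 = [4.0, 4.2, 3.9, 4.3, 4.1]  # shifted well above the others
        print(is_statistically_similar(test3, [test1, test2]))  # False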
  • 18. Some statistical concepts
      • Example of the 90th percentile for Test 3 being bigger than the maximum of the other sets of measurements. In this context, Test 3 is not statistically equivalent and will be rejected.
  • 19. Some statistical concepts
    • Outliers are usually defined as
      • A measurement outside the overall pattern of a distribution (Moore and McCabe 1999).
      • A more precise definition: a point that lies more than 1.5 times the interquartile range above the third quartile or below the first quartile (see the sketch below).
    • Usually, the presence of an outlier indicates either an error in the measurement or an incomplete model.
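    A minimal sketch of that definition (Tukey's fences); the sample timings are hypothetical:

        import numpy as np

        def find_outliers(data):
            # Flag points more than 1.5 * IQR above Q3 or below Q1
            data = np.asarray(data)
            q1, q3 = np.percentile(data, [25, 75])
            iqr = q3 - q1
            lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
            return data[(data < lower) | (data > upper)]

        print(find_outliers([2.1, 2.3, 2.2, 2.4, 2.0, 2.5, 9.0]))  # flags 9.0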
  • 20. Analyzing data
    • While testing a non-deterministic system you will always get a distribution of values, all of them valid in principle.
    • For example, if your average in a measure is 3 and you sample again and get 6, this ‘6’ is also correct and you cannot discard it (unless you determine this point is an outlier).
    • The good news is you can extract information from this succession of different numbers.
  • 21. Analyzing data
    • For example, we may have the following collection of raw data, in seconds, for a measure that we will generically describe as “query database” (see the sketch below):
      4.18; 2.1; 1.9; 2.23; 4.5; 4.2; 2.19; 2.21; 4.24; 2.23; 1.99; 2.01; 2.39; 4.19; 2.42; 2.08; 2.27; 3.98; 2.21; 2.45; 4.32 (average: 2.9)
      These results seem to be a mix of two series:
      2.1; 1.9; 2.23; 2.19; 2.21; 2.23; 1.99; 2.01; 2.39; 2.42; 2.08; 2.27; 2.21; 2.45 (average: 2.2)
      and
      4.18; 4.24; 4.19; 3.98; 4.32; 4.5; 4.2 (average: 4.2)
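    A minimal sketch that separates the two series; the 3-second threshold is an assumption from eyeballing the numbers, not from the slides:

        import numpy as np

        # Raw “query database” timings from the slide, in seconds
        raw = [4.18, 2.1, 1.9, 2.23, 4.5, 4.2, 2.19, 2.21, 4.24, 2.23, 1.99,
               2.01, 2.39, 4.19, 2.42, 2.08, 2.27, 3.98, 2.21, 2.45, 4.32]
        print(f"overall average: {np.mean(raw):.1f}")  # 2.9, but misleading

        # Split at 3 s, where the two clusters separate
        fast = [t for t in raw if t < 3.0]
        slow = [t for t in raw if t >= 3.0]
        print(f"fast: {np.mean(fast):.1f}, slow: {np.mean(slow):.1f}")  # 2.2 and 4.2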
  • 22. Analyzing data
    • What is the previous slide telling us?
    • Averaging all the results tells us nothing.
    • The results point to a hidden effect: the system executes the query in different ways.
    • One possible cause could be that one query joins more tables and thus takes more time to return the results.
    • So, if you want to answer the question “What is the time to execute this query?”, you would need to be more nuanced, or would need to know the frequency of these queries so you could compute a weighted average (see the sketch below).
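    A minimal sketch of that weighted average; the 80/20 frequency split is an assumption for illustration, not from the slides:

        import numpy as np

        series_averages = np.array([2.2, 4.2])  # per-series averages from the slide
        frequencies = np.array([0.8, 0.2])      # hypothetical query mix

        print(f"weighted average: {np.average(series_averages, weights=frequencies):.2f} s")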
  • 23. Adjusting your data to a model
    • The most common one is the usual Gaussian or normal distribution, where σ is the standard deviation and μ is the average (see the reconstructed formula below).
    The importance of this distribution lies in the Central Limit Theorem, which indicates that the distribution of a sum of random variables tends to a normal distribution when sampled a large number of times. Example: if we assume that the latency experienced by users in a wireless network depends only on the distance to the hub, μ can be interpreted as the average distance of the user to the hub, and σ indicates how spread out the users are around the hub.
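    The density itself is not preserved in the transcript; the normal distribution the slide refers to is, in LaTeX:

        f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right)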
  • 24. Adjusting your data to a model
    • Another example of analysis:
    • The Chi distribution
    In a first approximation it resembles the Gaussian distribution; however, it applies when a phenomenon depends on k independent parameters, each of which would individually produce a Gaussian distribution (see the density reconstructed below). Example: the observed latency in a city-wide ADSL network may depend on the network utilization and on the latency induced by the distance to the nearest hub. If we want to improve the performance of the system, then we need to tackle both problems.
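    The slide's formula is likewise not preserved; the standard chi density with k degrees of freedom (the distribution of \sqrt{X_1^2 + \cdots + X_k^2} for independent standard normal X_i) is, in LaTeX:

        f(x; k) = \frac{x^{k-1} e^{-x^2/2}}{2^{k/2-1}\,\Gamma(k/2)}, \quad x \ge 0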
  • 25. Adjusting your data to a model
    • This would be an example of two uniform distributions
  • 26. Adjusting your data to a model
    • If your model cannot explain the results well, you need to change or improve the model.
    • A useful model should have predictive capabilities, so you can design new tests to prove or disprove the model.
    • Negative results (model disproved) can be as useful as positive results.
    • The analysis of the performance data can help to prevent future bottlenecks and problems.
    • The analyzed results will have a range of validity. Do not try to draw too many consequences from them.
  • 27. Summary
    • Performance testers provide information beyond requirement compliance
    • Performance testing should be treated like an experimental activity
    • As an experimental activity, the scientific method is the most appropriate method of enquiry.
    • In tune with the scientific method, you need to make assumptions, design your experiment accordingly, and reduce the error bars
    • Data should be subject to a statistical analysis
    • After the analysis, you should try to explain your data with a model
    • If the model does not do a good job of explaining your data, you should change or refine the model
    • Your analysis should help to make the software better.
  • 28. Analyzing Performance test data
    • Questions?