Predictive Performance Testing: Integrating Statistical Tests into Agile Development Life-cycles

This presentation was delivered by Tom Kleingarn at HP Software Universe 2010 in Washington DC. It describes basic statistical tests that can be applied to any performance engineering practice to improve accuracy and confidence in your test results.


  • The t-statistic was introduced in 1908 by William Sealy Gosset, a chemist working for the Guinness brewery in Dublin, Ireland ("Student" was his pen name).[1][2][3] Gosset had been hired due to Claude Guinness's innovative policy of recruiting the best graduates from Oxford and Cambridge to apply biochemistry and statistics to Guinness's industrial processes.[2] Gosset devised the t-test as a way to cheaply monitor the quality of stout. He published the test in Biometrika in 1908, but was forced to use a pen name by his employer, who regarded the fact that they were using statistics as a trade secret. In fact, Gosset's identity was unknown to fellow statisticians.

Predictive Performance Testing: Integrating Statistical Tests into Agile Development Life-cycles – Presentation Transcript

  • Predictive Performance Testing Integrating Statistical Tests into Agile Development Lifecycles Tom Kleingarn Lead, Performance Engineering Digital River http://www.linkedin.com/in/tomkleingarn http://www.perftom.com ©2010 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.
  • Agenda > Introduction > Performance engineering > Agile > Outputs from LoadRunner > Basic statistics > Advanced statistics > Summary > Practical application
  • About Me > Tom Kleingarn > Lead, Performance Engineering - Digital River > 4 years in performance engineering > Tested over 100 systems/applications > 100’s of performance tests > Tools > LoadRunner > JMeter > Webmetrics, Keynote, Gomez > ‘R’ and Excel > Quality Center > QuickTest Professional
  • Digital River > Leading provider of global e-commerce solutions > Builds and manages online businesses for software and game publishers, consumer electronics manufacturers, distributors, online retailers and affiliates. > Comprehensive platform offers > Site development and hosting > Order management > Fraud management > Export control > Tax management > Physical and digital product fulfillment > Multi-lingual customer service > Advanced reporting and strategic marketing
  • Performance Engineering > The process of experimental design, test execution, and results analysis, utilized to validate system performance as part of the Software Development Lifecycle (SDLC). > Performance requirements – measurable targets of speed, reliability, and/or capacity used in performance validation. > Latency < 10ms, measured at the 99th percentile > 99.95% uptime > Throughput of 1,000 requests per second
  • Performance Testing Cycle 1. Requirements Analysis 2. Create test plan 3. Create automated scripts 4. Define workload model 5. Execute scenarios 6. Analyze results > Rinse and repeat if… > Defects identified > Change in requirements > Setup or environment issues > Performance requirement not met (Diagram: Digital River Test Automation)
  • Agile > A software development paradigm that emphasizes rapid process cycles, cross-functional teams, frequent examination of progress, and adaptability. (Diagram: Initial Plan → Scrum → Deploy)
  • Agile Performance Engineering > Clear and constant communication > Involvement in initial requirements and design phase > Identify key business processes before they are built > Coordinate with analysts and development to build key business processes first > Integrate load generation requirements into project schedule > Test immediately with v1.0 > Schedule tests to auto-start, run independently > Identify invalid test results before deep analysis
  • LoadRunner Results > Measures of central tendency > Average = ∑(all samples)/(sample size) > Median = 50th percentile > Mode – highest frequency, the value that occurred the most > Measures of variability > Min, max > Standard Deviation = √( ∑(xᵢ − x̄)² / (n − 1) ) > 90th percentile
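The summary measures above can be sketched with Python's standard library. The deck works in R and Excel; this is an equivalent stdlib sketch, and the latency values here are made-up sample data, not figures from the presentation.

```python
from statistics import mean, median, mode, stdev

# Hypothetical latency samples in seconds (not from the deck)
latencies = [2.1, 3.4, 2.8, 3.4, 4.0, 2.9, 3.1, 3.4, 2.5, 3.8]

avg = mean(latencies)    # sum of all samples / sample size
med = median(latencies)  # 50th percentile
mod = mode(latencies)    # the value that occurred the most
sd  = stdev(latencies)   # sample standard deviation, divisor n - 1
lo, hi = min(latencies), max(latencies)

# 90th percentile: the value below which roughly 90% of samples fall
p90 = sorted(latencies)[int(0.9 * len(latencies)) - 1]
```

For a real LoadRunner run the same calculations would be applied to the exported raw transaction times rather than a hand-typed list.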
  • LoadRunner Results (Chart: response-time distribution with the 10th, 50th, and 90th percentiles marked)
  • Basic Statistics – Sample vs. Population > Performance requirement: average latency < 3 seconds > What if you ran 50 rounds? 100 rounds?
  • Basic Statistics – Sample vs. Population > Sample – set of values, subset of population > Population – all potentially observable values > Measurements > Statistic – the estimated value from a collection of samples > Parameter – the “true” value you are attempting to estimate Not a representative sample!
  • Basic Statistics – Sample vs. Population > Sampling distribution – the probability distribution of a given statistic based on a random sample of size n > Dependent on the underlying population > How do you know the system under test met the performance requirement?
  • Basic Statistics – Normal Distribution > With larger samples, data tend to cluster around the mean
  • Basic Statistics – Normal Distribution Sir Francis Galton’s “Bean Machine”
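The clustering behavior the two slides above describe (and the bean machine demonstrates physically) can be simulated in a few lines. This is an illustrative sketch, not material from the deck: even when individual latencies follow a skewed distribution, the *means* of repeated samples cluster around the true mean, and more tightly as the sample size grows.

```python
import random
from statistics import mean, stdev

random.seed(42)  # fixed seed so the simulation is repeatable
true_mean = 3.0  # hypothetical true average latency in seconds

def sample_mean(n):
    # Exponentially distributed latencies: a skewed population
    return mean(random.expovariate(1 / true_mean) for _ in range(n))

# 1,000 simulated test rounds at two different sample sizes
means_small = [sample_mean(5) for _ in range(1000)]
means_large = [sample_mean(500) for _ in range(1000)]

# The spread of the sample means shrinks roughly like 1/sqrt(n)
print(stdev(means_small), stdev(means_large))
```

This is why the normal-based confidence intervals on the following slides are reasonable for load tests with hundreds of samples, even when raw latencies are not normally distributed.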
  • Confidence Intervals > The probability that an interval made up of two endpoints will contain the true mean parameter μ > 95% confidence interval: x̄ ± 1.96 · (s / √n) > … where 1.96 is the score from the normal distribution associated with 95% probability
  • Confidence Intervals > In repeated rounds of testing, a confidence interval will contain the true mean parameter with a certain probability (Chart: intervals from repeated test rounds plotted around the true average)
  • Confidence Intervals in Excel
    Statistic            95% Value   99% Value   Formula
    Average              3.40        3.40
    Standard Deviation   1.45        1.45
    Sample size          500         500
    Confidence Level     0.95        0.99
    Significance Level   0.05        0.01        =1-(Confidence Level)
    Margin of Error      0.127       0.167       =CONFIDENCE(Sig. Level, Std Dev, Sample Size)
    Lower Bound          3.273       3.233       =Average - Margin of Error
    Upper Bound          3.527       3.567       =Average + Margin of Error
    > 95% confidence - true average latency 3.273 to 3.527 seconds
    > 99% confidence - true average latency 3.233 to 3.567 seconds
    > Our range is wider at 99% compared to 95%, 0.334 sec vs. 0.254 sec
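The Excel calculation above can be reproduced outside Excel. This sketch uses Python's stdlib with the slide's summary figures (mean 3.40, standard deviation 1.45, n = 500); like Excel's CONFIDENCE function, it uses the normal (z) distribution.

```python
from math import sqrt
from statistics import NormalDist

avg, sd, n = 3.40, 1.45, 500  # summary statistics from the slide

def margin_of_error(confidence):
    # z-score for the two-sided interval, e.g. 1.96 for 95%
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sd / sqrt(n)

for conf in (0.95, 0.99):
    moe = margin_of_error(conf)
    print(f"{conf:.0%}: {avg - moe:.3f} to {avg + moe:.3f}")
# prints:
# 95%: 3.273 to 3.527
# 99%: 3.233 to 3.567
```

The endpoints match the Lower Bound and Upper Bound rows in the table, confirming the spreadsheet arithmetic.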
  • The T-test > Test whether your sample mean is greater than/less than a certain value > Performance requirement: Mean latency < 3 seconds > Null hypothesis: Mean latency >= 3 seconds > Alternative hypothesis: Mean latency < 3 seconds
  • T-test – Raw Data from LoadRunner n = 500
  • T-test in ‘R’ > ‘R’ for statistical analysis > http://www.r-project.org/ Load test data from a file: > datafile <- read.table("C:\\Data\\test.data", header = FALSE, col.names = c("latency")) Attach the dataframe: > attach(datafile) Create a “vector” from the dataframe: > latency <- datafile$latency
  • T.Test in ‘R’ > t.test(latency, alternative="less", mu=3) One Sample t-test data: latency t = -2.9968, df = 499, p-value = 0.001432 alternative hypothesis: true mean is less than 3 > If the true average latency were 3 seconds, there would be only a 0.14% probability of observing a sample mean this low. Since p < 0.05, we reject the null hypothesis > We conclude, with high confidence, that the true average latency is less than 3 seconds
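The deck runs the one-sample, one-sided t-test in R; the same test can be sketched by hand in Python. The latencies below are synthetic (not the deck's data), and because the stdlib has no t-distribution CDF, the p-value uses the normal CDF, which is a close approximation at n = 500 (df = 499).

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

random.seed(1)
# Synthetic latency sample: hypothetical system averaging ~2.7 s
latency = [random.gauss(2.7, 1.0) for _ in range(500)]

mu0 = 3.0  # requirement: mean latency < 3 seconds
n = len(latency)

# t statistic: (sample mean - hypothesized mean) / standard error
t = (mean(latency) - mu0) / (stdev(latency) / sqrt(n))

# One-sided p-value: chance of a sample mean this low if mu really is 3
p = NormalDist().cdf(t)

print(f"t = {t:.3f}, p = {p:.4f}")  # reject H0 when p < 0.05
```

As in the R output above, a small p-value lets us reject the null hypothesis that mean latency is 3 seconds or more.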
  • T-test – Number of Samples Required > power.t.test(sd=sd, sig.level=0.05, power=0.90, delta=mean(latency)*0.01, type="one.sample") One-sample t test power calculation n = 215.5319 delta = 0.03241267 sd = 0.1461401 sig.level = 0.05 power = 0.9 alternative = two.sided > We need at least 216 samples > Our sample size is 500, we have enough samples to proceed
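The power.t.test calculation above can be approximated without R using the standard normal-approximation sample-size formula, n = ((z₁₋α/₂ + z_power) · sd / δ)². With the slide's numbers this gives about 214, slightly below R's exact t-based 215.53 because the t distribution has fatter tails; R's iterative answer is the one to trust for small samples.

```python
from math import ceil
from statistics import NormalDist

sd, delta = 0.1461401, 0.03241267  # values from the slide
alpha, power = 0.05, 0.90

z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96, two-sided test
z_power = NormalDist().inv_cdf(power)          # ~1.28 for 90% power

# Minimum samples to detect a shift of `delta` with the given power
n = ((z_alpha + z_power) * sd / delta) ** 2

print(ceil(n))  # prints 214
```

Either way, the 500 samples collected comfortably exceed the requirement, so the test round has adequate statistical power.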
  • Test for Normality > Test that the data is “normal” > Clustered around a central value, no outliers > Roughly fits the normal distribution > shapiro.test(latency) Shapiro-Wilk normality test data: latency p-value = 0.8943 > Our sample distribution is approximately normal > A p-value < 0.05 would indicate the data deviate significantly from the normal distribution
  • Review > Sample vs. Population > Normal distribution > Confidence intervals > T-test > Sample size > Test for normality > Practical application > Performance requirements > Compare two code builds > Compare system infrastructure changes
  • Case Study > Engaged in a new web service project > Average latency < 25ms > Applied statistical analysis > System did not meet requirement > Identified problem transaction > Development fix applied > Additional test, requirement met > Prevented a failure in production
  • Implementation in Agile Projects > Involvement in early design stages > Identify performance requirements > Build key business processes first > Calculate required sample size > Apply statistical analysis > Run fewer tests with greater confidence in your results > Prevent performance defects from entering production > Prevent SLA violations in production