
Software Defect Repair Times: A Multiplicative Model


Software Defect Repair Times: A Multiplicative Model - PROMISE 2008


  1. 1. Software Defect Repair Times: A Multiplicative Model. Robert Mullen, Cisco Systems, Boxborough MA, bomullen @; Swapna S. Gokhale, Univ. of Connecticut, Storrs CT, [email_address]
  2. 2. Outline <ul><li>The need for timely, correct fixes, and tracking. </li></ul><ul><li>Two approaches, MTTR and Age; Tradeoffs. </li></ul><ul><li>Log-transform of data, form of the distribution </li></ul><ul><li>Multiplicative factors → lognormal </li></ul><ul><li>Transformation from rates to age </li></ul><ul><li>Comparison of models </li></ul><ul><li>Implications for management </li></ul>
  3. 3. Problem definition <ul><li>Our problem was to characterize and improve software defect repair times in order to improve both the reliability of released networking products and the time-to-market of products under development. </li></ul><ul><li>Repair time runs from the date the defect record is created until the defect is repaired in at least one version. </li></ul><ul><li>Neither the interval before the defect is recorded nor the interval until the fix is distributed is included. </li></ul>
  4. 4. One approach: Mean Time To Repair, MTTR ( Not today ! ) <ul><li>Little’s Law: </li></ul><ul><ul><ul><li>average wait time = queue length / service rate </li></ul></ul></ul><ul><li>Similar to days accounts receivable or days of inventory; well understood by management and goaled at Cisco </li></ul><ul><li>Both unfixed and recent fixes affect the result </li></ul><ul><li>Integrate both queue length and service rate over 90 days </li></ul><ul><li>Ordinarily track all dispositions, not just fixes </li></ul><ul><li>Suitable for comparing products, teams, etc. </li></ul><ul><li>Retrospective trending can be done using historical data </li></ul>
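The Little's Law calculation on this slide can be sketched as follows. This is a minimal illustration, not the tracking tool used at Cisco: the function name `mttr_days`, the daily-snapshot inputs, and the sample numbers are all assumptions made for the example.

```python
def mttr_days(open_counts, closed_counts):
    """Little's Law estimate of the average wait, in days, over a window.

    open_counts:   daily snapshots of the queue length (unresolved defects).
    closed_counts: number of defects dispositioned on each of the same days.
    Both inputs are hypothetical; the slide only says that queue length and
    service rate are integrated over 90 days.
    """
    if not open_counts or len(open_counts) != len(closed_counts):
        raise ValueError("need two non-empty series of equal length")
    avg_queue = sum(open_counts) / len(open_counts)          # defects
    service_rate = sum(closed_counts) / len(closed_counts)   # defects per day
    if service_rate == 0:
        raise ValueError("no dispositions in the window")
    return avg_queue / service_rate


# Made-up 90-day window: 200 defects open each day, 5 dispositioned per day.
print(mttr_days([200] * 90, [5] * 90))  # 40.0
```

Because the queue length is in the numerator, old unfixed defects raise the figure even when recent fixes were fast, which is the sense in which both unfixed defects and recent fixes affect the result.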
  5. 5. Second approach: Measuring age at fix <ul><li>Closed bugs: age is interval from creation to fix </li></ul><ul><li>Open bugs: age is from creation to present </li></ul><ul><ul><ul><li>Not studied here; distribution may differ from Closed. </li></ul></ul></ul><ul><li>Average age of open or average age of closed can be erratic if there are outliers </li></ul><ul><li>Controlling variability depends on preventing outliers. </li></ul><ul><li>Data collection: pick a product and a range of time during which > 1000 defects were fixed. Determine the age of each defect at the time it was fixed. </li></ul><ul><li>We included only defects for which there was a fix, not other dispositions. </li></ul>
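The data collection step above could look like the sketch below. The record layout (a pair of creation and fix dates), the nearest-rank percentile rule, and the five sample records are assumptions for illustration; the slides do not say how the statistics were computed.

```python
import math
from datetime import date
from statistics import mean, median


def ages_at_fix(records):
    """Age in days of each fixed defect; records are (created, fixed) dates."""
    return [(fixed - created).days for created, fixed in records]


def percentile(values, pct):
    """Nearest-rank percentile for pct in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


# Hypothetical records; a real study would use more than 1000 fixed defects.
records = [
    (date(2007, 1, 1), date(2007, 1, 1)),    # fixed on the day it arrived
    (date(2007, 1, 1), date(2007, 1, 11)),
    (date(2007, 2, 1), date(2007, 3, 10)),
    (date(2007, 3, 1), date(2007, 6, 9)),
    (date(2007, 1, 15), date(2008, 1, 15)),  # one outlier
]
ages = ages_at_fix(records)
print(ages, median(ages), mean(ages), percentile(ages, 85))
```

Even in this tiny sample the single outlier pulls the mean well above the median, which is why an average age can be erratic.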
  6. 6. Comparison of MTTR and Age. The chart represents the methods, as practiced. Improvements to either method might remove their weaknesses. Today’s presentation uses the Age perspective. For MTTR perspective see Gokhale/Mullen, ISSRE-2006. <ul><li>Focus: MTTR, average; Age Distribution, outliers </li></ul><ul><li>Time Scale: MTTR, trend; Age Distribution, present </li></ul><ul><li>Tools: MTTR, analytic; Age Distribution, descriptive </li></ul><ul><li>Manage By: MTTR, numbers; Age Distribution, exceptions </li></ul>
  7. 7. One year, Severities 1-3, Linear plot <ul><li>Very skewed distribution </li></ul><ul><li>Median 37 days </li></ul><ul><li>Mean 81 </li></ul><ul><li>Std. dev. 147 </li></ul><ul><li>85%-ile 139 </li></ul>
  8. 8. One year, Severities 1-3, Log plot <ul><li>Same chart but Log scale. </li></ul><ul><li>Log chart shows distinct S curve </li></ul><ul><li>Lower counts for S1 yield relatively greater fluctuations. </li></ul><ul><li>Severe bugs (S1, S2) get faster service except for tail. </li></ul>
  9. 9. Lognormal provides excellent fit <ul><li>N >> 1000. In this case Age at Fix is visually identical with the Lognormal. </li></ul><ul><li>The lognormal is the most commonly used distribution in maintainability analysis because it is considered representative of the distribution of most repair times. MIL-HDBK-470A </li></ul><ul><li>Note for later --- fitted lognormal is slightly lower at the left edge </li></ul>
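  10. 10. Relationship between the mean and variance of the Log(age) and of the age itself <ul><li>Mean (Log(age)) = $\mu$ </li></ul><ul><li>Variance (Log(age)) = $\sigma^2$ </li></ul><ul><li>Median (Log(age)) = $\mu$ </li></ul><ul><li>Median (age) = $\exp(\mu)$ </li></ul><ul><li>Mean (age) = $\exp(\mu + \sigma^2/2)$ </li></ul><ul><li>Variance (age) = $\exp(2\mu + \sigma^2)\,(\exp(\sigma^2) - 1)$ </li></ul>Example values ($\mu$, $\sigma$: mean, stdev of age): <ul><li>3.0, 1.5: 62, 180 </li></ul><ul><li>3.0, 1.6: 72, 250 </li></ul><ul><li>3.0, 1.7: 85, 351 </li></ul><ul><li>2.5, 1.6: 44, 151 </li></ul><ul><li>3.0, 1.6: 72, 250 </li></ul><ul><li>3.5, 1.6: 119, 411 </li></ul>Further values, in slide order: <ul><li>$\mu$: 3.46, 3.30, 3.17, 2.35, 2.26, 2.15 </li></ul><ul><li>$\sigma$: 1.47, 1.50, 1.52, 1.66, 1.69, 1.70 </li></ul><ul><li>stdev / mean: 147 / 81, 140 / 73, 126 / 65, 128 / 37, 103 / 34, 77 / 31 </li></ul>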
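The standard lognormal relations between the parameters of Log(age) and the moments of age can be checked numerically. The sketch below uses the usual closed forms; the function name `lognormal_moments` is an assumption. The slide's example values include a mean of 62 and a stdev of 180, which correspond to a log-mean of 3.0 and a log-stdev of 1.5.

```python
import math


def lognormal_moments(mu, sigma):
    """Return (median, mean, stdev) of age when Log(age) ~ Normal(mu, sigma)."""
    median = math.exp(mu)
    mean = math.exp(mu + sigma ** 2 / 2)
    variance = math.exp(2 * mu + sigma ** 2) * (math.exp(sigma ** 2) - 1)
    return median, mean, math.sqrt(variance)


for mu, sigma in [(3.0, 1.5), (3.0, 1.6), (3.0, 1.7)]:
    median, mean, stdev = lognormal_moments(mu, sigma)
    print(f"mu={mu} sigma={sigma}: median={median:.0f} "
          f"mean={mean:.0f} stdev={stdev:.0f}")
```

With the log-mean held fixed, the median stays put while the mean and especially the standard deviation grow quickly with the log-stdev, so a small change in spread has a large effect on the average age.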
  11. 11. Why might the Ages be Lognormal? <ul><li>The Lognormal can be generated when a random variable is the product of other random variables , just as a Normal distribution can be generated by summing random variables. </li></ul><ul><li>Informally, the conditions are that the constituent random factors be substantially independent, that no one variable dominate the others, and that there be a large number of factors. </li></ul><ul><li>We propose a hypothetical model of the defect repair process including realistic multiplicative factors and approximating the mathematical conditions. </li></ul>
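The multiplicative argument can be written out in one line. The symbols below are assumptions introduced for illustration: $T$ is the repair time, $T_0$ a nominal baseline time, and $X_1, \dots, X_n$ the independent multiplicative factors.

```latex
T = T_0 \prod_{i=1}^{n} X_i
\quad\Longrightarrow\quad
\ln T = \ln T_0 + \sum_{i=1}^{n} \ln X_i ,
\qquad
\operatorname{Var}(\ln T) = \sum_{i=1}^{n} \operatorname{Var}(\ln X_i).
```

Taking logs turns the product into a sum, so under the stated conditions the Central Limit Theorem makes $\ln T$ approximately Normal, which is the same as saying $T$ is approximately Lognormal. The variance identity holds when the factors are independent.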
  12. 12. Seven hypothetical factors affecting resolution time. Each level has a probability (as percent; each factor totals 100) and a time multiplier, selected with the appropriate probability. There is a 4% Probability the Priority is P1, and if so the Time multiplier is .5, etc. <ul><li>PRIORITY: P1 4%, ×.5; P2 10%, ×1; P3 76%, ×2; P4 10%, ×4 </li></ul><ul><li>RESOURCES: Available 40%, ×1; Shared/Wait 30%, ×1.5; Remote 20%, ×2; Substitute 10%, ×3 </li></ul><ul><li>TOOLS: Specific 25%, ×.8; Workable 25%, ×.9; Inadequate 25%, ×1.4; None 25%, ×1.7 </li></ul><ul><li>SPEED: Superstar 10%, ×.5; Fast 40%, ×1; Average 40%, ×1.5; Slow 10%, ×3 </li></ul><ul><li>SKILLS: Practiced 20%, ×.6; Moderate 30%, ×.9; Minimal 30%, ×1.2; Novice 20%, ×1.7 </li></ul><ul><li>BUG CLARITY: Complete 25%, ×.5; Well Written 25%, ×.8; Oversights 40%, ×1.2; Misleading 10%, ×2 </li></ul><ul><li>DIFFICULTY: Obvious 10%, ×.5; Moderate 40%, ×1; Hard 40%, ×2; Subtle 10%, ×3 </li></ul>
  13. 13. Seven hypothetical factors affecting resolution time. Drawn from experience and COCOMO. <ul><li>PRIORITY: P1, P2, P3 </li></ul><ul><li>RESOURCES: Available, Shared/Wait, Remote, Substitute </li></ul><ul><li>TOOLS: Specific, Workable, Inadequate, None </li></ul><ul><li>SPEED: Superstar, Fast, Average, Slow </li></ul><ul><li>SKILLS: Practiced, Moderate, Minimal, Novice </li></ul><ul><li>BUG CLARITY: Complete, Well Written, Oversights, Misleading </li></ul><ul><li>DIFFICULTY: Obvious, Moderate, Hard, Subtle </li></ul>
  14. 14. Seven hypothetical factors: tentative distributions. There is a 2% Probability the Priority is P1, and if so the Time multiplier is .5, etc. For Severity and the other 6 dimensions there is a probability distribution of levels of difficulty. We model the distributions by a discrete distribution with 3 or 4 relative levels of difficulty, each with a given probability. Probabilities add to 1.0, i.e. 100%. For each factor, we know the variance of the log. The factors are grouped under Defect, Personnel, Support, and Process. Each entry below is value (prob.). <ul><li>Severity: 1 (0.02), 1.78 (0.19), 3.49 (0.79); Var. 0.04 </li></ul><ul><li>Clarity: 1 (0.20), 1.5 (0.30), 2.5 (0.30), 4 (0.20); Var. 0.22 </li></ul><ul><li>Difficulty: 0.5 (0.10), 1 (0.40), 2 (0.30), 3 (0.20); Var. 0.30 </li></ul><ul><li>Speed: 0.5 (0.05), 1 (0.45), 3 (0.30), 10 (0.20); Var. 0.87 </li></ul><ul><li>Skills: 0.6 (0.20), 0.9 (0.30), 1.2 (0.30), 1.7 (0.20); Var. 0.12 </li></ul><ul><li>Resources: 1 (0.40), 1.5 (0.30), 2 (0.20), 4 (0.10); Var. 0.18 </li></ul><ul><li>Tools: 0.8 (0.15), 0.9 (0.35), 1.1 (0.35), 1.4 (0.15); Var. 0.077 </li></ul>
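The "variance of the log" quoted for each factor can be computed directly from its discrete distribution. The sketch below is an illustration only: the function name `log_variance` is an assumption, and the Speed levels (multipliers 0.5, 1, 3, 10 with probabilities 0.05, 0.45, 0.30, 0.20) are as read from the slide's table.

```python
import math


def log_variance(levels):
    """Variance of ln(multiplier) for a factor given as (value, prob) pairs."""
    total = sum(prob for _, prob in levels)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("probabilities must add to 1.0")
    mean_log = sum(prob * math.log(value) for value, prob in levels)
    return sum(prob * (math.log(value) - mean_log) ** 2
               for value, prob in levels)


# Speed factor, as read from the slide (an assumption about the table layout).
speed = [(0.5, 0.05), (1, 0.45), (3, 0.30), (10, 0.20)]
print(round(log_variance(speed), 3))
```

Speed has by far the widest spread of multipliers, so it contributes the largest share of the total log-variance.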
  15. 15. Are seven factors enough to generate a lognormal? <ul><li>MONTE CARLO: randomly chose sets of 7 factors, based on their distributions </li></ul><ul><li>Summing variance of factors, we expect $\sigma$ = 1.372 </li></ul><ul><li>Monte Carlo yielded $\sigma$ = 1.36, no surprise </li></ul>
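A Monte Carlo check of this kind can be sketched as below. It is not the authors' simulation: the factor table is transcribed from the tentative distributions on the previous slide as best they can be read, and the function names, trial count, and seed are assumptions. The simulated and analytic values should agree closely with each other; whether they match the slide's figures exactly depends on how the table values are read.

```python
import math
import random
from statistics import pstdev

# (multiplier, probability) per level; transcription of the slide is assumed.
FACTORS = {
    "Severity":   [(1, 0.02), (1.78, 0.19), (3.49, 0.79)],
    "Clarity":    [(1, 0.20), (1.5, 0.30), (2.5, 0.30), (4, 0.20)],
    "Difficulty": [(0.5, 0.10), (1, 0.40), (2, 0.30), (3, 0.20)],
    "Speed":      [(0.5, 0.05), (1, 0.45), (3, 0.30), (10, 0.20)],
    "Skills":     [(0.6, 0.20), (0.9, 0.30), (1.2, 0.30), (1.7, 0.20)],
    "Resources":  [(1, 0.40), (1.5, 0.30), (2, 0.20), (4, 0.10)],
    "Tools":      [(0.8, 0.15), (0.9, 0.35), (1.1, 0.35), (1.4, 0.15)],
}


def analytic_sigma(factors):
    """Square root of the summed variances of ln(multiplier)."""
    total_var = 0.0
    for levels in factors.values():
        mean_log = sum(p * math.log(v) for v, p in levels)
        total_var += sum(p * (math.log(v) - mean_log) ** 2 for v, p in levels)
    return math.sqrt(total_var)


def simulate_log_times(factors, trials, rng):
    """Draw one level per factor and return ln of the product, per trial."""
    logs = []
    for _ in range(trials):
        log_time = 0.0
        for levels in factors.values():
            values = [v for v, _ in levels]
            weights = [p for _, p in levels]
            log_time += math.log(rng.choices(values, weights=weights)[0])
        logs.append(log_time)
    return logs


rng = random.Random(2008)  # arbitrary seed, for repeatability only
logs = simulate_log_times(FACTORS, 20000, rng)
print("analytic sigma :", round(analytic_sigma(FACTORS), 3))
print("simulated sigma:", round(pstdev(logs), 3))
```

Matching standard deviations only confirm that the variances add. Judging whether seven factors are enough for a lognormal shape would mean comparing a histogram of `logs` with a Normal curve.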
  16. 16. Data: number of defects fixed in N days or less <ul><li>For fitting models to data the defects were grouped in over 30 buckets representing ranges of ages </li></ul><ul><li>Age zero means bug fixed on day it arrived. </li></ul>
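Grouping ages into "fixed in N days or less" counts can be sketched as follows. The bucket limits and sample ages are made up; the slide says only that more than 30 buckets were used and does not list their boundaries.

```python
from bisect import bisect_right


def fixed_within(ages, day_limits):
    """Cumulative count of defects fixed in N days or less, for each limit N."""
    ordered = sorted(ages)
    return [bisect_right(ordered, limit) for limit in day_limits]


# Hypothetical ages in days; age 0 means fixed on the day the bug arrived.
ages = [0, 0, 1, 3, 7, 7, 30, 45, 120, 400]
# Hypothetical limits that double in width, which suits a log-scale plot.
limits = [0, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512]
print(list(zip(limits, fixed_within(ages, limits))))
```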
  17. 17. Nine product families
  18. 18. Models considered <ul><li>We have an explanation for why rates may be lognormal </li></ul><ul><li>But </li></ul><ul><ul><li>The fit near the origin is not quite right </li></ul></ul><ul><ul><li>The actual age at fix depends on other random conditions </li></ul></ul><ul><ul><li>We use the Laplace Transform (Miller-1985) to convert from rates to times. </li></ul></ul><ul><li>We compare three models </li></ul><ul><ul><li>Exponential (commonly used) </li></ul></ul><ul><ul><li>Lognormal (commonly used) </li></ul></ul><ul><ul><li>Laplace Transform of Lognormal </li></ul></ul>
  19. 19. Conversion from rates (LN) to times (LTLN) <ul><li>Doubly stochastic </li></ul><ul><li>Select rate from lognormal </li></ul><ul><li>Select time from exponential, given that rate. </li></ul>
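The two-step sampling described on this slide, and the distribution it implies, can be sketched as below. Given a rate r, the chance of a fix by time t is 1 - exp(-r t), so averaging over lognormal rates gives one minus the Laplace transform of the lognormal density evaluated at t. The function names, the midpoint-rule integration, and the parameter values are assumptions for illustration, not the fitting procedure used in the paper.

```python
import math
import random


def sample_ltln(mu, sigma, rng):
    """One repair time: draw a rate from a lognormal, then an exponential time."""
    rate = rng.lognormvariate(mu, sigma)   # fixes per day for this defect
    return rng.expovariate(rate)           # days until fix, given that rate


def ltln_cdf(t, mu, sigma, steps=2000, z_max=8.0):
    """P(time <= t) = 1 - E[exp(-rate * t)], rate = exp(mu + sigma * z).

    The expectation over z ~ Normal(0, 1) uses a plain midpoint rule.
    """
    width = 2 * z_max / steps
    survival = 0.0
    for i in range(steps):
        z = -z_max + (i + 0.5) * width
        density = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
        survival += density * math.exp(-t * math.exp(mu + sigma * z)) * width
    return 1.0 - survival


# Hypothetical parameters: median rate exp(-3) per day, log-stdev 1.6.
MU, SIGMA = -3.0, 1.6
rng = random.Random(1)
times = [sample_ltln(MU, SIGMA, rng) for _ in range(20000)]
empirical = sum(t <= 30 for t in times) / len(times)
print("simulated P(fix within 30 days):", round(empirical, 3))
print("integrated P(fix within 30 days):", round(ltln_cdf(30, MU, SIGMA), 3))
```

The simulated and integrated probabilities should agree to within sampling noise, which is a quick check that the sampler and the formula describe the same distribution.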
  20. 20. Comparing product families & models <ul><li>AIC = - 2 * log_likelihood + 2 * num_parameters </li></ul>
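The AIC formula above can be applied as in the sketch below, which compares an exponential fit and a lognormal fit by maximum likelihood. It simplifies what the slides describe: it works on individual positive ages rather than bucketed counts, it omits the Laplace-transformed lognormal, and it uses synthetic data. All function names are assumptions.

```python
import math
import random


def aic(log_likelihood, num_parameters):
    """AIC = -2 * log_likelihood + 2 * num_parameters; lower is better."""
    return -2 * log_likelihood + 2 * num_parameters


def exponential_loglik(ages):
    """Maximized log-likelihood of an exponential model (1 parameter)."""
    rate = len(ages) / sum(ages)               # MLE of the rate
    return len(ages) * math.log(rate) - rate * sum(ages)


def lognormal_loglik(ages):
    """Maximized log-likelihood of a lognormal model (2 parameters).

    Requires strictly positive ages, so age-zero defects need separate
    handling that this sketch does not attempt.
    """
    logs = [math.log(a) for a in ages]
    n = len(logs)
    mu = sum(logs) / n                             # MLE of mu
    var = sum((x - mu) ** 2 for x in logs) / n     # MLE of sigma squared
    return -sum(logs) - 0.5 * n * math.log(2 * math.pi * var) - 0.5 * n


# Synthetic ages drawn from a lognormal; the parameters are made up.
rng = random.Random(7)
ages = [rng.lognormvariate(3.0, 1.6) for _ in range(2000)]
print("AIC exponential:", round(aic(exponential_loglik(ages), 1), 1))
print("AIC lognormal  :", round(aic(lognormal_loglik(ages), 2), 1))
```

On data this skewed, the lognormal model should have the lower AIC despite its extra parameter, because the penalty of 2 per parameter is small next to the gain in log-likelihood.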
  21. 21. Effect of Age Distribution on Reliability <ul><li>Both $\mu$ and $\sigma$ affect the mean, and a larger mean time before fix means proportionately more incidents. </li></ul><ul><li>Any bias toward fixing high rate bugs faster will make more than proportional reduction in incidents. </li></ul><ul><li>By definition, Sev 1 and Sev 2 defects are the ones that affect reliability. Aggressive repair of them will make the biggest impact. </li></ul>
  22. 22. Implications for management <ul><li>Result: Factors surely multiplicative </li></ul><ul><li>Suggest: Estimate and manage factors </li></ul><ul><ul><ul><li>Training for novice engineers, or teaming </li></ul></ul></ul><ul><ul><ul><li>Tools for difficult problems </li></ul></ul></ul><ul><ul><ul><li>Documentation for difficult subsystems </li></ul></ul></ul><ul><li>Reduction of classification errors by training </li></ul><ul><ul><ul><li>Classification makes a difference </li></ul></ul></ul><ul><ul><ul><li>Tail on S1 distribution may be due to conversion of S2, even S3, to S1 after some aging. </li></ul></ul></ul>
  23. 23. Opportunities <ul><li>Can we make a combined model (occurrence, repair)? </li></ul><ul><ul><li>Repair times of defects are LT-Lognormal (PROMISE-2008) </li></ul></ul><ul><ul><li>Defect occurrence rates are Lognormal (ISSRE-1998) </li></ul></ul><ul><ul><li>Occurrence counts are Discrete Lognormal (ISSRE-2005) </li></ul></ul><ul><li>What is the typical range for sigma? How hard is it to change? </li></ul>
  24. 24. Other Lognormal Relationships <ul><li>Multiplicative Rates: Limiting Distribution = Lognormal </li></ul><ul><li>Trouble Tickets = Discrete-LN </li></ul><ul><li>SRGM = Cumulative Defects = Laplace Transform of LN </li></ul><ul><li>Test Strategy: Ten x the rare rates will find rare-rare interactions 100 times as fast. Equivalent to Heat/Power/Temp “corner testing” of HW. </li></ul><ul><li>Release Strategy: Is it ready? Which is best? </li></ul><ul><li>Repair Strategy: Risk vs. Benefit? </li></ul><ul><li>States, Usage, Code </li></ul><ul><li>Triggering Conditions (COMMON, UNCOMMON, RARE): Read, Open, Create; Local, Nearby, Distant; By book, User error, UBD; IO works, IO error, Removed; etc. </li></ul>
  25. 25. Further Reading <ul><li>MIL-HDBK-470A: Designing and Developing Maintainable Products and Systems, Lognormal Distribution, Aug 1997. (Lognormal is representative of most repair times.) </li></ul><ul><li>R. Mullen, Lognormal Distribution of Software Failure Rates: Origin and Evidence, ISSRE 1998. (re Central Limit Theorem and Lognormal.) </li></ul><ul><li>R. Mullen and S. Gokhale, Software Defect Rediscoveries: A Discrete Lognormal Model, ISSRE 2005. (Further references to Lognormal in SW.) </li></ul><ul><li>B. Schroeder and G. Gibson, A large-scale study of failures in high-performance-computing systems, CMU-PDL-05-112, Dec 2005. Later in DSN-2006. (Lognormal provides best fit for repair times.) </li></ul>
  26. 26. Thank you & Questions <ul><li>Bob Mullen </li></ul><ul><li>bomullen @ </li></ul>Swapna Gokhale ssg @