Presented at ISSRE 2007

- 1. Predicting Subsystem Failures using Dependency Graph Complexities • Thomas Zimmermann, University of Calgary, Canada • Nachiappan Nagappan, Microsoft Research, USA
- 2. Predicting Subsystem Defects using Dependency Graph Complexities • Thomas Zimmermann, University of Calgary, Canada • Nachiappan Nagappan, Microsoft Research, USA • search: ISSRE
- 8. Quality assurance is limited by time and by money.
- 9. Resource allocation • Spend resources on the components that need them most, i.e., those most likely to fail.
- 11. Meet Jacob • Your QA manager • Ten years of knowledge of your project • Aware of its history and the hot spots • Likes extreme sports
- 12. Meet Emily • Your new QA manager (replaces Jacob) • Not much experience with your project yet • How can she allocate resources effectively?
- 14. Indicators of failures • Code complexity ◦ Basili et al. 1996 ◦ Subramanyam and Krishnan 2003 ◦ Binkley and Schach 1998 ◦ Ohlsson and Alberg 1996 ◦ Nagappan et al. 2006 • Code churn ◦ Nagappan and Ball 2005 • Historical data ◦ Khoshgoftaar et al. 1996 ◦ Graves et al. 2000 ◦ Kim et al. 2007 ◦ Ostrand et al. 2005 ◦ Mockus et al. 2005 • Code dependencies ◦ Nagappan and Ball 2007
- 16. Windows Server 2003 • 2254 binaries • 28.4 MLOC
- 18. What are dependencies? • Dependency = (directed) relationship between two pieces of code • MaX dependency analysis framework ◦ Caller-callee dependencies ◦ Imports and exports ◦ RPC ◦ COM ◦ Runtime dependencies (such as LoadLibrary) ◦ Registry access ◦ etc.
- 26. Complexity of subsystems • Subsystem A vs. Subsystem B: which subsystem has more defects? • Our hypothesis: the more complex one.
- 28. Observation #1: Cycles (figure: binaries in dependency cycles vs. binaries with no dependency cycle) • Binaries that are part of a dependency cycle have on average twice as many defects.
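To make Observation #1 concrete, here is a minimal sketch of how one might flag the binaries that sit on a dependency cycle. It is not the MaX framework or the authors' tooling: the function name `binaries_in_cycles` and the binary names are hypothetical, and the dependency graph is assumed to be given as a list of directed `(source, target)` pairs.

```python
def binaries_in_cycles(edges):
    """Return the nodes that lie on at least one directed cycle.

    `edges` is a list of (source, target) pairs; this input format is an
    assumption made for illustration only.
    """
    successors = {}
    for source, target in edges:
        successors.setdefault(source, set()).add(target)

    def reaches_itself(start):
        # Iterative depth-first search over everything reachable from start.
        stack = list(successors.get(start, ()))
        seen = set()
        while stack:
            node = stack.pop()
            if node == start:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(successors.get(node, ()))
        return False

    # Only nodes with outgoing edges can be on a cycle.
    return {node for node in successors if reaches_itself(node)}


# Hypothetical binaries: a, b and c depend on each other in a loop; d does not.
example_edges = [
    ("a.dll", "b.dll"),
    ("b.dll", "c.dll"),
    ("c.dll", "a.dll"),
    ("c.dll", "d.dll"),
]
cyclic = binaries_in_cycles(example_edges)
```

The search is repeated per node, which is fine for a sketch; a strongly-connected-components algorithm would be the usual choice for a graph with thousands of binaries.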
- 31. Observation #2: Cliques • Average number of defects is higher for binaries in large cliques.
- 34. Data collection (figure) • Defects
- 35. Dependency graphs • What is the dependency graph of a subsystem?
- 36. Dependency graphs • INTRA = internal dependencies
- 37. Dependency graphs • OUT = outgoing dependencies
- 38. Dependency graphs • DEP = “neighborhood” = INTRA + OUT + more
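The INTRA and OUT graphs defined above can be sketched as a simple edge filter. This is an illustration under assumptions, not the authors' implementation: `split_dependencies` and the binary names are hypothetical, and edges are assumed to be directed `(source, target)` pairs. DEP is left out on purpose, because the slide does not spell out what "more" covers.

```python
def split_dependencies(edges, subsystem):
    """Split directed edges into the INTRA and OUT sets of one subsystem.

    INTRA: both endpoints belong to the subsystem.
    OUT:   the source belongs to the subsystem, the target does not.
    """
    members = set(subsystem)
    intra = [(s, t) for s, t in edges if s in members and t in members]
    out = [(s, t) for s, t in edges if s in members and t not in members]
    return intra, out


# Hypothetical subsystem made of two binaries.
all_edges = [
    ("ui.dll", "core.dll"),     # inside the subsystem
    ("core.dll", "kernel.dll"), # leaves the subsystem
    ("net.dll", "core.dll"),    # incoming edge, neither INTRA nor OUT
]
intra_edges, out_edges = split_dependencies(all_edges, ["ui.dll", "core.dll"])
```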
- 39. Complexity measures • #Nodes: |V| • #Edges: |E| • Complexity: |E| - |V| + |P| • Density: |E| / |V|² • Degree • Multiplicity • Eccentricity • Radius • Diameter
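A few of these measures can be computed directly from a node set and an edge list, as in the sketch below. Several points are assumptions rather than facts taken from the slides: |P| is read as the number of connected components (ignoring edge direction), a node's degree counts incoming plus outgoing edges, and the function name `graph_measures` is hypothetical. Multiplicity, eccentricity, radius and diameter are not covered here.

```python
def graph_measures(nodes, edges):
    """Compute basic complexity measures for a non-empty directed graph.

    Assumptions: every edge endpoint is in `nodes`, and |P| is the number
    of connected components when edge direction is ignored.
    """
    nodes = set(nodes)
    edges = set(edges)  # simple graph: repeated edges collapse into one
    neighbors = {n: set() for n in nodes}
    degree = {n: 0 for n in nodes}
    for source, target in edges:
        neighbors[source].add(target)
        neighbors[target].add(source)
        degree[source] += 1
        degree[target] += 1

    # Count connected components with a depth-first search.
    unseen = set(nodes)
    components = 0
    while unseen:
        components += 1
        stack = [unseen.pop()]
        while stack:
            current = stack.pop()
            for other in neighbors[current]:
                if other in unseen:
                    unseen.remove(other)
                    stack.append(other)

    v, e = len(nodes), len(edges)
    return {
        "NODES": v,
        "EDGES": e,
        "COMPLEXITY": e - v + components,  # |E| - |V| + |P|
        "DENSITY": e / (v * v),            # |E| / |V|^2
        "DEGREE_MIN": min(degree.values()),
        "DEGREE_MAX": max(degree.values()),
        "DEGREE_AVG": sum(degree.values()) / v,
    }


# Hypothetical four-node graph with one cycle a -> b -> c -> a and a tail c -> d.
measures = graph_measures(
    ["a", "b", "c", "d"],
    [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")],
)
```

For this toy graph the single cycle shows up as a complexity of 1, which matches the intuition that each independent cycle adds one to |E| - |V| + |P|.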
- 46. Spearman correlations (table: dependency graphs × complexity measures)
- 48. Predicting failures • INTRA, OUT, DEP, COMBINED • NODES, EDGES, COMPLEXITY, DENSITY, DEGREE_MIN, DEGREE_MAX, DEGREE_AVG, ECCENTRICITY_MIN, ECCENTRICITY_MAX, ECCENTRICITY_AVG • MULTI_EDGES, MULTI_COMPLEXITY, MULTI_DENSITY, MULTI_DEGREE_MIN, MULTI_DEGREE_MAX, MULTI_DEGREE_AVG, MULTI_MULTIPLICITY_MIN, MULTI_MULTIPLICITY_MAX, MULTI_MULTIPLICITY_AVG, MULTI_ECCENTRICITY_MIN, MULTI_ECCENTRICITY_MAX, MULTI_ECCENTRICITY_AVG
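- 54. Ranking

  | Rank | Subsystem | Actual Rank |
  |------|-----------|-------------|
  | 1    | K         | 3           |
  | 2    | L         | 95          |
  | 3    | C         | 6           |
  | 4    | G         | 2           |
  | 5    | F         | 8           |
  | 6    | A         | 3           |
  | 7    | Y         | 12          |
  | 8    | O         | 1           |
  | 9    | B         | 18          |
  | 10   | M         | 35          |

  ... (many more) • Spearman correlation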
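The comparison between predicted and actual ranks can be sketched with a small, dependency-free Spearman implementation: rank both lists (ties share their average rank), then take the Pearson correlation of the ranks. The helper names `average_ranks` and `spearman` are hypothetical. The ten rows used below are only the ones visible on the slide, so the value they produce is not the result reported in the talk.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# The ten rows shown on the slide (the full ranking has many more).
predicted_rank = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
actual_rank = [3, 95, 6, 2, 8, 3, 12, 1, 18, 35]
rho = spearman(predicted_rank, actual_rank)
```

A value near 1 would mean the predicted order closely follows the actual order; a value near 0 would mean the prediction carries little ranking information.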
- 55. Random splits
- 60. Linear regression • A higher predicted rank corresponds to a higher observed rank.
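The random-split and linear-regression steps can be sketched as follows. Everything specific here is an assumption: the two-thirds training fraction, the single predictor, the helper names `random_split` and `fit_line`, and above all the data, which is synthetic (defects = 2 × complexity + 1) and chosen only so that the fit is easy to check. None of it comes from the Windows Server 2003 study.

```python
import random


def random_split(rows, train_fraction, rng):
    """Shuffle a copy of rows and cut it into training and testing parts."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic (complexity, defects) pairs, made up for illustration only.
rows = [(x, 2 * x + 1) for x in range(1, 10)]

rng = random.Random(0)  # fixed seed so the split is reproducible
train, test = random_split(rows, 2 / 3, rng)
slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])

# Predict defects for the held-out subsystems, then rank them by prediction.
predictions = [(x, slope * x + intercept) for x, _ in test]
ranking = sorted(predictions, key=lambda pair: pair[1], reverse=True)
```

Repeating the split with different seeds and comparing the predicted ranking against the held-out ranking each time gives a sense of how stable the model is.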
- 63. Impact of granularity • The predictions are more reliable for coarse granularities, at the cost of locality and stability.
- 65. Future work • Assemble the pieces of the puzzle • Evolution of dependencies: Are churned dependencies better predictors? • Development process: What’s the impact of, say, global development? • Human and social factors
- 66. Conclusion • Cycles and cliques correlate with defects. • The complexity of the dependency structure predicts the number of defects. • Defect predictions help to allocate resources for QA more effectively. Slides on Slideshare.net (search for ISSRE)
- 67. Contact • Email: tz@acm.org, nachin@microsoft.com • Internet: www.softevo.org, research.microsoft.com/esm