Predicting Subsystem Defects using Dependency Graph Complexities
 


Presented at ISSRE 2007

    Presentation Transcript

    • Predicting Subsystem Failures using Dependency Graph Complexities Thomas Zimmermann, University of Calgary, Canada Nachiappan Nagappan, Microsoft Research, USA
    • Bugs are everywhere
    • Quality assurance is limited... ...by time... ...and by money.
    • Resource allocation: Spend resources on the components that need them most, i.e., those most likely to fail.
    • Meet Jacob • Your QA manager • Ten years' knowledge of your project • Aware of its history and the hot spots • Likes extreme sports
    • Meet Emily • Your new QA manager (replaces Jacob) • Not much experience with your project yet • How can she allocate resources effectively?
    • Indicators of failures
      • Code complexity: Basili et al. 1996; Subramanyam and Krishnan 2003; Binkley and Schach 1998; Ohlsson and Alberg 1996; Nagappan et al. 2006
      • Code churn: Nagappan and Ball 2005
      • Historical data: Khoshgoftaar et al. 1996; Graves et al. 2000; Kim et al. 2007; Ostrand et al. 2005; Mockus et al. 2005
      • Code dependencies: Nagappan and Ball 2007
    • Windows Server 2003: 2254 binaries, 28.4 MLOC
    • What are dependencies?
      • Dependency = (directed) relationship between two pieces of code
      • MaX dependency analysis framework
        • Caller-callee dependencies
        • Imports and exports
        • RPC
        • COM
        • Runtime dependencies (such as LoadLibrary)
        • Registry access
        • etc.
    • Windows Server layout
    • Complexity of subsystems: Subsystem A vs. Subsystem B. Which subsystem has more defects? Our hypothesis: the more complex one.
    • Observation #1: Cycles. Binaries that are part of a dependency cycle have on average twice as many defects as binaries that are not.
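One way to find the binaries that sit on a dependency cycle is to check, for each node, whether it can reach itself again. The following is a minimal sketch, assuming the dependency graph is given as a list of (source, target) pairs; the function name `nodes_in_cycles` is hypothetical, and this is not the MaX framework's own analysis.

```python
from collections import defaultdict, deque


def nodes_in_cycles(edges):
    """Return the set of nodes that lie on at least one directed cycle."""
    succ = defaultdict(set)
    for source, target in edges:
        succ[source].add(target)

    in_cycle = set()
    for start in list(succ):
        # Breadth-first search from the successors of `start`;
        # reaching `start` again means it is part of a cycle.
        seen = set()
        queue = deque(succ[start])
        while queue:
            node = queue.popleft()
            if node == start:
                in_cycle.add(start)
                break
            if node in seen:
                continue
            seen.add(node)
            queue.extend(succ.get(node, ()))
    return in_cycle
```

This runs one search per node, which is simple but quadratic in the worst case; a strongly-connected-components algorithm would scale better on a graph with thousands of binaries.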
    • Observation #2: Cliques. The average number of defects is higher for binaries in large cliques.
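Cliques can be enumerated with the Bron-Kerbosch algorithm. The sketch below assumes that edge direction is ignored when looking for cliques and that self-dependencies are dropped; both are assumptions for illustration, as the slides do not say how cliques were computed. The name `maximal_cliques` is hypothetical.

```python
from collections import defaultdict


def maximal_cliques(edges):
    """Enumerate maximal cliques of the undirected version of a graph."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # assumption: ignore self-dependencies
            adj[u].add(v)
            adj[v].add(u)

    cliques = []

    def expand(current, candidates, excluded):
        # Basic Bron-Kerbosch recursion without pivoting.
        if not candidates and not excluded:
            cliques.append(current)
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adj), set())
    return cliques
```

Grouping binaries by the size of the largest clique they belong to would then allow comparing average defect counts across clique sizes.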
    • Data collection [figure: defects]
    • Dependency graphs: What is the dependency graph of a subsystem?
      • INTRA = internal dependencies
      • OUT = outgoing dependencies
      • DEP = "neighborhood" = INTRA + OUT + more
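The INTRA and OUT graphs can be illustrated by filtering the full edge list against a subsystem's set of binaries. This is a minimal sketch, assuming edges are (source, target) pairs and a subsystem is a collection of binary names; `split_dependencies` is a hypothetical helper. DEP is left out because the slide does not spell out what "more" covers.

```python
def split_dependencies(edges, subsystem):
    """Split edges into internal (INTRA) and outgoing (OUT) dependencies."""
    members = set(subsystem)
    # INTRA: both endpoints belong to the subsystem.
    intra = [(u, v) for u, v in edges if u in members and v in members]
    # OUT: the dependency starts inside the subsystem and leaves it.
    out = [(u, v) for u, v in edges if u in members and v not in members]
    return intra, out
```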
    • Complexity measures
      • Number of nodes: |V|
      • Number of edges: |E|
      • Multiplicity
      • Complexity: |E| - |V| + |P|
      • Degree
      • Density: |E| / |V|^2
      • Eccentricity
      • Radius
      • Diameter
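Most of these measures can be computed directly from an edge list. The sketch below makes several assumptions that the slide does not confirm: |P| is taken as the number of weakly connected components, degree counts incoming plus outgoing edges, and eccentricity is the longest shortest path from a node to any node it can reach. Multiplicity is omitted because it needs a multigraph. The name `graph_measures` is hypothetical.

```python
from collections import defaultdict, deque


def graph_measures(edges):
    """Compute basic complexity measures of a directed dependency graph."""
    edge_set = set(edges)
    nodes = {n for edge in edge_set for n in edge}
    if not nodes:
        raise ValueError("graph has no edges")

    succ = defaultdict(set)       # directed adjacency
    neighbors = defaultdict(set)  # undirected adjacency
    degree = dict.fromkeys(nodes, 0)
    for u, v in edge_set:
        succ[u].add(v)
        neighbors[u].add(v)
        neighbors[v].add(u)
        degree[u] += 1
        degree[v] += 1

    # |P|: number of weakly connected components (assumption).
    unseen = set(nodes)
    components = 0
    while unseen:
        components += 1
        queue = deque([unseen.pop()])
        while queue:
            node = queue.popleft()
            for nxt in neighbors[node]:
                if nxt in unseen:
                    unseen.remove(nxt)
                    queue.append(nxt)

    def eccentricity(start):
        # Longest shortest path from `start` to any reachable node.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in succ.get(node, ()):
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        return max(dist.values())

    ecc = [eccentricity(n) for n in nodes]
    v, e = len(nodes), len(edge_set)
    return {
        "nodes": v,
        "edges": e,
        "complexity": e - v + components,
        "density": e / (v * v),
        "degree_min": min(degree.values()),
        "degree_max": max(degree.values()),
        "degree_avg": sum(degree.values()) / v,
        "radius": min(ecc),
        "diameter": max(ecc),
    }
```

Under these definitions a binary with no outgoing dependencies has eccentricity 0, so the radius of a graph with a sink node is 0.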
    • Spearman correlations [table: dependency graphs × complexity measures]
    • Predicting failures
      • Predictors: NODES, EDGES, COMPLEXITY, DENSITY, DEGREE_MIN, DEGREE_MAX, DEGREE_AVG, ECCENTRICITY_MIN, ECCENTRICITY_MAX, ECCENTRICITY_AVG, MULTI_EDGES, MULTI_COMPLEXITY, MULTI_DENSITY, MULTI_DEGREE_MIN, MULTI_DEGREE_MAX, MULTI_DEGREE_AVG, MULTI_MULTIPLICITY_MIN, MULTI_MULTIPLICITY_MAX, MULTI_MULTIPLICITY_AVG, MULTI_ECCENTRICITY_MIN, MULTI_ECCENTRICITY_MAX, MULTI_ECCENTRICITY_AVG
      • Models: INTRA, OUT, DEP, COMBINED
    • Ranking (compared using Spearman correlation)

      Rank  Subsystem  Actual Rank
      1     K          3
      2     L          95
      3     C          6
      4     G          2
      5     F          8
      6     A          3
      7     Y          12
      8     O          1
      9     B          18
      10    M          35
      ...   (many more)
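Spearman correlation is the Pearson correlation of the two rank vectors. The sketch below implements it in plain Python, giving tied values the average of their ranks; the function names are hypothetical. The sample data is only the ten rows visible on the slide, so the resulting coefficient says nothing about the full ranking.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# The ten rows shown on the slide (predicted rank, actual rank).
predicted = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
actual = [3, 95, 6, 2, 8, 3, 12, 1, 18, 35]
rho = spearman(predicted, actual)
```

A value near 1 means the predicted order closely matches the observed order; a value near 0 means the two orders are unrelated.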
    • Random splits 4×50×
    • Linear regression: A higher predicted rank corresponds to a higher observed rank.
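The evaluation idea of repeated random splits followed by linear regression can be sketched as follows. This is a simplified illustration, not the authors' setup: it uses a single predictor, while the slides list many, and the training fraction, repetition count, and function names are assumptions.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def random_split_predictions(xs, ys, train_fraction=2 / 3, repeats=50, seed=0):
    """Fit on a random training part and predict the held-out part, repeatedly.

    train_fraction and repeats are assumed defaults, not values from the slides.
    """
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    cut = int(len(indices) * train_fraction)
    results = []
    for _ in range(repeats):
        rng.shuffle(indices)
        train, test = indices[:cut], indices[cut:]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        predicted = [slope * xs[i] + intercept for i in test]
        observed = [ys[i] for i in test]
        results.append((predicted, observed))
    return results
```

Each (predicted, observed) pair can then be compared by rank correlation to see whether subsystems predicted to have more defects actually do.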
    • Impact of granularity: The predictions are more reliable for coarse granularities, at the cost of locality and stability.
    • Future work • Assemble the pieces of the puzzle • Evolution of dependencies: Are churned dependencies better predictors? • Development process: What's the impact of, say, global development? • Human and social factors
    • Conclusion • Cycles and cliques correlate with defects. • The complexity of the dependency structure predicts the number of defects. • Defect predictions help to allocate resources for QA more effectively. Slides on Slideshare.net (search for ISSRE)
    • Contact Email: tz@acm.org nachin@microsoft.com Internet: www.softevo.org research.microsoft.com/esm