26th International Conference on Software Engineering (ICSE), Edinburgh, 28.05.2004




Mining Version Histories
to Guide Software Changes

Thomas Zimmermann
(with Peter Weißgerber, Stephan Diehl, and Andreas Zeller)
Lehrstuhl Softwaretechnik
Universität des Saarlandes, Saarbrücken
Extending ECLIPSE Preferences                      1/10


Your task: Extend ECLIPSE with a new preference.

Preferences are stored in field fKeys[]:
Extending ECLIPSE Preferences              2/10


What else do you need to change?

Which of the 27,000 files
             20,000 classes
             200,000 methods of ECLIPSE?

Program analysis.
   fKeys[] and initDefaults() use the same variables.

    – Usage does not induce change.
    – Usage can be detected only within program code.
      ECLIPSE has 12,000 non-JAVA files

Learning from history.
   Programmers who changed fKeys[] also changed…
Guiding the Programmer     3/10




 A) The user inserts a new preference into the field fKeys[].

 B) ROSE suggests locations for further changes,
    e.g. the function initDefaults().
From CVS to Transactions                                     4/10


The ECLIPSE CVS archive has more than 47,000 transactions.
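
The slide does not say how transactions are recovered from the archive. CVS records each file revision separately, so check-ins that belong together have to be regrouped. A minimal sketch of one plausible heuristic, assuming that check-ins with the same author and log message that follow each other within a fixed time window form one transaction. The `Checkin` record, the `group_transactions` helper, the sample data, and the 200-second window are all illustrative assumptions, not taken from the slides:

```python
from collections import namedtuple

# Hypothetical record for one per-file check-in (not a real CVS API).
Checkin = namedtuple("Checkin", "author message timestamp entity")

def group_transactions(checkins, window=200):
    """Group check-ins that share author and log message and follow
    the previous check-in of that group within `window` seconds."""
    transactions = []
    open_tx = {}  # (author, message) -> (last timestamp, entity set)
    for c in sorted(checkins, key=lambda c: c.timestamp):
        key = (c.author, c.message)
        last = open_tx.get(key)
        if last is not None and c.timestamp - last[0] <= window:
            # Same author and message, close in time: extend the transaction.
            last[1].add(c.entity)
            open_tx[key] = (c.timestamp, last[1])
        else:
            # Otherwise start a new transaction.
            entities = {c.entity}
            transactions.append(entities)
            open_tx[key] = (c.timestamp, entities)
    return transactions

# Made-up sample data for illustration only.
checkins = [
    Checkin("alice", "add preference", 0, "fKeys[]"),
    Checkin("alice", "add preference", 30, "initDefaults()"),
    Checkin("alice", "add preference", 50, "plugin.properties"),
    Checkin("bob", "fix typo", 60, "README"),
    Checkin("alice", "add preference", 5000, "fKeys[]"),
]
transactions = group_transactions(checkins)
```

With this sample, the first three check-ins end up in one transaction, while the late check-in at 5000 seconds starts a new one.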
Mining Association Rules                                                       5/10


ROSE takes all transactions as input:

     T42 = { fKeys[], initDefaults(), …, plugin.properties, … }
    T752 = { fKeys[], initDefaults(), …, plugin.properties, … }
   T9872 = { fKeys[], initDefaults(), …, plugin.properties, … }
  T11386 = { fKeys[], initDefaults(), … }
  T20814 = { fKeys[], initDefaults(), …, plugin.properties, … }
  T30989 = { fKeys[], initDefaults(), …, plugin.properties, … }
  T41999 = { fKeys[], initDefaults(), …, plugin.properties, … }
  T47423 = { fKeys[], initDefaults(), …, plugin.properties, … }
     ...

ROSE mines association rules from these transactions:

           { fKeys[], initDefaults() } ⇒ { plugin.properties }
                 [Support 7, Confidence 7/8 = 0.875]
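
The support and confidence figures follow directly from the eight listed transactions. A small sketch of the arithmetic, assuming support is counted as the absolute number of transactions containing both sides of the rule (as the slide's "Support 7" suggests); the helper name `support_and_confidence` is illustrative, and the elided entities ("…") are simply left out:

```python
# The eight transactions listed above; elided entities are omitted.
CORE = {"fKeys[]", "initDefaults()"}
transactions = {
    "T42": CORE | {"plugin.properties"},
    "T752": CORE | {"plugin.properties"},
    "T9872": CORE | {"plugin.properties"},
    "T11386": set(CORE),  # the only one without plugin.properties
    "T20814": CORE | {"plugin.properties"},
    "T30989": CORE | {"plugin.properties"},
    "T41999": CORE | {"plugin.properties"},
    "T47423": CORE | {"plugin.properties"},
}

def support_and_confidence(transactions, antecedent, consequent):
    """Support = transactions containing antecedent and consequent;
    confidence = support / transactions containing the antecedent."""
    with_antecedent = [t for t in transactions if antecedent <= t]
    support = sum(1 for t in with_antecedent if consequent <= t)
    confidence = support / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence

support, confidence = support_and_confidence(
    transactions.values(), CORE, {"plugin.properties"}
)
print(support, confidence)  # 7 0.875
```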
Effective Mining                                                   6/10


The classical association mining approach is to mine all rules:

 – Helpful in understanding general patterns.
 – Requires high support thresholds (>2^n possible rules).
 – Takes time to compute (3 days and more).

Alternative — mine only matching rules on demand:

Constraints on antecedent. Mine only rules which are related
  to the situation Σ, e.g. Σ ⇒ X
Single consequent rules. Mine only rules which have a
   singleton as consequent, e.g. Σ ⇒ {x}

Average runtime of a query: 0.5 seconds.
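
The two restrictions can be sketched as a single pass over the transactions. This is only an illustration of the idea, not ROSE's implementation: the function name `mine_on_demand`, its threshold parameters, and the sample data are assumptions.

```python
from collections import Counter

def mine_on_demand(transactions, situation, min_support=1, min_confidence=0.0):
    """Mine only rules  situation => {x}  with a single consequent x.

    Returns (x, support, confidence) tuples, highest confidence first.
    """
    situation = frozenset(situation)
    # Constraint on antecedent: only transactions containing the situation.
    matching = [t for t in transactions if situation <= t]
    # Single consequents: count every other entity changed alongside it.
    counts = Counter(x for t in matching for x in t - situation)
    rules = [
        (x, n, n / len(matching))
        for x, n in counts.items()
        if n >= min_support and n / len(matching) >= min_confidence
    ]
    rules.sort(key=lambda r: (-r[2], -r[1], r[0]))
    return rules

# Made-up history: 7 transactions touch all three entities, 1 only two,
# and 1 is unrelated to the situation.
history = (
    [{"fKeys[]", "initDefaults()", "plugin.properties"}] * 7
    + [{"fKeys[]", "initDefaults()"}]
    + [{"README"}]
)
rules = mine_on_demand(history, {"fKeys[]"})
```

Because only transactions that contain the situation are inspected, the cost of a query depends on the history size rather than on the number of possible rules.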
Precision vs. Recall                                                7/10


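[Figure: two overlapping sets, "What ROSE finds" and "What it should find". The overlap is the correct prediction; the rest of "What ROSE finds" is false positives, the rest of "What it should find" is false negatives.]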


Precision. How many of the returned entities are relevant?
   High precision = few false positives
Recall. How many relevant entities are returned?
   High recall = few false negatives
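
As a worked example, both measures can be computed from the set of suggested entities and the set of entities that actually changed. The entity names other than initDefaults() and plugin.properties are made up, and treating an empty set as a score of 1.0 is an assumption of this sketch:

```python
def precision_recall(suggested, changed):
    """Precision and recall of a set of suggestions."""
    suggested, changed = set(suggested), set(changed)
    correct = suggested & changed
    # Assumed convention: nothing suggested means no false positives,
    # nothing expected means no false negatives.
    precision = len(correct) / len(suggested) if suggested else 1.0
    recall = len(correct) / len(changed) if changed else 1.0
    return precision, recall

suggested = {"initDefaults()", "plugin.properties", "build.xml"}
changed = {"initDefaults()", "plugin.properties", "doc.html", "Foo.java"}
precision, recall = precision_recall(suggested, changed)
# 2 of 3 suggestions are correct; 2 of 4 changed entities are found.
```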
Evaluation                                                       8/10


The programmer has changed one single entity.
Can ROSE suggest other entities that should be changed?

Granularity   Entities                  Files
Project       Recall  Precision   Top3  Recall  Precision   Top3
ECLIPSE         0.15       0.26   0.53    0.17       0.26   0.54
GCC             0.28       0.39   0.89    0.44       0.42   0.87
GIMP            0.12       0.25   0.91    0.27       0.26   0.90
JBOSS           0.16       0.38   0.69    0.25       0.37   0.64
JEDIT           0.07       0.16   0.52    0.25       0.22   0.68
KOFFICE         0.08       0.17   0.46    0.24       0.26   0.67
POSTGRES        0.13       0.23   0.59    0.23       0.24   0.68
PYTHON          0.14       0.24   0.51    0.24       0.36   0.60
Average         0.15       0.26   0.64    0.26       0.30   0.70


ROSE predicts 15% of all changed entities (files: 26%).
In 64% of all transactions, ROSE’s topmost three suggestions
contain a correct entity (files: 70%).
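
The Top3 column can be read as a hit rate over queries. A hedged sketch of that reading, assuming each query yields a ranked list of suggestions and a set of entities that were actually changed; the function name `top3_rate` and the sample queries are illustrative, and the exact evaluation protocol is not given on the slide:

```python
def top3_rate(queries):
    """Fraction of queries whose three topmost suggestions contain
    at least one entity that was actually changed."""
    hits = sum(
        1 for ranked, changed in queries
        if set(ranked[:3]) & set(changed)
    )
    return hits / len(queries) if queries else 0.0

# Made-up queries: (ranked suggestions, entities actually changed).
queries = [
    (["a", "b", "c", "d"], {"c"}),  # hit at rank 3
    (["a", "b", "c", "d"], {"d"}),  # correct entity only at rank 4: miss
    (["x"], {"x", "y"}),            # hit at rank 1
    ([], {"z"}),                    # no suggestions: miss
]
rate = top3_rate(queries)
```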
Challenges                                               9/10


Further Data Sources.
   Test outcomes, Mailing lists, Newsgroups, Chat logs
   How do we leverage these sources?
Further Analyses.
   Program analysis, Sequence analysis, Clustering
   How do we integrate different analyses?
From Locations to Actions.
   You have extended fKeys[] with UI_SPLINES;
   ROSE suggests:
         Insert store.setDefaults(UI_SPLINES, false);
         in function initDefaults();
   The user can accept this at the touch of one button.
   How much can we learn from history?
Conclusion                                                     10/10


– ROSE detects coupling between non-program entities
  (e.g. programs and documentation).
– ROSE effectively guides users along related changes.
– In 64% of all transactions, ROSE’s topmost three
  suggestions contain a correct entity (files: 70%).
– Research has just begun to exploit non-program artefacts:
    – Similar results by A. Ying (2004); A. Hassan (2004);
      and J. Sayyad-Shirabad (2003).
    – ICSE Workshop on Mining Software Repositories, 2004.

– ROSE will be available as an ECLIPSE plug-in in Fall 2004:
          http://www.st.cs.uni-sb.de/softevo/

Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 

Recently uploaded (20)

Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 

Mining Version Histories to Guide Software Changes

  • 1. 0/10 26th International Conference on Software Engineering (ICSE), Edinburgh, 28.05.2004 Mining Version Histories to Guide Software Changes Thomas Zimmermann (with Peter Weißgerber, Stephan Diehl, and Andreas Zeller) Lehrstuhl Softwaretechnik Universität des Saarlandes, Saarbrücken
  • 3. Extending ECLIPSE Preferences 1/10 Your task: Extend ECLIPSE with a new preference. Preferences are stored in field fKeys[]:
  • 6. Extending ECLIPSE Preferences 2/10 What else do you need to change? Which of the 27,000 files 20,000 classes 200,000 methods of ECLIPSE? Program analysis. fKeys[] and initDefaults() use the same variables. – Usage does not induce change. – Usage can be detected only within program code. ECLIPSE has 12,000 non-JAVA files Learning from history. Programmers who changed fKeys[] also changed…
  • 7. Guiding the Programmer 3/10 A) The user inserts a new preference into the field fKeys[] B) ROSE suggests locations for further changes, e.g. the function initDefaults()
  • 8. From CVS to Transactions 4/10 The ECLIPSE CVS archive has more than 47,000 transactions.
  • 11. Mining Association Rules 5/10
    ROSE takes all transactions as input:
      T42    = { fKeys[], initDefaults(), …, plugin.properties, … }
      T752   = { fKeys[], initDefaults(), …, plugin.properties, … }
      T9872  = { fKeys[], initDefaults(), …, plugin.properties, … }
      T11386 = { fKeys[], initDefaults(), … }
      T20814 = { fKeys[], initDefaults(), …, plugin.properties, … }
      T30989 = { fKeys[], initDefaults(), …, plugin.properties, … }
      T41999 = { fKeys[], initDefaults(), …, plugin.properties, … }
      T47423 = { fKeys[], initDefaults(), …, plugin.properties, … }
      . . .
    ROSE mines association rules from these transactions:
      { fKeys[], initDefaults() } ⇒ { plugin.properties } [Support 7, Confidence 7/8 = 0.875]
  • 13. Effective Mining 6/10
    The classical association mining approach is to mine all rules:
    – Helpful in understanding general patterns.
    – Requires high support thresholds (>2^n possible rules).
    – Takes time to compute (3 days and more).
    Alternative: mine only matching rules on demand:
    – Constraints on antecedent. Mine only rules which are related to the situation Σ, e.g. Σ ⇒ X
    – Single consequent rules. Mine only rules which have a singleton as consequent, e.g. Σ ⇒ {x}
    Average runtime of a query: 0.5 seconds.
  • 14. Precision vs. Recall 7/10
    [Diagram: "What ROSE finds" overlapping "What it should find"; regions labelled false positives, correct prediction, false negatives]
    Precision: How many of the returned entities are relevant? High precision = few false positives.
    Recall: How many relevant entities are returned? High recall = few false negatives.
  • 16. Evaluation 8/10
    The programmer has changed one single entity. Can ROSE suggest other entities that should be changed?

    Granularity:       Entities                   Files
    Project      Recall  Precision  Top3    Recall  Precision  Top3
    ECLIPSE      0.15    0.26       0.53    0.17    0.26       0.54
    GCC          0.28    0.39       0.89    0.44    0.42       0.87
    GIMP         0.12    0.25       0.91    0.27    0.26       0.90
    JBOSS        0.16    0.38       0.69    0.25    0.37       0.64
    JEDIT        0.07    0.16       0.52    0.25    0.22       0.68
    KOFFICE      0.08    0.17       0.46    0.24    0.26       0.67
    POSTGRES     0.13    0.23       0.59    0.23    0.24       0.68
    PYTHON       0.14    0.24       0.51    0.24    0.36       0.60
    Average      0.15    0.26       0.64    0.26    0.30       0.70

    ROSE predicts 15% of all changed entities (files: 26%).
    In 64% of all transactions, ROSE’s topmost three suggestions contain a correct entity (files: 70%).
  • 19. Challenges 9/10
    Further Data Sources. Test outcomes, Mailing lists, Newsgroups, Chat logs. How do we leverage these sources?
    Further Analyses. Program analysis, Sequence analysis, Clustering. How do we integrate different analyses?
    From Locations to Actions. You have extended fKeys[] with UI_SPLINES; ROSE suggests: Insert store.setDefaults(UI_SPLINES, false); in function initDefaults(). The user can accept this at the touch of one button. How much can we learn from history?
  • 20. Conclusion 10/10
    – ROSE detects coupling between non-program entities (e.g. programs and documentation).
    – ROSE effectively guides users along related changes.
    – In 64% of all transactions, ROSE’s topmost three suggestions contain a correct entity (files: 70%).
    – Research has just begun to exploit non-program artefacts:
        – Similar results by A. Ying (2004); A. Hassan (2004); and J. Sayyad-Shirabad (2003).
        – ICSE Workshop on Mining Software Repositories, 2004.
    – ROSE will be available as an ECLIPSE plug-in in Fall 2004: http://www.st.cs.uni-sb.de/softevo/
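
The on-demand mining idea from the "Effective Mining" slide, and the precision and recall measures from the "Precision vs. Recall" slide, can be illustrated with a small sketch. This is not ROSE's implementation: the function names `mine_on_demand` and `precision_recall` are hypothetical, the toy data models only the entities named on the "Mining Association Rules" slide (the elided "…" entries are left out), and support is taken to be the number of transactions containing both antecedent and consequent, which is how the slide's "Support 7, Confidence 7/8" reads.

```python
from collections import Counter

# Toy stand-in for the eight transactions shown on the slide.
# Entities hidden behind "…" on the slide are not modelled here.
FULL = {"fKeys[]", "initDefaults()", "plugin.properties"}
TRANSACTIONS = {
    "T42": set(FULL), "T752": set(FULL), "T9872": set(FULL),
    "T11386": {"fKeys[]", "initDefaults()"},
    "T20814": set(FULL), "T30989": set(FULL),
    "T41999": set(FULL), "T47423": set(FULL),
}


def mine_on_demand(transactions, situation, min_support=1, min_confidence=0.0):
    """Mine only rules of the form situation => {x}.

    The antecedent is fixed to the current situation and every rule has a
    single consequent, so no global rule set is ever enumerated.
    """
    situation = frozenset(situation)
    # Only transactions that contain the whole situation can support a rule.
    matching = [t for t in transactions if situation <= t]
    # Count how often each other entity was changed together with the situation.
    counts = Counter(x for t in matching for x in t - situation)
    rules = []
    for entity, support in counts.items():
        confidence = support / len(matching)
        if support >= min_support and confidence >= min_confidence:
            rules.append((entity, support, confidence))
    # Highest confidence first, then highest support, then name for stability.
    return sorted(rules, key=lambda r: (-r[2], -r[1], r[0]))


def precision_recall(suggested, expected):
    """Precision and recall of a set of suggestions against the true changes.

    Returning 1.0 for an empty set is a convention chosen for this sketch.
    """
    suggested, expected = set(suggested), set(expected)
    correct = suggested & expected
    precision = len(correct) / len(suggested) if suggested else 1.0
    recall = len(correct) / len(expected) if expected else 1.0
    return precision, recall


rules = mine_on_demand(TRANSACTIONS.values(), {"fKeys[]", "initDefaults()"})
print(rules)  # [('plugin.properties', 7, 0.875)]
```

On the toy data the query reproduces the rule from the slide. Fixing the antecedent means each query only scans the transactions that contain the situation, which is one plausible reading of why a query can be answered in well under a second while mining all rules takes days.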