SlideShare a Scribd company logo
MIP - Cross-Project Defect Prediction
Models: l'Union Fait la Force
Annibale Panichella Rocco Oliveto Andrea De Lucia
1
SANER 2014 (Prev. WCRE)
2
SANER 2014 (Prev. WCRE)
3
Cross-Project Defect Prediction
4
File LOC LCOM DIT … Bug
F1 76 0 2 …
F2 101 1 12 …
… … … … … …
Project A
(Source)
AI Model
|——Software Metrics——|
Project B
(Target)
File LOC LCOM DIT … Bug
F1 176 22 1 … ?
F2 433 144 4 … ?
… … … … … ?
|———Same Metrics———|
Training Test
Friedman test + Nemenyi
Liem and Panichella, Arxiv 2020
The Master Predictor?
5
“One Ring to rule them all”
Some models seem to perform better
than others
Differences among libraries/
implementations
A Deeper Analysis
6
Bayes
Net
AUC = 0.56
Logistic AUC = 0.49
If (dit<10)
If (loc > 5)
1
0
…
True False
True False
Decision
Tree
AUC = 0.57
4
BN
DT
Log
8
9
1
38 10
0
Number of true positives
(identi
fi
ed bugs) for Ant v1.7
L’Union Fait La Force
(Unity Is Strength)
7
Inspired by Peer-Reviewing
8
Manuscript
Reviewer1 Reviewer2 Reviewer3
Strong
Reject
Strong
Accept
Strong
Accept
ACCEPT
Discussion
File
If (dit<10)
If (loc > 5)
1
0
…
True False
True False
Model 1 Model 2 Model 3
Combined Model
(CODEP)
Final Decision
Inspired by Peer-Reviewing
9
File
If (dit<10)
If (loc > 5)
1
0
…
True False
True False
Model 1 Model 2 Model 3
Combined Model
(CODEP)
Final Decision
Base models are diverse (diversity is
the key like in peer-reviewing)
Combination can be done with any
base/meta model (e.g., Naive Bayes Net)
CODEP takes as inputs the predicted
scores (con
fi
dence values) and not the
predicted labels (yes/no) of the base models
Replicating the Study 10 Years Later
10
A quick demo…
Replicating the Study 10 Years Later
11
Over multiple runs (random K-folds),
CODEP shows statistically better
performance than the base model
Some models have larger results
variance (less stable) than others
AU
C
The Impact in the Community
12
“Some classi
fi
ers are more consistent in
predicting defects than others.”
Bowes et al., PROMISE 2015
“Despite similar predictive performance values
for these four classi
fi
ers, each detects different
sets of defects.”
Similar results also showed by
• Bowes et al., in SQJ 2017
• Chen et al., JSEP 2019
13
Other combination strategies:
• Ensemble (Majority voting)
• Ensemble Stacking (Different ML models)
• Classi
fi
er selection (Adaptive per target project)
The Impact in the Community
0
3
6
9
12
2016 2017 2018 2019 2020 2021 2022
# Follow-up papers on
Ensemble methods
(Directly citing CODEP)
150+ Papers on
Google scholar
on Ensemble and
combination
methods for
Defect Prediction
Majority voting is the best…
Majority voting is weak…
Bootstrap aggregation is the
best ever…
Voting average is the top-one
14
Further studies on the role of model diversity on combination performances
• Petric et al., ESEM 2016 “Diversity […] is essential"
• Chen et al., IEEETransactions or Reliability, 2023
• Sharma et al., JSS 2023
• …
Combination of different ML models for federated learning (DSA 2023)
Beyond defect prediction:
• Malware detection
• Effort estimation
• GitHub PR prediction
• …
The Impact in the Community
What About Current Trends in SE?
15
The Hot-Topic Waves
16
Standard
ML
Time
Research
Interests
Deep
Learning
LLMs
Random
Forest
Every 2-4 years (wave) we jump on new
techniques hoping to
fi
nd the “master ring”
We often forget the free-lunch theorem
in optimization and machine learning
Even a stopped clock is right twice in a day
(simple baseline are stronger than we think)
We should focus on understanding when to
use X orY rather than focusing on X vs.Y
(l’union fait la force)
MIP - Cross-Project Defect Prediction
Models: l'Union Fait la Force
Annibale Panichella Rocco Oliveto Andrea De Lucia
17

More Related Content

Similar to MIP Award presentation at the IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) 2024

Contrasting Offline and Online Results when Evaluating Recommendation Algorithms
Contrasting Offline and Online Results when Evaluating Recommendation AlgorithmsContrasting Offline and Online Results when Evaluating Recommendation Algorithms
Contrasting Offline and Online Results when Evaluating Recommendation Algorithms
Marco Rossetti
 
From Bugs to Decision Support - Selected Research Highlights
From Bugs to Decision Support - Selected Research HighlightsFrom Bugs to Decision Support - Selected Research Highlights
From Bugs to Decision Support - Selected Research Highlights
Markus Borg
 
Benchmarking machine learning techniques
Benchmarking machine learning techniquesBenchmarking machine learning techniques
Benchmarking machine learning techniques
ijseajournal
 
Master’s Thesis.pptx
Master’s Thesis.pptxMaster’s Thesis.pptx
Master’s Thesis.pptx
usmimukherjee18
 
Cukic Promise08 V3
Cukic Promise08 V3Cukic Promise08 V3
Cukic Promise08 V3
gregoryg
 
A Hybrid Approach to Expert and Model Based Effort Estimation
A Hybrid Approach to Expert and Model Based Effort Estimation  A Hybrid Approach to Expert and Model Based Effort Estimation
A Hybrid Approach to Expert and Model Based Effort Estimation
CS, NcState
 
Predicting More from Less: Synergies of Learning
Predicting More from Less: Synergies of LearningPredicting More from Less: Synergies of Learning
Predicting More from Less: Synergies of Learning
CS, NcState
 
Software Analytics - Achievements and Challenges
Software Analytics - Achievements and ChallengesSoftware Analytics - Achievements and Challenges
Software Analytics - Achievements and Challenges
Tao Xie
 
Software Engineering Research: Leading a Double-Agent Life.
Software Engineering Research: Leading a Double-Agent Life.Software Engineering Research: Leading a Double-Agent Life.
Software Engineering Research: Leading a Double-Agent Life.
Lionel Briand
 
Software Design Patterns Lecutre (Intro)
Software Design Patterns Lecutre (Intro)Software Design Patterns Lecutre (Intro)
Software Design Patterns Lecutre (Intro)
VikramJothyPrakash1
 
[DSC Europe 23] Dmitry Ustalov - Design and Evaluation of Large Language Models
[DSC Europe 23] Dmitry Ustalov - Design and Evaluation of Large Language Models[DSC Europe 23] Dmitry Ustalov - Design and Evaluation of Large Language Models
[DSC Europe 23] Dmitry Ustalov - Design and Evaluation of Large Language Models
DataScienceConferenc1
 
Technical report jada
Technical report jadaTechnical report jada
Technical report jada
University of Bari (Italy)
 
Promise 2011: "A Principled Evaluation of Ensembles of Learning Machines for ...
Promise 2011: "A Principled Evaluation of Ensembles of Learning Machines for ...Promise 2011: "A Principled Evaluation of Ensembles of Learning Machines for ...
Promise 2011: "A Principled Evaluation of Ensembles of Learning Machines for ...
CS, NcState
 
Towards a Better Understanding of the Impact of Experimental Components on De...
Towards a Better Understanding of the Impact of Experimental Components on De...Towards a Better Understanding of the Impact of Experimental Components on De...
Towards a Better Understanding of the Impact of Experimental Components on De...
Chakkrit (Kla) Tantithamthavorn
 
Abstract.doc
Abstract.docAbstract.doc
Abstract.doc
butest
 
Final Survey On Rule Base Development Slideshare
Final Survey On Rule Base Development SlideshareFinal Survey On Rule Base Development Slideshare
Final Survey On Rule Base Development Slideshare
Valentin Zacharias
 
ACM Chicago March 2019 meeting: Software Engineering and AI - Prof. Tao Xie, ...
ACM Chicago March 2019 meeting: Software Engineering and AI - Prof. Tao Xie, ...ACM Chicago March 2019 meeting: Software Engineering and AI - Prof. Tao Xie, ...
ACM Chicago March 2019 meeting: Software Engineering and AI - Prof. Tao Xie, ...
ACM Chicago
 
Intelligent Software Engineering: Synergy between AI and Software Engineering
Intelligent Software Engineering: Synergy between AI and Software EngineeringIntelligent Software Engineering: Synergy between AI and Software Engineering
Intelligent Software Engineering: Synergy between AI and Software Engineering
Tao Xie
 
Bits of Evidence
Bits of EvidenceBits of Evidence
Bits of Evidence
Greg Wilson
 
Comparative Performance Analysis of Machine Learning Techniques for Software ...
Comparative Performance Analysis of Machine Learning Techniques for Software ...Comparative Performance Analysis of Machine Learning Techniques for Software ...
Comparative Performance Analysis of Machine Learning Techniques for Software ...
csandit
 

Similar to MIP Award presentation at the IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) 2024 (20)

Contrasting Offline and Online Results when Evaluating Recommendation Algorithms
Contrasting Offline and Online Results when Evaluating Recommendation AlgorithmsContrasting Offline and Online Results when Evaluating Recommendation Algorithms
Contrasting Offline and Online Results when Evaluating Recommendation Algorithms
 
From Bugs to Decision Support - Selected Research Highlights
From Bugs to Decision Support - Selected Research HighlightsFrom Bugs to Decision Support - Selected Research Highlights
From Bugs to Decision Support - Selected Research Highlights
 
Benchmarking machine learning techniques
Benchmarking machine learning techniquesBenchmarking machine learning techniques
Benchmarking machine learning techniques
 
Master’s Thesis.pptx
Master’s Thesis.pptxMaster’s Thesis.pptx
Master’s Thesis.pptx
 
Cukic Promise08 V3
Cukic Promise08 V3Cukic Promise08 V3
Cukic Promise08 V3
 
A Hybrid Approach to Expert and Model Based Effort Estimation
A Hybrid Approach to Expert and Model Based Effort Estimation  A Hybrid Approach to Expert and Model Based Effort Estimation
A Hybrid Approach to Expert and Model Based Effort Estimation
 
Predicting More from Less: Synergies of Learning
Predicting More from Less: Synergies of LearningPredicting More from Less: Synergies of Learning
Predicting More from Less: Synergies of Learning
 
Software Analytics - Achievements and Challenges
Software Analytics - Achievements and ChallengesSoftware Analytics - Achievements and Challenges
Software Analytics - Achievements and Challenges
 
Software Engineering Research: Leading a Double-Agent Life.
Software Engineering Research: Leading a Double-Agent Life.Software Engineering Research: Leading a Double-Agent Life.
Software Engineering Research: Leading a Double-Agent Life.
 
Software Design Patterns Lecutre (Intro)
Software Design Patterns Lecutre (Intro)Software Design Patterns Lecutre (Intro)
Software Design Patterns Lecutre (Intro)
 
[DSC Europe 23] Dmitry Ustalov - Design and Evaluation of Large Language Models
[DSC Europe 23] Dmitry Ustalov - Design and Evaluation of Large Language Models[DSC Europe 23] Dmitry Ustalov - Design and Evaluation of Large Language Models
[DSC Europe 23] Dmitry Ustalov - Design and Evaluation of Large Language Models
 
Technical report jada
Technical report jadaTechnical report jada
Technical report jada
 
Promise 2011: "A Principled Evaluation of Ensembles of Learning Machines for ...
Promise 2011: "A Principled Evaluation of Ensembles of Learning Machines for ...Promise 2011: "A Principled Evaluation of Ensembles of Learning Machines for ...
Promise 2011: "A Principled Evaluation of Ensembles of Learning Machines for ...
 
Towards a Better Understanding of the Impact of Experimental Components on De...
Towards a Better Understanding of the Impact of Experimental Components on De...Towards a Better Understanding of the Impact of Experimental Components on De...
Towards a Better Understanding of the Impact of Experimental Components on De...
 
Abstract.doc
Abstract.docAbstract.doc
Abstract.doc
 
Final Survey On Rule Base Development Slideshare
Final Survey On Rule Base Development SlideshareFinal Survey On Rule Base Development Slideshare
Final Survey On Rule Base Development Slideshare
 
ACM Chicago March 2019 meeting: Software Engineering and AI - Prof. Tao Xie, ...
ACM Chicago March 2019 meeting: Software Engineering and AI - Prof. Tao Xie, ...ACM Chicago March 2019 meeting: Software Engineering and AI - Prof. Tao Xie, ...
ACM Chicago March 2019 meeting: Software Engineering and AI - Prof. Tao Xie, ...
 
Intelligent Software Engineering: Synergy between AI and Software Engineering
Intelligent Software Engineering: Synergy between AI and Software EngineeringIntelligent Software Engineering: Synergy between AI and Software Engineering
Intelligent Software Engineering: Synergy between AI and Software Engineering
 
Bits of Evidence
Bits of EvidenceBits of Evidence
Bits of Evidence
 
Comparative Performance Analysis of Machine Learning Techniques for Software ...
Comparative Performance Analysis of Machine Learning Techniques for Software ...Comparative Performance Analysis of Machine Learning Techniques for Software ...
Comparative Performance Analysis of Machine Learning Techniques for Software ...
 

More from Annibale Panichella

Breaking the Silence: the Threats of Using LLMs in Software Engineering
Breaking the Silence: the Threats of Using LLMs in Software EngineeringBreaking the Silence: the Threats of Using LLMs in Software Engineering
Breaking the Silence: the Threats of Using LLMs in Software Engineering
Annibale Panichella
 
Searching for Quality: Genetic Algorithms and Metamorphic Testing for Softwar...
Searching for Quality: Genetic Algorithms and Metamorphic Testing for Softwar...Searching for Quality: Genetic Algorithms and Metamorphic Testing for Softwar...
Searching for Quality: Genetic Algorithms and Metamorphic Testing for Softwar...
Annibale Panichella
 
A Fast Multi-objective Evolutionary Approach for Designing Large-Scale Optica...
A Fast Multi-objective Evolutionary Approach for Designing Large-Scale Optica...A Fast Multi-objective Evolutionary Approach for Designing Large-Scale Optica...
A Fast Multi-objective Evolutionary Approach for Designing Large-Scale Optica...
Annibale Panichella
 
An Improved Pareto Front Modeling Algorithm for Large-scale Many-Objective Op...
An Improved Pareto Front Modeling Algorithm for Large-scale Many-Objective Op...An Improved Pareto Front Modeling Algorithm for Large-scale Many-Objective Op...
An Improved Pareto Front Modeling Algorithm for Large-scale Many-Objective Op...
Annibale Panichella
 
VST2022.pdf
VST2022.pdfVST2022.pdf
VST2022.pdf
Annibale Panichella
 
IPA Fall Days 2019
 IPA Fall Days 2019 IPA Fall Days 2019
IPA Fall Days 2019
Annibale Panichella
 
An Adaptive Evolutionary Algorithm based on Non-Euclidean Geometry for Many-O...
An Adaptive Evolutionary Algorithm based on Non-Euclidean Geometry for Many-O...An Adaptive Evolutionary Algorithm based on Non-Euclidean Geometry for Many-O...
An Adaptive Evolutionary Algorithm based on Non-Euclidean Geometry for Many-O...
Annibale Panichella
 
Speeding-up Software Testing With Computational Intelligence
Speeding-up Software Testing With Computational IntelligenceSpeeding-up Software Testing With Computational Intelligence
Speeding-up Software Testing With Computational Intelligence
Annibale Panichella
 
Incremental Control Dependency Frontier Exploration for Many-Criteria Test C...
Incremental Control Dependency Frontier Exploration for Many-Criteria  Test C...Incremental Control Dependency Frontier Exploration for Many-Criteria  Test C...
Incremental Control Dependency Frontier Exploration for Many-Criteria Test C...
Annibale Panichella
 
Sbst2018 contest2018
Sbst2018 contest2018Sbst2018 contest2018
Sbst2018 contest2018
Annibale Panichella
 
Java Unit Testing Tool Competition — Fifth Round
Java Unit Testing Tool Competition — Fifth RoundJava Unit Testing Tool Competition — Fifth Round
Java Unit Testing Tool Competition — Fifth Round
Annibale Panichella
 
ICSE 2017 - Evocrash
ICSE 2017 - EvocrashICSE 2017 - Evocrash
ICSE 2017 - Evocrash
Annibale Panichella
 
Evolutionary Testing for Crash Reproduction
Evolutionary Testing for Crash ReproductionEvolutionary Testing for Crash Reproduction
Evolutionary Testing for Crash Reproduction
Annibale Panichella
 
Parameterizing and Assembling IR-based Solutions for SE Tasks using Genetic A...
Parameterizing and Assembling IR-based Solutions for SE Tasks using Genetic A...Parameterizing and Assembling IR-based Solutions for SE Tasks using Genetic A...
Parameterizing and Assembling IR-based Solutions for SE Tasks using Genetic A...
Annibale Panichella
 
Security Threat Identification and Testing
Security Threat Identification and TestingSecurity Threat Identification and Testing
Security Threat Identification and Testing
Annibale Panichella
 
Reformulating Branch Coverage as a Many-Objective Optimization Problem
Reformulating Branch Coverage as a Many-Objective Optimization ProblemReformulating Branch Coverage as a Many-Objective Optimization Problem
Reformulating Branch Coverage as a Many-Objective Optimization Problem
Annibale Panichella
 
Results for EvoSuite-MOSA at the Third Unit Testing Tool Competition
Results for EvoSuite-MOSA at the Third Unit Testing Tool CompetitionResults for EvoSuite-MOSA at the Third Unit Testing Tool Competition
Results for EvoSuite-MOSA at the Third Unit Testing Tool Competition
Annibale Panichella
 
Adaptive User Feedback for IR-based Traceability Recovery
Adaptive User Feedback for IR-based Traceability RecoveryAdaptive User Feedback for IR-based Traceability Recovery
Adaptive User Feedback for IR-based Traceability Recovery
Annibale Panichella
 
Diversity mechanisms for evolutionary populations in Search-Based Software En...
Diversity mechanisms for evolutionary populations in Search-Based Software En...Diversity mechanisms for evolutionary populations in Search-Based Software En...
Diversity mechanisms for evolutionary populations in Search-Based Software En...
Annibale Panichella
 
Estimating the Evolution Direction of Populations to Improve Genetic Algorithms
Estimating the Evolution Direction of Populations to Improve Genetic AlgorithmsEstimating the Evolution Direction of Populations to Improve Genetic Algorithms
Estimating the Evolution Direction of Populations to Improve Genetic Algorithms
Annibale Panichella
 

More from Annibale Panichella (20)

Breaking the Silence: the Threats of Using LLMs in Software Engineering
Breaking the Silence: the Threats of Using LLMs in Software EngineeringBreaking the Silence: the Threats of Using LLMs in Software Engineering
Breaking the Silence: the Threats of Using LLMs in Software Engineering
 
Searching for Quality: Genetic Algorithms and Metamorphic Testing for Softwar...
Searching for Quality: Genetic Algorithms and Metamorphic Testing for Softwar...Searching for Quality: Genetic Algorithms and Metamorphic Testing for Softwar...
Searching for Quality: Genetic Algorithms and Metamorphic Testing for Softwar...
 
A Fast Multi-objective Evolutionary Approach for Designing Large-Scale Optica...
A Fast Multi-objective Evolutionary Approach for Designing Large-Scale Optica...A Fast Multi-objective Evolutionary Approach for Designing Large-Scale Optica...
A Fast Multi-objective Evolutionary Approach for Designing Large-Scale Optica...
 
An Improved Pareto Front Modeling Algorithm for Large-scale Many-Objective Op...
An Improved Pareto Front Modeling Algorithm for Large-scale Many-Objective Op...An Improved Pareto Front Modeling Algorithm for Large-scale Many-Objective Op...
An Improved Pareto Front Modeling Algorithm for Large-scale Many-Objective Op...
 
VST2022.pdf
VST2022.pdfVST2022.pdf
VST2022.pdf
 
IPA Fall Days 2019
 IPA Fall Days 2019 IPA Fall Days 2019
IPA Fall Days 2019
 
An Adaptive Evolutionary Algorithm based on Non-Euclidean Geometry for Many-O...
An Adaptive Evolutionary Algorithm based on Non-Euclidean Geometry for Many-O...An Adaptive Evolutionary Algorithm based on Non-Euclidean Geometry for Many-O...
An Adaptive Evolutionary Algorithm based on Non-Euclidean Geometry for Many-O...
 
Speeding-up Software Testing With Computational Intelligence
Speeding-up Software Testing With Computational IntelligenceSpeeding-up Software Testing With Computational Intelligence
Speeding-up Software Testing With Computational Intelligence
 
Incremental Control Dependency Frontier Exploration for Many-Criteria Test C...
Incremental Control Dependency Frontier Exploration for Many-Criteria  Test C...Incremental Control Dependency Frontier Exploration for Many-Criteria  Test C...
Incremental Control Dependency Frontier Exploration for Many-Criteria Test C...
 
Sbst2018 contest2018
Sbst2018 contest2018Sbst2018 contest2018
Sbst2018 contest2018
 
Java Unit Testing Tool Competition — Fifth Round
Java Unit Testing Tool Competition — Fifth RoundJava Unit Testing Tool Competition — Fifth Round
Java Unit Testing Tool Competition — Fifth Round
 
ICSE 2017 - Evocrash
ICSE 2017 - EvocrashICSE 2017 - Evocrash
ICSE 2017 - Evocrash
 
Evolutionary Testing for Crash Reproduction
Evolutionary Testing for Crash ReproductionEvolutionary Testing for Crash Reproduction
Evolutionary Testing for Crash Reproduction
 
Parameterizing and Assembling IR-based Solutions for SE Tasks using Genetic A...
Parameterizing and Assembling IR-based Solutions for SE Tasks using Genetic A...Parameterizing and Assembling IR-based Solutions for SE Tasks using Genetic A...
Parameterizing and Assembling IR-based Solutions for SE Tasks using Genetic A...
 
Security Threat Identification and Testing
Security Threat Identification and TestingSecurity Threat Identification and Testing
Security Threat Identification and Testing
 
Reformulating Branch Coverage as a Many-Objective Optimization Problem
Reformulating Branch Coverage as a Many-Objective Optimization ProblemReformulating Branch Coverage as a Many-Objective Optimization Problem
Reformulating Branch Coverage as a Many-Objective Optimization Problem
 
Results for EvoSuite-MOSA at the Third Unit Testing Tool Competition
Results for EvoSuite-MOSA at the Third Unit Testing Tool CompetitionResults for EvoSuite-MOSA at the Third Unit Testing Tool Competition
Results for EvoSuite-MOSA at the Third Unit Testing Tool Competition
 
Adaptive User Feedback for IR-based Traceability Recovery
Adaptive User Feedback for IR-based Traceability RecoveryAdaptive User Feedback for IR-based Traceability Recovery
Adaptive User Feedback for IR-based Traceability Recovery
 
Diversity mechanisms for evolutionary populations in Search-Based Software En...
Diversity mechanisms for evolutionary populations in Search-Based Software En...Diversity mechanisms for evolutionary populations in Search-Based Software En...
Diversity mechanisms for evolutionary populations in Search-Based Software En...
 
Estimating the Evolution Direction of Populations to Improve Genetic Algorithms
Estimating the Evolution Direction of Populations to Improve Genetic AlgorithmsEstimating the Evolution Direction of Populations to Improve Genetic Algorithms
Estimating the Evolution Direction of Populations to Improve Genetic Algorithms
 

Recently uploaded

Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdf
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdfMending Clothing to Support Sustainable Fashion_CIMaR 2024.pdf
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdf
Selcen Ozturkcan
 
GBSN - Biochemistry (Unit 6) Chemistry of Proteins
GBSN - Biochemistry (Unit 6) Chemistry of ProteinsGBSN - Biochemistry (Unit 6) Chemistry of Proteins
GBSN - Biochemistry (Unit 6) Chemistry of Proteins
Areesha Ahmad
 
Candidate young stellar objects in the S-cluster: Kinematic analysis of a sub...
Candidate young stellar objects in the S-cluster: Kinematic analysis of a sub...Candidate young stellar objects in the S-cluster: Kinematic analysis of a sub...
Candidate young stellar objects in the S-cluster: Kinematic analysis of a sub...
Sérgio Sacani
 
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
Sérgio Sacani
 
Farming systems analysis: what have we learnt?.pptx
Farming systems analysis: what have we learnt?.pptxFarming systems analysis: what have we learnt?.pptx
Farming systems analysis: what have we learnt?.pptx
Frédéric Baudron
 
HOW DO ORGANISMS REPRODUCE?reproduction part 1
HOW DO ORGANISMS REPRODUCE?reproduction part 1HOW DO ORGANISMS REPRODUCE?reproduction part 1
HOW DO ORGANISMS REPRODUCE?reproduction part 1
Shashank Shekhar Pandey
 
fermented food science of sauerkraut.pptx
fermented food science of sauerkraut.pptxfermented food science of sauerkraut.pptx
fermented food science of sauerkraut.pptx
ananya23nair
 
Alternate Wetting and Drying - Climate Smart Agriculture
Alternate Wetting and Drying - Climate Smart AgricultureAlternate Wetting and Drying - Climate Smart Agriculture
Alternate Wetting and Drying - Climate Smart Agriculture
International Food Policy Research Institute- South Asia Office
 
Sciences of Europe journal No 142 (2024)
Sciences of Europe journal No 142 (2024)Sciences of Europe journal No 142 (2024)
Sciences of Europe journal No 142 (2024)
Sciences of Europe
 
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Leonel Morgado
 
HUMAN EYE By-R.M Class 10 phy best digital notes.pdf
HUMAN EYE By-R.M Class 10 phy best digital notes.pdfHUMAN EYE By-R.M Class 10 phy best digital notes.pdf
HUMAN EYE By-R.M Class 10 phy best digital notes.pdf
Ritik83251
 
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
Scintica Instrumentation
 
Gadgets for management of stored product pests_Dr.UPR.pdf
Gadgets for management of stored product pests_Dr.UPR.pdfGadgets for management of stored product pests_Dr.UPR.pdf
Gadgets for management of stored product pests_Dr.UPR.pdf
PirithiRaju
 
Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...
Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...
Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...
Travis Hills MN
 
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
vluwdy49
 
Tissue fluids_etiology_volume regulation_pressure.pptx
Tissue fluids_etiology_volume regulation_pressure.pptxTissue fluids_etiology_volume regulation_pressure.pptx
Tissue fluids_etiology_volume regulation_pressure.pptx
muralinath2
 
Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...
Leonel Morgado
 
LEARNING TO LIVE WITH LAWS OF MOTION .pptx
LEARNING TO LIVE WITH LAWS OF MOTION .pptxLEARNING TO LIVE WITH LAWS OF MOTION .pptx
LEARNING TO LIVE WITH LAWS OF MOTION .pptx
yourprojectpartner05
 
Immersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths ForwardImmersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths Forward
Leonel Morgado
 
The debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically youngThe debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically young
Sérgio Sacani
 

Recently uploaded (20)

Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdf
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdfMending Clothing to Support Sustainable Fashion_CIMaR 2024.pdf
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdf
 
GBSN - Biochemistry (Unit 6) Chemistry of Proteins
GBSN - Biochemistry (Unit 6) Chemistry of ProteinsGBSN - Biochemistry (Unit 6) Chemistry of Proteins
GBSN - Biochemistry (Unit 6) Chemistry of Proteins
 
Candidate young stellar objects in the S-cluster: Kinematic analysis of a sub...
Candidate young stellar objects in the S-cluster: Kinematic analysis of a sub...Candidate young stellar objects in the S-cluster: Kinematic analysis of a sub...
Candidate young stellar objects in the S-cluster: Kinematic analysis of a sub...
 
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
 
Farming systems analysis: what have we learnt?.pptx
Farming systems analysis: what have we learnt?.pptxFarming systems analysis: what have we learnt?.pptx
Farming systems analysis: what have we learnt?.pptx
 
HOW DO ORGANISMS REPRODUCE?reproduction part 1
HOW DO ORGANISMS REPRODUCE?reproduction part 1HOW DO ORGANISMS REPRODUCE?reproduction part 1
HOW DO ORGANISMS REPRODUCE?reproduction part 1
 
fermented food science of sauerkraut.pptx
fermented food science of sauerkraut.pptxfermented food science of sauerkraut.pptx
fermented food science of sauerkraut.pptx
 
Alternate Wetting and Drying - Climate Smart Agriculture
Alternate Wetting and Drying - Climate Smart AgricultureAlternate Wetting and Drying - Climate Smart Agriculture
Alternate Wetting and Drying - Climate Smart Agriculture
 
Sciences of Europe journal No 142 (2024)
Sciences of Europe journal No 142 (2024)Sciences of Europe journal No 142 (2024)
Sciences of Europe journal No 142 (2024)
 
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
Describing and Interpreting an Immersive Learning Case with the Immersion Cub...
 
HUMAN EYE By-R.M Class 10 phy best digital notes.pdf
HUMAN EYE By-R.M Class 10 phy best digital notes.pdfHUMAN EYE By-R.M Class 10 phy best digital notes.pdf
HUMAN EYE By-R.M Class 10 phy best digital notes.pdf
 
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
 
Gadgets for management of stored product pests_Dr.UPR.pdf
Gadgets for management of stored product pests_Dr.UPR.pdfGadgets for management of stored product pests_Dr.UPR.pdf
Gadgets for management of stored product pests_Dr.UPR.pdf
 
Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...
Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...
Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...
 
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
在线办理(salfor毕业证书)索尔福德大学毕业证毕业完成信一模一样
 
Tissue fluids_etiology_volume regulation_pressure.pptx
Tissue fluids_etiology_volume regulation_pressure.pptxTissue fluids_etiology_volume regulation_pressure.pptx
Tissue fluids_etiology_volume regulation_pressure.pptx
 
Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...
 
LEARNING TO LIVE WITH LAWS OF MOTION .pptx
LEARNING TO LIVE WITH LAWS OF MOTION .pptxLEARNING TO LIVE WITH LAWS OF MOTION .pptx
LEARNING TO LIVE WITH LAWS OF MOTION .pptx
 
Immersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths ForwardImmersive Learning That Works: Research Grounding and Paths Forward
Immersive Learning That Works: Research Grounding and Paths Forward
 
The debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically youngThe debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically young
 

MIP Award presentation at the IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) 2024

  • 1. MIP - Cross-Project Defect Prediction Models: l'Union Fait la Force Annibale Panichella Rocco Oliveto Andrea De Lucia 1
  • 4. Cross-Project Defect Prediction 4 File LOC LCOM DIT … Bug F1 76 0 2 … F2 101 1 12 … … … … … … … Project A (Source) AI Model |——Software Metrics——| Project B (Target) File LOC LCOM DIT … Bug F1 176 22 1 … ? F2 433 144 4 … ? … … … … … ? |———Same Metrics———| Training Test
  • 5. Friedman test + Nemenyi Liem and Panichella, Arxiv 2020 The Master Predictor? 5 “One Ring to rule them all” Some models seem to perform better than others Differences among libraries/ implementations
  • 6. A Deeper Analysis 6 Bayes Net AUC = 0.56 Logistic AUC = 0.49 If (dit<10) If (loc > 5) 1 0 … True False True False Decision Tree AUC = 0.57 4 BN DT Log 8 9 1 38 10 0 Number of true positives (identi fi ed bugs) for Ant v1.7
  • 7. L’Union Fait La Force (Unity Is Strength) 7
  • 8. Inspired by Peer-Reviewing 8 Manuscript Reviewer1 Reviewer2 Reviewer3 Strong Reject Strong Accept Strong Accept ACCEPT Discussion File If (dit<10) If (loc > 5) 1 0 … True False True False Model 1 Model 2 Model 3 Combined Model (CODEP) Final Decision
  • 9. Inspired by Peer-Reviewing 9 File If (dit<10) If (loc > 5) 1 0 … True False True False Model 1 Model 2 Model 3 Combined Model (CODEP) Final Decision Base models are diverse (diversity is the key like in peer-reviewing) Combination can be done with any base/meta model (e.g., Naive Bayes Net) CODEP takes as inputs the predicted scores (con fi dence values) and not the predicted labels (yes/no) of the base models
  • 10. Replicating the Study 10 Years Later 10 A quick demo…
  • 11. Replicating the Study 10 Years Later 11 Over multiple runs (random K-folds), CODEP shows statistically better performance than the base model Some models have larger results variance (less stable) than others AU C
  • 12. The Impact in the Community 12 “Some classi fi ers are more consistent in predicting defects than others.” Bowes et al., PROMISE 2015 “Despite similar predictive performance values for these four classi fi ers, each detects different sets of defects.” Similar results also showed by • Bowes et al., in SQJ 2017 • Chen et al., JSEP 2019
  • 13. 13 Other combination strategies: • Ensemble (Majority voting) • Ensemble Stacking (Different ML models) • Classi fi er selection (Adaptive per target project) The Impact in the Community 0 3 6 9 12 2016 2017 2018 2019 2020 2021 2022 # Follow-up papers on Ensemble methods (Directly citing CODEP) 150+ Papers on Google scholar on Ensemble and combination methods for Defect Prediction Majority voting is the best… Majority voting is weak… Bootstrap aggregation is the best ever… Voting average is the top-one
  • 14. 14 Further studies on the role of model diversity on combination performances • Petric et al., ESEM 2016 “Diversity […] is essential" • Chen et al., IEEETransactions or Reliability, 2023 • Sharma et al., JSS 2023 • … Combination of different ML models for federated learning (DSA 2023) Beyond defect prediction: • Malware detection • Effort estimation • GitHub PR prediction • … The Impact in the Community
  • 15. What About Current Trends in SE? 15
  • 16. The Hot-Topic Waves 16 Standard ML Time Research Interests Deep Learning LLMs Random Forest Every 2-4 years (wave) we jump on new techniques hoping to fi nd the “master ring” We often forget the free-lunch theorem in optimization and machine learning Even a stopped clock is right twice in a day (simple baseline are stronger than we think) We should focus on understanding when to use X orY rather than focusing on X vs.Y (l’union fait la force)
  • 17. MIP - Cross-Project Defect Prediction Models: l'Union Fait la Force Annibale Panichella Rocco Oliveto Andrea De Lucia 17