Heterogeneous Defect Prediction
ESEC/FSE 2015
September 3, 2015
Jaechang Nam and Sunghun Kim
Department of Computer Science and Engineering
HKUST
Software Defect Prediction

[Figure: within-project defect prediction. A model is trained on labeled instances (buggy or clean) of Project A and predicts the project's unlabeled instances. Legend: metric values; buggy-labeled, clean-labeled, and unlabeled (?) instances.]

Related Work: Munson@TSE`92, Basili@TSE`95, Menzies@TSE`07, Hassan@ICSE`09, Bird@FSE`11, D'Ambros@EMSE`12, Lee@FSE`11, ...
What if labeled instances do not exist?

[Figure: Project X has only an unlabeled dataset (all instances are ?), so no model can be trained.]

• New projects
• Projects lacking historical data
Existing Solutions?

[Figure: (New) Project X with an unlabeled dataset of ? instances.]
Cross-Project Defect Prediction (CPDP)

[Figure: a model is trained on the labeled instances of Project A (source) and predicts the unlabeled dataset of Project X (target).]

Related Work: Watanabe@PROMISE`08, Turhan@EMSE`09, Zimmermann@FSE`09, Ma@IST`12, Zhang@MSR`14, Panichella@WCRE`14, Canfora@STVR`15

Challenge: existing CPDP approaches require the same metric set (the same feature space) for source and target, so they cannot handle heterogeneous metrics between source and target.
Motivation

[Figure: a model is trained on Project A (source) and tested on Project C (target), whose metric sets are heterogeneous (different feature spaces or different domains).]

If heterogeneous metric sets can be handled, it becomes possible to reuse all the existing defect datasets for CPDP!
Heterogeneous Defect Prediction (HDP)
Key Idea
• Consistent defect-proneness tendency of metrics
  – Defect prediction metrics measure the complexity of software and its development process, e.g.:
    • The number of developers touching a source code file (Bird@FSE`11)
    • The number of methods in a class (D'Ambros@EMSE`12)
    • The number of operands (Menzies@TSE`08)
  – More complexity implies more defect-proneness (Rahman@ICSE`13)
• The distributions of source and target metrics should be the same to build a strong prediction model.

→ Match source and target metrics that have similar distributions.
Heterogeneous Defect Prediction (HDP)
- Overview -

1. Metric Selection: select informative metrics of the labeled source dataset (Project A).
2. Metric Matching: match the selected source metrics to target metrics (Project B) with similar distributions.
3. Build the cross-prediction model on the matched source metrics (training).
4. Predict the labels of the target instances (test).

Source: Project A                 Target: Project B
X1  X2  X3  X4  Label             Y1  Y2  Y3  Y4  Y5  Y6  Y7  Label
1   1   3   10  Buggy             3   1   1   0   2   1   9   ?
8   0   1   0   Clean             1   1   9   0   2   3   8   ?
⋮   ⋮   ⋮   ⋮   ⋮                 ⋮   ⋮   ⋮   ⋮   ⋮   ⋮   ⋮   ⋮
9   0   1   1   Clean             0   1   1   1   2   1   1   ?
Metric Selection
• Why? (Guyon@JMLR`03)
  – Select informative metrics
    • Remove redundant and irrelevant metrics
  – Decrease the complexity of metric matching combinations
• Feature Selection Approaches (Gao@SPE`11, Shivaji@TSE`13)
  – Gain Ratio
  – Chi-square
  – Relief-F
  – Significance attribute evaluation
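As a concrete illustration, the sketch below performs this step with scikit-learn, scoring metrics by chi-square (one of the approaches listed above). The keep ratio of 15% and the helper name select_metrics are assumptions made for this example, not values from the slides.

```python
# Hedged sketch of metric (feature) selection on the labeled source dataset.
# Assumes metric values are non-negative NumPy arrays (required by chi2).
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

def select_metrics(X_source, y_source, keep_ratio=0.15):
    """Keep the highest-scoring source metrics; return (reduced X, kept column indices)."""
    k = max(1, int(np.ceil(keep_ratio * X_source.shape[1])))
    selector = SelectKBest(chi2, k=k).fit(X_source, y_source)
    kept = selector.get_support(indices=True)
    return X_source[:, kept], kept
```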
Metric Matching

[Figure: bipartite matching between source metrics (X1, X2) and target metrics (Y1, Y2), with matching scores of 0.8 and 0.5 on the matched pairs.]

* Different cutoff values for the matching score can be applied.
* It is possible that there is no matching at all.
Compute Matching Score

KSAnalyzer
• Use the p-value of the Kolmogorov-Smirnov test (Massey@JASA`51)

Matching score M of the i-th source metric and the j-th target metric:
M_ij = p_ij
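A minimal sketch of KSAnalyzer, assuming the datasets are NumPy arrays: each source/target metric pair is scored by the p-value of a two-sample KS test, metrics are matched one-to-one so that the total score is maximized, and pairs at or below the cutoff are discarded. Using the Hungarian algorithm (linear_sum_assignment) for the maximum-score matching is an implementation choice of this sketch; the slides specify only the score itself.

```python
# M_ij = p_ij: the KS-test p-value between the i-th source metric and the
# j-th target metric; a higher p-value means more similar distributions.
import numpy as np
from scipy.stats import ks_2samp
from scipy.optimize import linear_sum_assignment

def ks_matching(X_source, X_target, cutoff=0.05):
    """Return one-to-one (source_col, target_col) pairs with matching score > cutoff."""
    n_src, n_tgt = X_source.shape[1], X_target.shape[1]
    score = np.zeros((n_src, n_tgt))
    for i in range(n_src):
        for j in range(n_tgt):
            score[i, j] = ks_2samp(X_source[:, i], X_target[:, j]).pvalue
    rows, cols = linear_sum_assignment(score, maximize=True)  # maximize total score
    return [(i, j) for i, j in zip(rows, cols) if score[i, j] > cutoff]
```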
Heterogeneous Defect Prediction
- Overview -

(The overview pipeline is shown again: metric selection on the source, metric matching between source and target, building the cross-prediction model, and predicting the target instances.)
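Putting the pieces together, here is a hedged end-to-end sketch of the pipeline, reusing the hypothetical select_metrics and ks_matching helpers from the earlier sketches and logistic regression as the classifier (per the experimental settings later in this deck):

```python
# End-to-end HDP sketch: select source metrics, match them to target metrics,
# train on the matched source columns, and score the target instances.
from sklearn.linear_model import LogisticRegression

def hdp_predict(X_src, y_src, X_tgt, cutoff=0.05):
    X_sel, _ = select_metrics(X_src, y_src)            # 1. metric selection
    pairs = ks_matching(X_sel, X_tgt, cutoff=cutoff)   # 2. metric matching
    if not pairs:
        return None  # no metric matched above the cutoff: prediction not feasible
    src_cols = [i for i, _ in pairs]
    tgt_cols = [j for _, j in pairs]
    model = LogisticRegression(max_iter=1000)          # 3. build (training)
    model.fit(X_sel[:, src_cols], y_src)
    return model.predict_proba(X_tgt[:, tgt_cols])[:, 1]  # 4. predict (test)
```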
EVALUATION
Baselines
• WPDP
• CPDP-CM (Turhan@EMSE`09, Ma@IST`12, He@IST`14)
  – Cross-project defect prediction using only the common metrics between the source and target datasets
• CPDP-IFS (He@CoRR`14)
  – Cross-project defect prediction on Imbalanced Feature Sets (i.e., heterogeneous metric sets)
  – 16 distributional characteristics of an instance's values as features (e.g., mean, std, maximum, ...)
Research Questions (RQs)
• RQ1
  – Is heterogeneous defect prediction comparable to WPDP?
• RQ2
  – Is heterogeneous defect prediction comparable to CPDP-CM?
• RQ3
  – Is heterogeneous defect prediction comparable to CPDP-IFS?
Benchmark Datasets

Group     Dataset        # of instances         # of metrics   Granularity
                         All     Buggy (%)
AEEEM     EQ             325     129 (39.7%)    61             Class
          JDT            997     206 (20.7%)
          LC             399     64 (9.36%)
          ML             1862    245 (13.2%)
          PDE            1492    209 (14.0%)
MORPH     ant-1.3        125     20 (16.0%)     20             Class
          arc            234     27 (11.5%)
          camel-1.0      339     13 (3.8%)
          poi-1.5        237     141 (75.0%)
          redaktor       176     27 (15.3%)
          skarbonka      45      9 (20.0%)
          tomcat         858     77 (9.0%)
          velocity-1.4   196     147 (75.0%)
          xalan-2.4      723     110 (15.2%)
          xerces-1.2     440     71 (16.1%)
ReLink    Apache         194     98 (50.5%)     26             File
          Safe           56      22 (39.3%)
          ZXing          399     118 (29.6%)
NASA      cm1            327     42 (12.8%)     37             Function
          mw1            253     27 (10.7%)
          pc1            705     61 (8.7%)
          pc3            1077    134 (12.4%)
          pc4            1458    178 (12.2%)
SOFTLAB   ar1            121     9 (7.4%)       29             Function
          ar3            63      8 (12.7%)
          ar4            107     20 (18.7%)
          ar5            36      8 (22.2%)
          ar6            101     15 (14.9%)

600 prediction combinations in total!
Experimental Settings
• Logistic Regression
• HDP vs. WPDP, CPDP-CM, and CPDP-IFS

[Figure: each project (e.g., Project A) is split into a training set (50%) and a test set (50%), repeated 1000 times (X 1000). WPDP trains on the project's own training half, while CPDP-CM, CPDP-IFS, and HDP train on the other projects (Project 1 ... Project n); all are evaluated on the same test half.]
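A sketch of this protocol for the WPDP baseline, assuming NumPy arrays: the 1000 repeated 50/50 holdout splits and the logistic regression classifier follow the settings above, while the helper name and the seeding scheme are illustrative.

```python
# Repeated 50/50 holdout evaluation of within-project defect prediction.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

def wpdp_aucs(X, y, n_repeats=1000):
    aucs = []
    for seed in range(n_repeats):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.5, random_state=seed)
        model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
        aucs.append(roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
    return np.array(aucs)  # 1000 AUC values for one target project
```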
Evaluation Measures
• False Positive Rate = FP/(TN+FP)
• True Positive Rate = Recall
• AUC (Area Under the Receiver Operating Characteristic curve)

[Figure: ROC curve; x-axis: false positive rate (0 to 1), y-axis: true positive rate (0 to 1).]
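For concreteness, a small helper computing the three measures from predicted probabilities of the buggy class; the 0.5 classification threshold is an assumption of this sketch, while AUC itself is threshold-independent.

```python
# FPR, TPR (recall), and AUC from NumPy arrays of labels and probabilities.
from sklearn.metrics import confusion_matrix, roc_auc_score

def evaluate(y_true, y_prob, threshold=0.5):
    y_pred = (y_prob >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    fpr = fp / (tn + fp)   # False Positive Rate = FP / (TN + FP)
    tpr = tp / (tp + fn)   # True Positive Rate = Recall
    return fpr, tpr, roc_auc_score(y_true, y_prob)
```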
Evaluation Measures
• Win/Tie/Loss (Valentini@ICML`03, Li@JASE`12, Kocaguneli@TSE`13)
  – Wilcoxon signed-rank test (p<0.05) over the 1000 prediction results
  – Win
    • # of prediction combinations where HDP outperforms the baseline with statistical significance (p<0.05)
  – Tie
    • # of prediction combinations with no statistically significant difference (p≥0.05)
  – Loss
    • # of prediction combinations where the baseline outperforms HDP with statistical significance (p<0.05)
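A sketch of how a single Win/Tie/Loss outcome could be decided for one prediction combination, assuming paired arrays of AUCs from the 1000 runs; breaking the significant cases into Win or Loss by comparing medians is an assumption of this sketch.

```python
# Wilcoxon signed-rank test over paired AUCs; alpha = 0.05 as in the slides.
import numpy as np
from scipy.stats import wilcoxon

def win_tie_loss(hdp_aucs, baseline_aucs, alpha=0.05):
    _, p = wilcoxon(hdp_aucs, baseline_aucs)
    if p >= alpha:
        return "tie"
    return "win" if np.median(hdp_aucs) > np.median(baseline_aucs) else "loss"
```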
RESULTS
Prediction Results (median AUC)

Target        WPDP    CPDP-CM  CPDP-IFS  HDPKS (cutoff=0.05)
EQ            0.583   0.776    0.461     0.783
JDT           0.795   0.781    0.543     0.767
LC            0.575   0.636    0.584     0.655
ML            0.734   0.651    0.557     0.692*
PDE           0.684   0.682    0.566     0.717
ant-1.3       0.670   0.611    0.500     0.701
arc           0.670   0.611    0.523     0.701
camel-1.0     0.550   0.590    0.500     0.639
poi-1.5       0.707   0.676    0.606     0.537
redaktor      0.744   0.500    0.500     0.537
skarbonka     0.569   0.736    0.528     0.694*
tomcat        0.778   0.746    0.640     0.818
velocity-1.4  0.725   0.609    0.500     0.391
xalan-2.4     0.755   0.658    0.499     0.751
xerces-1.2    0.624   0.453    0.500     0.489
Apache        0.714   0.689    0.635     0.717*
Safe          0.706   0.749    0.616     0.818*
ZXing         0.605   0.619    0.530     0.650*
cm1           0.653   0.622    0.551     0.717*
mw1           0.612   0.584    0.614     0.727
pc1           0.787   0.675    0.564     0.752*
pc3           0.794   0.665    0.500     0.738*
pc4           0.900   0.773    0.589     0.682*
ar1           0.582   0.464    0.500     0.734*
ar3           0.574   0.862    0.682     0.823*
ar4           0.657   0.588    0.575     0.816*
ar5           0.804   0.875    0.585     0.911*
ar6           0.654   0.611    0.527     0.640
All           0.657   0.636    0.555     0.724*

HDPKS: heterogeneous defect prediction using KSAnalyzer
Win/Tie/Loss Results

              Against WPDP      Against CPDP-CM    Against CPDP-IFS
Target        W     T     L     W     T     L      W     T     L
EQ            4     0     0     2     2     0      4     0     0
JDT           0     0     5     3     0     2      5     0     0
LC            6     0     1     3     3     1      3     1     3
ML            0     0     6     4     2     0      6     0     0
PDE           3     0     2     2     0     3      5     0     0
ant-1.3       6     0     1     6     0     1      5     0     2
arc           3     1     0     3     0     1      4     0     0
camel-1.0     3     0     2     3     0     2      4     0     1
poi-1.5       2     0     2     3     0     1      2     0     2
redaktor      0     0     4     2     0     2      3     0     1
skarbonka     11    0     0     4     0     7      9     0     2
tomcat        2     0     0     1     1     0      2     0     0
velocity-1.4  0     0     3     0     0     3      0     0     3
xalan-2.4     0     0     1     1     0     0      1     0     0
xerces-1.2    0     0     3     3     0     0      1     0     2
Apache        6     0     5     8     1     2      9     0     2
Safe          14    0     3     12    0     5      15    0     2
ZXing         8     0     0     6     0     2      7     0     1
cm1           7     1     2     8     0     2      9     0     1
mw1           5     0     1     4     0     2      4     0     2
pc1           1     0     5     5     0     1      6     0     0
pc3           0     0     7     7     0     0      7     0     0
pc4           0     0     7     2     0     5      7     0     0
ar1           14    0     1     14    0     1      11    0     4
ar3           15    0     0     5     0     10     10    2     3
ar4           16    0     0     14    1     1      15    0     1
ar5           14    0     4     14    0     4      16    0     2
ar6           7     1     7     8     4     3      12    0     3
Total         147   3     72    147   14    61     182   3     35
%             66.2% 1.4%  32.4% 66.2% 6.3%  27.5%  82.0% 1.3%  16.7%
Matched Metrics (Win)

[Figure: distributions of the matched source and target metric values.]
(Source metric: RFC, the number of methods invoked by a class; target metric: the number of operands)
Matching Score = 0.91
AUC = 0.946 (ant-1.3 → ar5)
Matched Metrics (Loss)

[Figure: distributions of the matched source and target metric values.]
(Source metric: LOC; target metric: average number of LOC in a method)
Matching Score = 0.13
AUC = 0.391 (Safe → velocity-1.4)
Different Feature Selections (median AUCs, Win%)

              Against WPDP      Against CPDP-CM   Against CPDP-IFS   HDP
Approach      AUC     Win%      AUC     Win%      AUC     Win%       AUC
Gain Ratio    0.657   63.7%     0.645   63.2%     0.536   80.2%      0.720
Chi-Square    0.657   64.7%     0.651   66.4%     0.556   82.3%      0.727
Significance  0.657   66.2%     0.636   66.2%     0.553   82.0%      0.724
Relief-F      0.670   57.0%     0.657   63.1%     0.543   80.5%      0.709
None          0.657   47.3%     0.624   50.3%     0.536   66.3%      0.663
Results in Different Cutoffs

        Against WPDP     Against CPDP-CM   Against CPDP-IFS   HDP      Target
Cutoff  AUC     Win%     AUC     Win%      AUC     Win%       AUC      Coverage
0.05    0.657   66.2%    0.636   66.2%     0.553   82.4%      0.724*   100%
0.90    0.657   100%     0.761   71.4%     0.624   100%       0.852*   21%
Conclusion
• HDP
  – Potential for CPDP across datasets with different metric sets.
• Future work
  – Filtering out noisy metric matchings
  – Determining the best probability threshold
Q&A
THANK YOU!

Editor's Notes

  1. Oggioni Room, 17:00 (session 16:30 – 18:00).
  2. Here is Project A with some software entities; let's say these entities are source code files. We want to predict whether each file is buggy or clean, and for that we need a prediction model. Since defect prediction models are trained by machine learning algorithms, we need labeled instances collected from previous releases. A labeled instance consists of features and a label. Various software metrics, such as LOC, the number of functions in a file, and the number of authors touching a source file, are used as features; software metrics measure the complexity of software and its development process. Each instance can be labeled using past bug information. Both the metrics and the past bug information can be collected from software archives such as version control systems and bug report systems. With these labeled instances, we can build a prediction model and predict the unlabeled instances. Because this prediction is conducted within the same project, we call it within-project defect prediction (WPDP). There are many studies on WPDP, and they show good prediction performance (e.g., prediction accuracy around 0.7).
  3. What if there are no labeled instances? This can happen in new projects and in projects lacking historical data. New projects have no past defect information with which to label instances; other projects lack defect information because their software archives are incomplete. When I participated in an industrial project for Samsung Electronics, it was really difficult to generate labeled instances because the software archives were not well managed by the developers. So, in some real industrial projects, we cannot generate the labeled instances needed to build a prediction model, and without labeled instances no model can be built. After experiencing this limitation in industry, I decided to address the problem.
  4. There are existing solutions for building a prediction model for unlabeled datasets. The first is cross-project defect prediction: we can reuse labeled instances from other projects.
  5. Various feature selection approaches can be applied.
  6. By doing this, we can investigate how higher matching scores impact defect prediction performance.
  7. The 16 distributional characteristics: mode, median, mean, harmonic mean, minimum, maximum, range, variation ratio, first quartile, third quartile, interquartile range, variance, standard deviation, coefficient of variance, skewness, and kurtosis.
  8. AEEEM: object-oriented (OO) metrics, previous-defect metrics, entropy metrics of change and code, and churn-of-source-code metrics [4]. MORPH: McCabe's cyclomatic metrics, CK metrics, and other OO metrics [36]. ReLink: code complexity metrics. NASA: Halstead metrics and McCabe's cyclomatic metrics, plus additional complexity metrics such as parameter count and percentage of comments. SOFTLAB: Halstead metrics and McCabe's cyclomatic metrics.
  9. All 222 prediction combinations among the 600 predictions.