Survey on
Software Defect Prediction
- PhD Qualifying Examination -
July 3, 2014
Jaechang Nam
Department of Computer Science and Engineering
HKUST
Outline
• Background
• Software Defect Prediction Approaches
– Simple metric and defect estimation models
– Complexity metrics and Fitting models
– Prediction models
– Just-In-Time Prediction Models
– Practical Prediction Models and Applications
– History Metrics from Software Repositories
– Cross-Project Defect Prediction and Feasibility
• Summary and Challenging Issues
2
Motivation
• General question of software defect
prediction
– Can we identify defect-prone entities (source
code file, binary, module, change,...) in advance?
• # of defects
• buggy or clean
• Why?
– Quality assurance for large software
(Akiyama@IFIP’71)
– Effective resource allocation
• Testing (Menzies@TSE`07)
• Code review (Rahman@FSE’11)
3
Ground Assumption
• The more complex, the more defect-prone
4
Two Focuses on Defect
Prediction
• How complex are the software and its process?
– Metrics
• How can we predict whether software has
defects?
– Models based on the metrics
5
Prediction Performance Goal
• Recall vs. Precision
• Strong predictor criteria
– 70% recall and 25% false positive rate
(Menzies@TSE`07)
– Precision, recall, accuracy ≥ 75%
(Zimmermann@FSE`09)
6
Outline
• Background
• Software Defect Prediction Approaches
– Simple metric and defect estimation models
– Complexity metrics and Fitting models
– Prediction models
– Just-In-Time Prediction Models
– Practical Prediction Models and Applications
– History Metrics from Software Repositories
– Cross-Project Defect Prediction and Feasibility
• Summary and Challenging Issues
7
Defect Prediction Approaches
[Timeline diagram, 1970s-2010s, with rows for Metrics, Models, and Others; shown: LOC, Simple Model]
Identifying Defect-prone Entities
• Akiyama’s equation (Akiyama@IFIP`71)
– # of defects = 4.86 + 0.018 * LOC (LOC = Lines Of Code; see the sketch below)
• 23 defects in 1 KLOC
• Derived from actual systems
• Limitation
– LOC alone is not enough to capture software complexity
9
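A minimal sketch (my own illustration, not from the original slides) of Akiyama's size-based estimate, assuming LOC has already been measured per module:

```python
# Akiyama's size-based defect estimate (Akiyama@IFIP`71).
def akiyama_defects(loc: int) -> float:
    """Estimated # of defects for a module with the given Lines Of Code."""
    return 4.86 + 0.018 * loc

print(akiyama_defects(1000))  # ~22.9, i.e. roughly 23 defects per KLOC
```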
Defect Prediction Approaches
[Timeline diagram, 1970s-2010s, with rows for Metrics, Models, and Others; shown: LOC, Cyclomatic Metric, Halstead Metrics, Simple Model, Fitting Model]
Complexity Metrics and Fitting
Models
• Cyclomatic complexity metrics (McCabe`76)
– “Logical complexity” of a program represented in
control flow graph
– V(G) = #edge – #node + 2
• Halstead complexity metrics (Halstead`77)
– Metrics based on # of operators and operands
– Volume = N * log2(n) (N: total occurrences, n: distinct operators and operands)
– # of defects = Volume / 3000 (see the sketch below)
11
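A minimal sketch of the two formulas above, assuming the control flow graph and the operator/operand counts have already been extracted (the helper names are mine):

```python
import math

def cyclomatic_complexity(num_edges: int, num_nodes: int) -> int:
    """McCabe: V(G) = #edges - #nodes + 2 of the control flow graph."""
    return num_edges - num_nodes + 2

def halstead_defect_estimate(n_total: int, n_distinct: int) -> float:
    """Halstead: Volume = N * log2(n); estimated # of defects = Volume / 3000.
    N = total operator/operand occurrences, n = distinct operators/operands."""
    volume = n_total * math.log2(n_distinct)
    return volume / 3000
```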
Complexity Metrics and Fitting
Models
• Limitation
– Do not capture complexity (amount) of change.
– Just fitting models, not prediction models, in most studies conducted in the 1970s and early 1980s
• Correlation analysis between metrics and # of defects
– By linear regression models
• Models were not validated for new entities (modules).
12
Defect Prediction Approaches
[Timeline diagram, 1970s-2010s, with rows for Metrics, Models, and Others; shown: LOC, Cyclomatic Metric, Halstead Metrics, Process Metrics, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification)]
Regression Model
• Shen et al.’s empirical study (Shen@TSE`85)
– Linear regression model
– Validated on actual new modules
– Metrics
• Halstead, # of conditional statements
• Process metrics
– Delta of complexity metrics between two successive system versions
– Measures
• Between actual and predicted # of defects on new modules
– MRE (Mean magnitude of relative error; see the sketch below)
» average of |D-D’| / D over all modules
• D: actual # of defects
• D’: predicted # of defects
» MRE = 0.48
14
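A minimal sketch of the MRE measure, assuming lists of actual and predicted defect counts per module (the numbers in the example are made up):

```python
def mre(actual, predicted):
    """Mean magnitude of relative error between actual (D) and predicted (D') counts."""
    return sum(abs(d - d_hat) / d for d, d_hat in zip(actual, predicted)) / len(actual)

print(mre([10, 4, 8], [7, 5, 6]))  # ~0.27 for these hypothetical modules
```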
Classification Model
• Discriminant analysis by Munson et al.
(Munson@TSE`92)
• Logistic regression
• High risk vs. low risk modules
• Metrics
– Halstead and Cyclomatic complexity metrics
• Measure
– Type I error: False positive rate
– Type II error: False negative rate
• Result
– Accuracy: 92% (6 misclassifications out of 78 modules)
– Precision: 85%
– Recall: 73%
– F-measure: 88%
15
Defect Prediction Process
(Based on Machine Learning)
16
[Process diagram: mine software archives to generate instances with metrics (features) and labels → preprocessing → training instances → build a classification / regression model → predict labels (?) for new instances]
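A minimal sketch of the process above using scikit-learn, with synthetic data as a stand-in for instances mined from software archives (this is my own illustration, not the toolchain used in the surveyed papers):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

# Stand-in for instances with metrics (features) and buggy/clean labels.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Split into training instances and "new" instances to predict.
X_train, X_new, y_train, y_new = train_test_split(X, y, test_size=0.3, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)  # build a model
print(f1_score(y_new, model.predict(X_new)))                     # evaluate the predictions
```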
Defect Prediction
(Based on Machine Learning)
• Limitations
– Limited resources for process metrics
• Error fixes in the unit testing phase were conducted informally by individual developers, so no error information was available for this phase. (Shen@TSE`85)
– Existing metrics were not enough to capture
complexity of object-oriented (OO) programs.
– Helpful for quality assurance team but not for
individual developers
17
Defect Prediction Approaches
[Timeline diagram, 1970s-2010s, with rows for Metrics, Models, and Others; shown: LOC, Cyclomatic Metric, Halstead Metrics, Process Metrics, CK Metrics, History Metrics, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Just-In-Time Prediction Model, Practical Model and Applications]
Defect Prediction Approaches
[Timeline diagram, 1970s-2010s, with rows for Metrics, Models, and Others; shown: LOC, Cyclomatic Metric, Halstead Metrics, Process Metrics, CK Metrics, History Metrics, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Just-In-Time Prediction Model, Practical Model and Applications]
Risk Prediction of Software
Changes
(Mockus@BLTJ`00)
• Logistic regression
• Change metrics
– LOC added/deleted/modified
– Diffusion of change
– Developer experience
• Result
– Both false positive and false negative rate: 20% in
the best case
20
Risk Prediction of Software
Changes
(Mockus@BLTJ`00)
• Advantage
– Showed a feasible model in practice
• Limitation
– Conducted 3 times per week
• Not fully Just-In-Time
– Validated on one commercial system (5ESS
switching system software)
21
BugCache (Kim@ICSE`07)
• Maintain defect-prone entities in a cache
• Approach
• Result
– Top 10% files account for 73-95% of defects on 7
systems
22
BugCache (Kim@ICSE`07)
• Advantages
– Cache can be updated quickly with less cost. (c.f. static
models based on machine learning)
– Just-In-Time: always available whenever QA teams want
to get the list of defect-prone entities
• Limitations
– Cache is not reusable for other software projects.
– Designed for QA teams
• Applicable only in a certain time point after a bunch of changes
(e.g., end of a sprint)
• Still limited for individual developers in development phase
23
Change Classification (Kim@TSE`08)
• Classification model based on SVM
• About 11,500 features
– Change metadata such as changed LOC, change count
– Complexity metrics
– Text features from change log messages, source code,
and file names
• Results
– 78% accuracy and 60% recall on average from 12 open-
source projects
24
Change Classification (Kim@TSE`08)
• Limitations
– Heavy model (11,500 features)
– Not validated on commercial software products.
25
Follow-up Studies
• Studies addressing limitations
– “Reducing Features to Improve Code Change-Based Bug
Prediction” (Shivaji@TSE`13)
• With less than 10% of all features, buggy F-measure is 21%
improved.
– “Software Change Classification using Hunk Metrics”
(Ferzund@ICSM`09)
• 27 hunk-level metrics for change classification
• 81% accuracy, 77% buggy hunk precision, and 67% buggy hunk
recall
– “A large-scale empirical study of just-in-time quality
assurance” (Kamei@TSE`13)
• 14 process metrics (mostly from Mockus`00)
• 68% accuracy, 64% recall on 11 open-source and commercial
projects
– “An Empirical Study of Just-In-Time Defect Prediction
Using Cross-Project Models” (Fukushima@MSR`14)
• Median AUC: 0.72
26
Challenges of JIT model
• Practical validation is difficult
– Just 10-fold cross validation in current literature
– No validation on real scenario
• e.g., online machine learning
• Still difficult to review huge changes
– Fine-grained prediction within a change
• e.g., Line-level prediction
27
Next Steps of Defect Prediction
[Timeline diagram, 1980s-2020s, with rows for Metrics, Models, and Others; shown: Prediction Model (Regression), Prediction Model (Classification), Just-In-Time Prediction Model, Online Learning JIT Model, Fine-grained Prediction, Process Metrics]
Defect Prediction Approaches
[Timeline diagram, 1970s-2010s, with rows for Metrics, Models, and Others; shown: LOC, Cyclomatic Metric, Halstead Metrics, Process Metrics, CK Metrics, History Metrics, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Just-In-Time Prediction Model, Practical Model and Applications]
Defect Prediction in Industry
• “Predicting the location and number of faults in
large software systems” (Ostrand@TSE`05)
– Two industrial systems
– Recall 86%
– 20% most fault-prone modules account for 62% faults
30
Case Study for Practical Model
• “Does Bug Prediction Support Human Developers?
Findings From a Google Case Study” (Lewis@ICSE`13)
– No identifiable change in developer behaviors after using
defect prediction model
• Required characteristics but very challenging
– Actionable messages / obvious reasoning
31
Next Steps of Defect Prediction
[Timeline diagram, 1980s-2020s, with rows for Metrics, Models, and Others; shown: Prediction Model (Regression), Prediction Model (Classification), Just-In-Time Prediction Model, Practical Model and Applications, Actionable Defect Prediction, Process Metrics]
Evaluation Measure for Practical
Model
• Measure prediction performance based on
code review effort
• AUCEC (Area Under Cost Effectiveness Curve)
33
[Cost-effectiveness curve: x-axis percent of LOC inspected (10%, 50%, 100%), y-axis percent of bugs found (0-100%); two prediction models M1 and M2 compared]
Rahman@FSE`11, Bugcache for inspections: Hit or miss?
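A minimal sketch of AUCEC, assuming per-file LOC, bug counts, and predicted risk scores are available (array names and the example values are mine):

```python
import numpy as np

def aucec(loc, bugs, risk, budget=1.0):
    """Area under the %-bugs-found vs. %-LOC-inspected curve, inspecting
    files in decreasing predicted-risk order up to the given LOC budget."""
    order = np.argsort(-np.asarray(risk, dtype=float))
    loc_frac = np.cumsum(np.asarray(loc, dtype=float)[order]) / np.sum(loc)
    bug_frac = np.cumsum(np.asarray(bugs, dtype=float)[order]) / np.sum(bugs)
    mask = loc_frac <= budget
    x = np.concatenate(([0.0], loc_frac[mask]))
    y = np.concatenate(([0.0], bug_frac[mask]))
    return np.trapz(y, x)

print(aucec(loc=[100, 400, 500], bugs=[5, 3, 2], risk=[0.9, 0.5, 0.1]))
```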
Practical Application
• What else can we do more with defect
prediction models?
– Test case selection on regression testing
(Engstrom@ICST`10)
– Prioritizing warnings from FindBugs
(Rahman@ICSE`14)
34
Defect Prediction Approaches
[Timeline diagram, 1970s-2010s, with rows for Metrics, Models, and Others; shown: LOC, Cyclomatic Metric, Halstead Metrics, Process Metrics, CK Metrics, History Metrics, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Just-In-Time Prediction Model, Practical Model and Applications]
Representative OO Metrics
Metric | Description
WMC | Weighted Methods per Class (# of methods)
DIT | Depth of Inheritance Tree (# of ancestor classes)
NOC | Number of Children
CBO | Coupling Between Objects (# of coupled classes)
RFC | Response For a Class (WMC + # of methods called by the class)
LCOM | Lack of Cohesion in Methods (# of "connected components")
36
• CK metrics (Chidamber&Kemerer@TSE`94)
• Prediction Performance of CK vs. code
(Basili@TSE`96)
– F-measure: 70% vs. 60%
Defect Prediction Approaches
[Timeline diagram, 1970s-2010s, with rows for Metrics, Models, and Others; shown: LOC, Cyclomatic Metric, Halstead Metrics, Process Metrics, CK Metrics, History Metrics, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Just-In-Time Prediction Model, Practical Model and Applications]
Representative History Metrics
38
Name | # of metrics | Metric source | Citation
Relative code change churn | 8 | SW Repo.* | Nagappan@ICSE`05
Change | 17 | SW Repo. | Moser@ICSE`08
Change Entropy | 1 | SW Repo. | Hassan@ICSE`09
Code metric churn / Code Entropy | 2 | SW Repo. | D’Ambros@MSR`10
Popularity | 5 | Email archive | Bacchelli@FASE`10
Ownership | 4 | SW Repo. | Bird@FSE`11
Micro Interaction Metrics (MIM) | 56 | Mylyn | Lee@FSE`11
* SW Repo. = version control system + issue tracking system
Representative History Metrics
• Advantage
– Better prediction performance than code metrics
39
[Bar chart: performance improvement (%) of history metrics over code complexity metrics, roughly 0-60%, for Moser`08 and Hassan`09 (F-measure), D'Ambros`10 (absolute prediction error), and Bacchelli`10, Bird`11, Lee`11 (Spearman correlation). *Bird`11's results are from two metrics vs. code metrics; no comparison data in Nagappan`05]
History Metrics
• Limitations
– History metrics do not capture particular program characteristics such as developer social networks, component networks, and anti-patterns.
– Noisy data
• Bias in bug-fix datasets (Bird@FSE`09)
– Not applicable to new projects or projects lacking historical data
40
Defect Prediction Approaches
[Timeline diagram, 1970s-2010s, with rows for Metrics, Models, and Others; shown: LOC, Cyclomatic Metric, Halstead Metrics, Process Metrics, CK Metrics, History Metrics, Other Metrics, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Just-In-Time Prediction Model, Practical Model and Applications, Cross-Project Prediction, Universal Model, Cross-Project Feasibility, Noise Reduction, Semi-supervised/active]
Defect Prediction Approaches
[Timeline diagram, 1970s-2010s, with rows for Metrics, Models, and Others; shown: LOC, Cyclomatic Metric, Halstead Metrics, Process Metrics, CK Metrics, History Metrics, Other Metrics, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Just-In-Time Prediction Model, Practical Model and Applications, Cross-Project Prediction, Universal Model, Cross-Project Feasibility, Noise Reduction, Semi-supervised/active]
Other Metrics
43
Name | # of metrics | Metric source | Citation
Component network | 28 | Binaries (Windows Server 2003) | Zimmermann@ICSE`08
Developer-Module network | 9 | SW Repo. + Binaries | Pinzger@FSE`08
Developer social network | 4 | SW Repo. | Meneely@FSE`08
Anti-pattern | 4 | SW Repo. + Design-pattern | Taba@ICSM`13
* SW Repo. = version control system + issue tracking system
Defect Prediction Approaches
[Timeline diagram, 1970s-2010s, with rows for Metrics, Models, and Others; shown: LOC, Cyclomatic Metric, Halstead Metrics, Process Metrics, CK Metrics, History Metrics, Other Metrics, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Just-In-Time Prediction Model, Practical Model and Applications, Cross-Project Prediction, Universal Model, Cross-Project Feasibility, Noise Reduction, Semi-supervised/active]
Noise Reduction
• Noise detection and elimination algorithm
(Kim@ICSE`11)
– Closest List Noise Identification (CLNI)
• Based on Euclidean distance between instances
– Average F-measure improvement
• 0.504 → 0.621
• ReLink (Wu@FSE`11)
– Recover missing links between bugs and changes
– 60% → 78% recall for missing links
– F-measure improvement
• e.g. 0.698 (traditional) → 0.731 (ReLink)
45
Defect Prediction Approaches
[Timeline diagram, 1970s-2010s, with rows for Metrics, Models, and Others; shown: LOC, Cyclomatic Metric, Halstead Metrics, Process Metrics, CK Metrics, History Metrics, Other Metrics, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Just-In-Time Prediction Model, Practical Model and Applications, Cross-Project Prediction, Universal Model, Cross-Project Feasibility, Semi-supervised/active]
Defect Prediction for New Software
Projects
• Universal Defect Prediction Model
• Semi-supervised / active learning
• Cross-Project Defect Prediction
47
Universal Defect Prediction Model
(Zhang@MSR`14)
• Context-aware rank transformation
– Transform metric values into ranks from 1 to 10 across all
projects.
• Model built from 1,398 projects collected from
SourceForge and Google Code
48
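A minimal sketch of a rank transformation in that spirit, mapping one metric's raw values onto ranks 1-10 via decile cut-offs (my own simplification, not the authors' exact context-aware procedure):

```python
import numpy as np

def rank_transform(values, n_ranks=10):
    """Map raw metric values to ranks 1..n_ranks using quantile boundaries."""
    values = np.asarray(values, dtype=float)
    cutoffs = np.quantile(values, np.linspace(0, 1, n_ranks + 1)[1:-1])
    return np.searchsorted(cutoffs, values, side="right") + 1

print(rank_transform([1, 2, 3, 5, 8, 13, 21, 34, 55, 89]))  # ranks 1..10
```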
Defect Prediction Approaches
[Timeline diagram, 1970s-2010s, with rows for Metrics, Models, and Others; shown: LOC, Cyclomatic Metric, Halstead Metrics, Process Metrics, CK Metrics, History Metrics, Other Metrics, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Just-In-Time Prediction Model, Practical Model and Applications, Cross-Project Prediction, Universal Model, Cross-Project Feasibility, Semi-supervised/active]
Other approaches for CPDP
• Semi-supervised learning with dimension
reduction for defect prediction (Lu@ASE`12)
– Training a model by a small set of labeled
instances together with many unlabeled
instances
– AUC improvement
• 0.83 → 0.88 with 2% labeled instances
• Sample-based semi-supervised/active
learning for defect prediction (Li@AESEJ`12)
– Average F-measure
• 0.628 → 0.685 with 10% sampled instances
50
Defect Prediction Approaches
[Timeline diagram, 1970s-2010s, with rows for Metrics, Models, and Others; shown: LOC, Cyclomatic Metric, Halstead Metrics, Process Metrics, CK Metrics, History Metrics, Other Metrics, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Just-In-Time Prediction Model, Practical Model and Applications, Cross-Project Prediction, Universal Model, Cross-Project Feasibility, Semi-supervised/active]
Cross-Project Defect Prediction
(CPDP)
• For a new project or a project lacking historical data
52
[Diagram: a model is trained on Project A and used to test Project B, whose labels are unknown (?)]
Only 2% out of 622 prediction combinations worked. (Zimmermann@FSE`09)
Transfer Learning (TL)
[Diagram: traditional machine learning (ML) trains a separate learning system for each domain; transfer learning transfers knowledge from a source learning system to the target learning system]
Pan et al.@TNN`10, Domain Adaptation via Transfer Component Analysis
CPDP
54
• Adopting transfer learning
Transfer learning | Metric Compensation | NN Filter | TNB | TCA+
Preprocessing | N/A | Feature selection, Log-filter | Log-filter | Normalization
Machine learner | C4.5 | Naive Bayes | TNB | Logistic Regression
# of Subjects | 2 | 10 | 10 | 8
# of predictions | 2 | 10 | 10 | 26
Avg. f-measure | 0.67 (W:0.79, C:0.58) | 0.35 (W:0.37, C:0.26) | 0.39 (NN:0.35, C:0.33) | 0.46 (W:0.46, C:0.36)
Citation | Watanabe@PROMISE`08 | Turhan@ESEJ`09 | Ma@IST`12 | Nam@ICSE`13
* NN = Nearest neighbor, W = Within, C = Cross
Metric Compensation
(Watanabe@PROMISE`08)
• Key idea
• New target metric value =
target metric value × (average source metric value / average target metric value)
(see the sketch below)
55
[Diagram: source, target, and new (compensated) target metric distributions]
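A minimal sketch of the compensation formula, assuming 2-D numpy arrays of shape (instances, metrics); the array names and example values are mine:

```python
import numpy as np

def compensate(target_metrics, source_metrics):
    """Rescale each target metric by (source mean / target mean)."""
    ratio = source_metrics.mean(axis=0) / target_metrics.mean(axis=0)
    return target_metrics * ratio

src = np.array([[10.0, 2.0], [30.0, 4.0]])
tgt = np.array([[1.0, 20.0], [3.0, 40.0]])
print(compensate(tgt, src))  # target values rescaled toward the source's scale
```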
Metric Compensation (cont.)
(Watanabe@PROMISE`08)
56
[Comparison table from slide 54 repeated: Metric Compensation | NN Filter | TNB | TCA+]
NN filter
(Turhan@ESEJ`09)
• Key idea
• Nearest neighbor filter
– Select the 10 nearest source instances for each target instance
57
[Diagram: each target instance picks similar source instances ("Hey, you look like me! Could you be my model?") to form a new, filtered source set]
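A minimal sketch of the NN filter, assuming numeric feature matrices for source and target (the function and variable names are mine):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def nn_filter(source_X, target_X, k=10):
    """Return indices of source instances that are among the k nearest
    neighbors (Euclidean distance) of at least one target instance."""
    nn = NearestNeighbors(n_neighbors=k).fit(source_X)
    _, idx = nn.kneighbors(target_X)   # k nearest source rows per target row
    return np.unique(idx.ravel())      # indices of the filtered training set
```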
NN filter (cont.)
(Turhan@ESEJ`09)
58
[Comparison table from slide 54 repeated: Metric Compensation | NN Filter | TNB | TCA+]
Transfer Naive Bayes
(Ma@IST`12)
• Key idea
– Provide more weight to similar source instances when building a Naive Bayes model
59
[Diagram: source instances resembling the target ("Hey, you look like me! You will get more chance to be my best model!", "Please, consider me more important than other instances") receive higher weight when the model is built]
Transfer Naive Bayes (cont.)
(Ma@IST`12)
• Transfer Naive Bayes
– New prior probability
– New conditional probability
60
Transfer Naive Bayes (cont.)
(Ma@IST`12)
• How to find similar source instances for target
– A similarity score
– A weight value
61
| F1 | F2 | F3 | F4 | Score (si)
Max of target | 7 | 3 | 2 | 5 | -
src. inst 1 | 5 | 4 | 2 | 2 | 3
src. inst 2 | 0 | 2 | 5 | 9 | 1
Min of target | 1 | 2 | 0 | 1 | -
k = # of features, si = score of instance i
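A minimal sketch of the similarity score illustrated in the table above: a source instance scores one point for every feature value that falls inside the target's [min, max] range (the weight used by TNB is then derived from this score):

```python
import numpy as np

def similarity_scores(source_X, target_X):
    """si = # of features of a source instance within the target's [min, max]."""
    lo, hi = target_X.min(axis=0), target_X.max(axis=0)
    return ((source_X >= lo) & (source_X <= hi)).sum(axis=1)

# Rows chosen so the per-feature max/min match the table above.
target = np.array([[7, 3, 2, 5], [1, 2, 0, 1]])
source = np.array([[5, 4, 2, 2], [0, 2, 5, 9]])
print(similarity_scores(source, target))  # [3 1], as in the table
```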
Transfer Naive Bayes (cont.)
(Ma@IST`12)
62
[Comparison table from slide 54 repeated: Metric Compensation | NN Filter | TNB | TCA+]
TCA+
(Nam@ICSE`13)
• Key idea
– TCA (Transfer Component Analysis)
63
[Diagram: the source and target distributions differ ("Oops, we are different! Let's meet in another world!"); TCA maps both into a new shared space, producing a new source and a new target]
Transfer Component Analysis (cont.)
• Feature extraction approach
– Dimensionality reduction
– Projection
• Map original data into a lower-dimensional feature space
64
[Diagram: data in a 2-dimensional feature space projected onto a 1-dimensional feature space]
TCA (cont.)
65
[Scatter plot: source and target domain data before TCA]
Pan et al.@TNN`10, Domain Adaptation via Transfer Component Analysis
TCA (cont.)
66
[Scatter plot: source and target domain data after TCA, mapped into the shared transfer component space]
Pan et al.@TNN`10, Domain Adaptation via Transfer Component Analysis
TCA+
(Nam@ICSE`13)
67
[Diagram contrasting TCA and TCA+: with TCA alone the mapped source and target are "still a bit different"; TCA+ adds a normalization step ("Normalize us together!") before applying TCA]
Normalization Options
• NoN: No normalization applied
• N1: Min-max normalization (max=1, min=0)
• N2: Z-score normalization (mean=0, std=1)
• N3: Z-score normalization only using source mean
and standard deviation
• N4: Z-score normalization only using target mean
and standard deviation
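A minimal sketch of the four normalization options, assuming 2-D numpy feature matrices for the source and target (the function is my own illustration):

```python
import numpy as np

def normalize(src, tgt, option="N2"):
    """Apply one of the normalization options used by TCA+ to source/target features."""
    if option == "NoN":                                    # no normalization
        return src, tgt
    if option == "N1":                                     # min-max to [0, 1], per dataset
        mm = lambda X: (X - X.min(0)) / (X.max(0) - X.min(0))
        return mm(src), mm(tgt)
    if option == "N2":                                     # z-score, each with its own mean/std
        z = lambda X: (X - X.mean(0)) / X.std(0)
        return z(src), z(tgt)
    mean, std = (src.mean(0), src.std(0)) if option == "N3" else (tgt.mean(0), tgt.std(0))
    return (src - mean) / std, (tgt - mean) / std          # N3: source stats, N4: target stats
```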
Preliminary Results using TCA
69
[Bar chart: F-measure (0 to 0.8) of cross-project prediction for Project A → Project B and Project B → Project A under Baseline, NoN, N1, N2, N3, and N4. *Baseline: cross-project defect prediction without TCA and normalization]
Prediction performance of TCA varies according to different normalization options!
TCA+: Decision Rules
• Find a suitable normalization for TCA
• Steps
– #1: Characterize a dataset
– #2: Measure similarity
between source and target datasets
– #3: Decision rules
70
TCA+: #1. Characterize a
Dataset
71
[Diagram: for each dataset (A and B), compute the pairwise distances d_ij between all of its instances]
DIST = { d_ij : 1 ≤ i < j ≤ n }
TCA+: #2. Measure Similarity
between Source and Target
• Minimum (min) and maximum (max) values of
DIST
• Mean and standard deviation (std) of DIST
• The number of instances
72
TCA+: #3. Decision Rules
• Rule #1
– Mean and Std are same → NoN
• Rule #2
– Max and Min are different → N1 (max=1, min=0)
• Rule #3, #4
– Std and # of instances are different → N3 or N4 (src/tgt mean=0, std=1)
• Rule #5
– Default → N2 (mean=0, std=1)
(see the sketch below)
73
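A minimal sketch of the rule cascade, assuming the DIST characteristics (mean, std, min, max, # of instances) have been computed for both datasets; the similarity tolerance here is a placeholder of mine, not the paper's exact criterion:

```python
def choose_normalization(src, tgt, tol=0.05):
    """src/tgt: dicts with 'mean', 'std', 'min', 'max', 'n' of their DIST values."""
    def same(key):
        a, b = src[key], tgt[key]
        return abs(a - b) <= tol * max(abs(a), abs(b), 1e-9)

    if same("mean") and same("std"):
        return "NoN"            # Rule #1
    if not same("max") and not same("min"):
        return "N1"             # Rule #2
    if not same("std") and not same("n"):
        return "N3 or N4"       # Rules #3/#4 (z-score with source or target stats)
    return "N2"                 # Rule #5: default
```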
TCA+ (cont.)
(Nam@ICSE`13)
74
[Comparison table from slide 54 repeated: Metric Compensation | NN Filter | TNB | TCA+]
Current CPDP using TL
• Advantages
– Comparable prediction performance to within-prediction
models
– Benefit from the state-of-the-art TL approaches
• Limitation
– Performance of some cross-prediction pairs is still poor.
(Negative Transfer)
75
[Diagram: a mismatched source-target pair illustrating negative transfer]
Defect Prediction Approaches
[Timeline diagram, 1970s-2010s, with rows for Metrics, Models, and Others; shown: LOC, Cyclomatic Metric, Halstead Metrics, Process Metrics, CK Metrics, History Metrics, Other Metrics, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Just-In-Time Prediction Model, Practical Model and Applications, Cross-Project Prediction, Universal Model, Cross-Project Feasibility, Semi-supervised/active]
Feasibility Evaluation for CPDP
• Solution for negative transfer
– Decision tree using project characteristic metrics
(Zimmermann@FSE`09)
• E.g. programming language, # developers, etc.
77
Follow-up Studies
• “An investigation on the feasibility of cross-project
defect prediction.” (He@ASEJ`12)
– Decision tree using distributional characteristics of a
dataset, e.g., mean, skewness, peakedness, etc.
78
Feasibility for CPDP
• Challenges on current studies
– Decision trees were not evaluated properly.
• Just fitting model
– Low target prediction coverage
• 5 out of 34 target projects were feasible for cross-
predictions (He@ASEJ`12)
79
Next Steps of Defect Prediction
[Timeline diagram, 1980s-2020s, with rows for Metrics, Models, and Others; shown: CK Metrics, Process Metrics, History Metrics, Other Metrics, Prediction Model (Regression), Prediction Model (Classification), Just-In-Time Prediction Model, Practical Model and Applications, Cross-Project Prediction, Universal Model, Cross-Prediction Feasibility Model, Cross-Project Feasibility, Semi-supervised/active]
Defect Prediction Approaches
[Timeline diagram, 1970s-2010s, with rows for Metrics, Models, and Others; shown: LOC, Cyclomatic Metric, Halstead Metrics, Process Metrics, CK Metrics, History Metrics, Other Metrics, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Just-In-Time Prediction Model, Practical Model and Applications, Cross-Project Prediction, Universal Model, Cross-Project Feasibility, Personalized Model]
Cross-prediction Model
• Common challenge
– Current cross-prediction models are limited to datasets with the same metric set
– Not applicable to projects with different feature spaces (different domains)
• NASA Dataset: Halstead, LOC
• Apache Dataset: LOC, Cyclomatic, CK metrics
82
[Diagram: source and target datasets with different metric sets]
Next Steps of Defect Prediction
[Timeline diagram, 1980s-2020s, with rows for Metrics, Models, and Others; shown: CK Metrics, Process Metrics, History Metrics, Other Metrics, Prediction Model (Regression), Prediction Model (Classification), Just-In-Time Prediction Model, Practical Model and Applications, Cross-Project Prediction, Universal Model, Cross-Domain Prediction, Cross-Project Feasibility, Noise Reduction, Semi-supervised/active, Personalized Model]
Other Topics
84
Defect Prediction Approaches
[Timeline diagram, 1970s-2010s, with rows for Metrics, Models, and Others; shown: LOC, Cyclomatic Metric, Halstead Metrics, Process Metrics, CK Metrics, History Metrics, Other Metrics, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Just-In-Time Prediction Model, Practical Model and Applications, Cross-Project Prediction, Universal Model, Cross-Project Feasibility, Data Privacy, Noise Reduction, Semi-supervised/active, Personalized Model]
Other Topics
• Privacy issue on defect datasets
– MORPH (Peters@ICSE`12)
• Mutate defect datasets while keeping prediction accuracy
• Can accelerate cross-project defect prediction with
industrial datasets
• Personalized defect prediction model (Jiang@ASE`13)
– “Different developers have different coding styles,
commit frequencies, and experience levels, all of which
cause different defect patterns.”
– Results
• Average F-measure: 0.62 (personalized models) vs. 0.59 (non-
personalized models)
86
Outline
• Background
• Software Defect Prediction Approaches
– Simple metric and defect estimation models
– Complexity metrics and Fitting models
– Prediction models
– Just-In-Time Prediction Models
– Practical Prediction Models and Applications
– History Metrics from Software Repositories
– Cross-Project Defect Prediction and Feasibility
• Summary and Challenging Issues
87
Defect Prediction Approaches
[Timeline diagram, 1970s-2010s, with rows for Metrics, Models, and Others; shown: LOC, Cyclomatic Metric, Halstead Metrics, Process Metrics, CK Metrics, History Metrics, Other Metrics, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Just-In-Time Prediction Model, Practical Model and Applications, Cross-Project Prediction, Universal Model, Cross-Project Feasibility, Data Privacy, Noise Reduction, Semi-supervised/active, Personalized Model]
Next Steps of Defect Prediction
[Timeline diagram, 1980s-2020s, with rows for Metrics, Models, and Others; shown: CK Metrics, Process Metrics, History Metrics, Other Metrics, Prediction Model (Regression), Prediction Model (Classification), Just-In-Time Prediction Model, Online Learning JIT Model, Practical Model and Applications, Actionable Defect Prediction, Cross-Project Prediction, Universal Model, Cross-Prediction Feasibility Model, Cross-Domain Prediction, Fine-grained Prediction, Cross-Project Feasibility, Data Privacy, Noise Reduction, Semi-supervised/active, Personalized Model]
Thank you!
90
91
Evaluation Measures
(classification)
• Measures for binary classification
– Confusion matrix
92
Actual \ Predicted | Buggy | Clean
Buggy | True Positive (TP) | False Negative (FN)
Clean | False Positive (FP) | True Negative (TN)
Evaluation Measures
(classification)
• False positive rate (FPR,PF) =
FP/(TN+FP)
• Accuracy = (TP+TN)/(TP+FP+TN+FN)
• Precision = TP/(TP+FP)
• Recall = TP/(TP+FN)
• F-measure = (2 * Precision * Recall) / (Precision + Recall)
93
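A minimal sketch computing the measures above from confusion-matrix counts (the counts in the example are made up):

```python
def measures(tp, fp, tn, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                       # a.k.a. true positive rate
    return {
        "FPR": fp / (tn + fp),                    # false positive rate (PF)
        "Accuracy": (tp + tn) / (tp + fp + tn + fn),
        "Precision": precision,
        "Recall": recall,
        "F-measure": 2 * precision * recall / (precision + recall),
    }

print(measures(tp=30, fp=10, tn=50, fn=10))
```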
Evaluation Measures
(classification)
• AUC (Area Under receiver operating characteristic
Curve)
94
[ROC curve: x-axis false positive rate (0 to 1), y-axis true positive rate (0 to 1); AUC is the area under this curve]
Evaluation Measures
(classification)
• AUCEC (Area Under Cost Effectiveness Curve)
95
[Cost-effectiveness curve: x-axis percent of LOC inspected (10%, 50%, 100%), y-axis percent of bugs found (0-100%); two prediction models M1 and M2 compared]
Rahman@FSE`11, Bugcache for inspections: Hit or miss?
Evaluation Measures
(Regression)
• Target
– Metric values vs. the number of bugs
– Actual vs. predicted number of bugs
• Correlation coefficient
– Spearman / Pearson / R²
• Mean squared error
96
CK metrics
Metric | Description
WMC | Weighted Methods per Class (# of methods)
DIT | Depth of Inheritance Tree (# of ancestor classes)
NOC | Number of Children
CBO | Coupling Between Objects (# of coupled classes)
RFC | Response For a Class (WMC + # of methods called by the class)
LCOM | Lack of Cohesion in Methods (# of "connected components")
97
  • 15. Classification Model • Discriminative analysis by Munson et al. (Munson@TSE`92) • Logistic regression • High-risk vs. low-risk modules • Metrics – Halstead and Cyclomatic complexity metrics • Measure – Type I error: False positive rate – Type II error: False negative rate • Result – Accuracy: 92% (6 misclassifications out of 78 modules) – Precision: 85% – Recall: 73% – F-measure: 88% 15
  • 16. Defect Prediction Process (Based on Machine Learning) 16 – (process diagram: generate instances with metrics (features) and labels from software archives, preprocess them into training instances, build a classification/regression model, and classify new, unlabeled instances with that model)
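The pipeline above is essentially a standard supervised-learning loop. The following is a minimal sketch of that loop, not code from the survey; the metric names, data values, and the choice of scikit-learn's LogisticRegression as the learner are illustrative assumptions.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Instances with metrics (features) and labels generated from software archives.
X_train = np.array([[120, 4, 2.5],    # e.g., LOC, cyclomatic complexity, change churn
                    [340, 11, 7.0],
                    [60, 2, 0.5],
                    [500, 20, 9.0]])
y_train = np.array([0, 1, 0, 1])      # 1 = buggy, 0 = clean

model = LogisticRegression().fit(X_train, y_train)   # build a model

X_new = np.array([[200, 8, 3.0]])     # new, unlabeled instances
print(model.predict(X_new), model.predict_proba(X_new)[:, 1])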
  • 17. Defect Prediction (Based on Machine Learning) • Limitations – Limited resources for process metrics • Error fix in unit testing phase was conducted informally by an individual developer (no error information available in this phase). (Shen@TSE`85) – Existing metrics were not enough to capture complexity of object-oriented (OO) programs. – Helpful for quality assurance team but not for individual developers 17
  • 18. Defect Prediction Approaches – (roadmap figure, 1970s–2010s, rows labeled Metrics / Models / Others: LOC, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Cyclomatic Metric, Halstead Metrics, Process Metrics, Just-In-Time Prediction Model, Practical Model and Applications, History Metrics, CK Metrics)
  • 19. Defect Prediction Approaches – (roadmap figure, 1970s–2010s, rows labeled Metrics / Models / Others: LOC, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Cyclomatic Metric, Halstead Metrics, Just-In-Time Prediction Model, Practical Model and Applications, Process Metrics, History Metrics, CK Metrics)
  • 20. Risk Prediction of Software Changes (Mockus@BLTJ`00) • Logistic regression • Change metrics – LOC added/deleted/modified – Diffusion of change – Developer experience • Result – Both false positive and false negative rate: 20% in the best case 20
  • 21. Risk Prediction of Software Changes (Mockus@BLTJ`00) • Advantage – Show the feasible model in practice • Limitation – Conducted 3 times per week • Not fully Just-In-Time – Validated on one commercial system (5ESS switching system software) 21
  • 22. BugCache (Kim@ICSE`07) • Maintain defect-prone entities in a cache • Approach • Result – Top 10% files account for 73-95% of defects on 7 systems 22
  • 23. BugCache (Kim@ICSE`07) • Advantages – Cache can be updated quickly with less cost. (c.f. static models based on machine learning) – Just-In-Time: always available whenever QA teams want to get the list of defect-prone entities • Limitations – Cache is not reusable for other software projects. – Designed for QA teams • Applicable only at a certain point in time after a batch of changes (e.g., end of a sprint) • Still limited for individual developers in the development phase 23
  • 24. Change Classification (Kim@TSE`08) • Classification model based on SVM • About 11,500 features – Change metadata such as changed LOC, change count – Complexity metrics – Text features from change log messages, source code, and file names • Results – 78% accuracy and 60% recall on average from 12 open-source projects 24
  • 25. Change Classification (Kim@TSE`08) • Limitations – Heavy model (11,500 features) – Not validated on commercial software products. 25
  • 26. Follow-up Studies • Studies addressing limitations – “Reducing Features to Improve Code Change-Based Bug Prediction” (Shivaji@TSE`13) • With less than 10% of all features, buggy F-measure is 21% improved. – “Software Change Classification using Hunk Metrics” (Ferzund@ICSM`09) • 27 hunk-level metrics for change classification • 81% accuracy, 77% buggy hunk precision, and 67% buggy hunk recall – “A large-scale empirical study of just-in-time quality assurance” (Kamei@TSE`13) • 14 process metrics (mostly from Mockus`00) • 68% accuracy, 64% recall on 11 open-source and commercial projects – “An Empirical Study of Just-In-Time Defect Prediction Using Cross-Project Models” (Fukushima@MSR`14) • Median AUC: 0.72 26
  • 27. Challenges of JIT model • Practical validation is difficult – Just 10-fold cross validation in current literature – No validation on real scenario • e.g., online machine learning • Still difficult to review huge change – Fine-grained prediction within a change • e.g., Line-level prediction 27
  • 28. Next Steps of Defect Prediction – (roadmap figure, 1980s–2020s, rows labeled Metrics / Models / Others: Online Learning JIT Model, Prediction Model (Regression), Prediction Model (Classification), Just-In-Time Prediction Model, Process Metrics, Fine-grained Prediction)
  • 29. Defect Prediction Approaches – (roadmap figure, 1970s–2010s, rows labeled Metrics / Models / Others: LOC, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Cyclomatic Metric, Halstead Metrics, Just-In-Time Prediction Model, Practical Model and Applications, Process Metrics, History Metrics, CK Metrics)
  • 30. Defect Prediction in Industry • “Predicting the location and number of faults in large software systems” (Ostrand@TSE`05) – Two industrial systems – Recall 86% – 20% most fault-prone modules account for 62% faults 30
  • 31. Case Study for Practical Model • “Does Bug Prediction Support Human Developers? Findings From a Google Case Study” (Lewis@ICSE`13) – No identifiable change in developer behaviors after using defect prediction model • Required characteristics but very challenging – Actionable messages / obvious reasoning 31
  • 32. Next Steps of Defect Prediction – (roadmap figure, 1980s–2020s, rows labeled Metrics / Models / Others: Actionable Defect Prediction, Prediction Model (Regression), Prediction Model (Classification), Just-In-Time Prediction Model, Practical Model and Applications, Process Metrics)
  • 33. Evaluation Measure for Practical Model • Measure prediction performance based on code review effort • AUCEC (Area Under Cost Effectiveness Curve) 33 – (cost-effectiveness curve figure: x-axis = percent of LOC inspected, y-axis = percent of bugs found, comparing two models M1 and M2) Rahman@FSE`11, Bugcache for inspections: Hit or miss?
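As an illustration of how AUCEC rewards finding many bugs with little review effort, here is a minimal sketch on invented data: files are inspected in descending predicted risk, the x-axis accumulates the fraction of LOC inspected, the y-axis the fraction of bugs found, and the area under that curve is the AUCEC. The risk, loc, and bugs arrays are assumptions for illustration only.

import numpy as np

# Illustrative per-file data (not from the talk): predicted risk, size, and bug count.
risk = np.array([0.9, 0.2, 0.7, 0.4])
loc = np.array([100, 400, 50, 250])
bugs = np.array([3, 1, 2, 0])

order = np.argsort(-risk)                                         # inspect files in descending predicted risk
x = np.concatenate(([0.0], np.cumsum(loc[order]) / loc.sum()))    # fraction of LOC inspected
y = np.concatenate(([0.0], np.cumsum(bugs[order]) / bugs.sum()))  # fraction of bugs found
aucec = float(np.sum((x[1:] - x[:-1]) * (y[1:] + y[:-1]) / 2))    # trapezoidal area under the curve
print(round(aucec, 3))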
  • 34. Practical Application • What else can we do more with defect prediction models? – Test case selection on regression testing (Engstrom@ICST`10) – Prioritizing warnings from FindBugs (Rahman@ICSE`14) 34
  • 35. Defect Prediction Approaches – (roadmap figure, 1970s–2010s, rows labeled Metrics / Models / Others: LOC, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Cyclomatic Metric, Halstead Metrics, CK Metrics, Process Metrics, Practical Model and Applications, Just-In-Time Prediction Model, History Metrics)
  • 36. Representative OO Metrics • CK metrics (Chidamber&Kemerer@TSE`94) – WMC: Weighted Methods per Class (# of methods) – DIT: Depth of Inheritance Tree (# of ancestor classes) – NOC: Number of Children – CBO: Coupling Between Objects (# of coupled classes) – RFC: Response For a Class (WMC + # of methods called by the class) – LCOM: Lack of Cohesion in Methods (# of "connected components") 36 • Prediction performance of CK vs. code metrics (Basili@TSE`96) – F-measure: 70% vs. 60%
  • 37. Defect Prediction Approaches – (roadmap figure, 1970s–2010s, rows labeled Metrics / Models / Others: LOC, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Cyclomatic Metric, Halstead Metrics, CK Metrics, Process Metrics, Practical Model and Applications, Just-In-Time Prediction Model, History Metrics)
  • 38. Representative History Metrics 38 – Relative code change churn: 8 metrics, SW Repo.* (Nagappan@ICSE`05) – Change: 17 metrics, SW Repo. (Moser@ICSE`08) – Change Entropy: 1 metric, SW Repo. (Hassan@ICSE`09) – Code metric churn / Code Entropy: 2 metrics, SW Repo. (D'Ambros@MSR`10) – Popularity: 5 metrics, Email archive (Bacchelli@FASE`10) – Ownership: 4 metrics, SW Repo. (Bird@FSE`11) – Micro Interaction Metrics (MIM): 56 metrics, Mylyn (Lee@FSE`11) * SW Repo. = version control system + issue tracking system
  • 39. Representative History Metrics • Advantage – Better prediction performance than code metrics 39 – (bar chart, roughly 0–60% performance improvement of all metrics vs. code complexity metrics, for Moser`08, Hassan`09, D'Ambros`10, Bacchelli`10, Bird`11, and Lee`11; measured as F-measure, absolute prediction error, or Spearman correlation depending on the study; Bird`11's result compares two metrics vs. code metrics, and no comparison data is available for Nagappan`05)
  • 40. History Metrics • Limitations – History metrics do not extract particular program characteristics such as developer social network, component network, and anti-pattern. – Noise data • Bias in Bug-Fix Dataset (Bird@FSE`09) – Not applicable for new projects and projects lacking in historical data 40
  • 41. Defect Prediction Approaches – (roadmap figure, 1970s–2010s, rows labeled Metrics / Models / Others: LOC, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Cyclomatic Metric, Halstead Metrics, CK Metrics, Just-In-Time Prediction Model, Cross-Project Prediction, Practical Model and Applications, Universal Model, Process Metrics, Cross-Project Feasibility, History Metrics, Other Metrics, Noise Reduction, Semi-supervised/active)
  • 42. Defect Prediction Approaches – (roadmap figure, 1970s–2010s, rows labeled Metrics / Models / Others: LOC, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Cyclomatic Metric, Halstead Metrics, CK Metrics, Just-In-Time Prediction Model, Cross-Project Prediction, Practical Model and Applications, Universal Model, Process Metrics, Cross-Project Feasibility, History Metrics, Other Metrics, Noise Reduction, Semi-supervised/active)
  • 43. Other Metrics 43 – Component network: 28 metrics, from binaries (Windows Server 2003) (Zimmermann@ICSE`08) – Developer-Module network: 9 metrics, SW Repo. + binaries (Pinzger@FSE`08) – Developer social network: 4 metrics, SW Repo. (Meenely@FSE`08) – Anti-pattern: 4 metrics, SW Repo. + design pattern (Taba@ICSM`13) * SW Repo. = version control system + issue tracking system
  • 44. Defect Prediction Approaches – (roadmap figure, 1970s–2010s, rows labeled Metrics / Models / Others: LOC, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Cyclomatic Metric, Halstead Metrics, CK Metrics, Just-In-Time Prediction Model, Cross-Project Prediction, Practical Model and Applications, Universal Model, Process Metrics, Cross-Project Feasibility, History Metrics, Other Metrics, Noise Reduction, Semi-supervised/active)
  • 45. Noise Reduction • Noise detection and elimination algorithm (Kim@ICSE`11) – Closest List Noise Identification (CLNI) • Based on Euclidean distance between instances – Average F-measure improvement • 0.504 → 0.621 • ReLink (Wu@FSE`11) – Recover missing links between bugs and changes – 60% → 78% recall for missing links – F-measure improvement • e.g. 0.698 (traditional) → 0.731 (ReLink) 45
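CLNI's exact ranking and thresholding procedure is defined in Kim@ICSE`11; the sketch below only illustrates the underlying idea of flagging an instance whose nearest neighbours (by Euclidean distance) mostly carry the opposite label. The k and ratio parameters and the toy data are assumptions.

import numpy as np

def clni_like_noise_flags(X, y, k=5, ratio=0.6):
    # Flag instance i as likely noisy when more than `ratio` of its k nearest
    # neighbours (Euclidean distance) carry the opposite label.
    X, y = np.asarray(X, float), np.asarray(y)
    flags = []
    for i in range(len(X)):
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf                          # ignore the instance itself
        nearest = np.argsort(d)[:k]
        flags.append(np.mean(y[nearest] != y[i]) > ratio)
    return np.array(flags)

# Toy metrics + labels; the last instance is deliberately mislabeled.
X = [[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [1.5, 1.5]]
y = [0, 0, 0, 1, 1, 1]
print(clni_like_noise_flags(X, y, k=3))        # flags only the last instance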
  • 46. Defect Prediction Approaches – (roadmap figure, 1970s–2010s, rows labeled Metrics / Models / Others: LOC, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Cyclomatic Metric, Halstead Metrics, CK Metrics, Just-In-Time Prediction Model, Cross-Project Prediction, Practical Model and Applications, Universal Model, Process Metrics, Cross-Project Feasibility, History Metrics, Other Metrics, Semi-supervised/active)
  • 47. Defect Prediction for New Software Projects • Universal Defect Prediction Model • Semi-supervised / active learning • Cross-Project Defect Prediction 47
  • 48. Universal Defect Prediction Model (Zhang@MSR`14) • Context-aware rank transformation – Transform raw metric values into ranks ranging from 1 to 10, so that they are comparable across all projects. • Model built from 1,398 projects collected from SourceForge and Google Code 48
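The context-aware part of the universal model (clustering projects by context before deriving the ranking thresholds) is beyond this sketch; the following only illustrates the rank-transformation step itself, mapping pooled raw metric values onto a 1–10 scale by deciles. The LOC values are illustrative assumptions.

import numpy as np

def decile_rank(values):
    # Map raw metric values onto ranks 1..10 using the deciles of the pooled values,
    # so that metrics from different projects end up on a common scale.
    values = np.asarray(values, float)
    cutoffs = np.percentile(values, np.arange(10, 100, 10))    # nine decile boundaries
    return np.searchsorted(cutoffs, values, side="right") + 1  # ranks in 1..10

loc = [10, 25, 40, 80, 120, 300, 45, 900, 60, 15, 700, 220]
print(decile_rank(loc))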
  • 49. Defect Prediction Approaches – (roadmap figure, 1970s–2010s, rows labeled Metrics / Models / Others: LOC, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Cyclomatic Metric, Halstead Metrics, CK Metrics, Just-In-Time Prediction Model, Cross-Project Prediction, Practical Model and Applications, Universal Model, Process Metrics, Cross-Project Feasibility, History Metrics, Other Metrics, Semi-supervised/active)
  • 50. Other approaches for CDDP • Semi-supervised learning with dimension reduction for defect prediction (Lu@ASE`12) – Training a model by a small set of labeled instances together with many unlabeled instances – AUC improvement • 0.83 → 0.88 with 2% labeled instances • Sample-based semi-supervised/active learning for defect prediction (Li@AESEJ`12) – Average F-measure • 0.628 → 0.685 with 10% sampled instances 50
  • 51. Defect Prediction Approaches – (roadmap figure, 1970s–2010s, rows labeled Metrics / Models / Others: LOC, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Cyclomatic Metric, Halstead Metrics, CK Metrics, Just-In-Time Prediction Model, Cross-Project Prediction, Practical Model and Applications, Universal Model, Process Metrics, Cross-Project Feasibility, History Metrics, Other Metrics, Semi-supervised/active)
  • 52. Cross-Project Defect Prediction (CPDP) • For a new project or a project lacking in the historical data 52 – (figure: a model is trained on Project A and tested on Project B, whose labels are unknown) – Only 2% out of 622 prediction combinations worked. (Zimmermann@FSE`09)
  • 53. Transfer Learning (TL) – (figure contrasting traditional machine learning, where each domain trains its own learning system, with transfer learning, where knowledge is transferred from a source domain's learning system to the target domain's) Pan et al.@TNN`10, Domain Adaptation via Transfer Component Analysis
  • 54. CPDP 54 • Adopting transfer learning – comparison of four approaches (* NN = nearest neighbor, W = within, C = cross):
    – Metric Compensation (Watanabe@PROMISE`08): preprocessing N/A, learner C4.5, 2 subjects, 2 predictions, avg. f-measure 0.67 (W: 0.79, C: 0.58)
    – NN Filter (Turhan@ESEJ`09): feature selection + log-filter, Naive Bayes, 10 subjects, 10 predictions, avg. f-measure 0.35 (W: 0.37, C: 0.26)
    – TNB (Ma@IST`12): log-filter, TNB learner, 10 subjects, 10 predictions, avg. f-measure 0.39 (NN: 0.35, C: 0.33)
    – TCA+ (Nam@ICSE`13): normalization, Logistic Regression, 8 subjects, 26 predictions, avg. f-measure 0.46 (W: 0.46, C: 0.36)
  • 55. Metric Compensation (Watanabe@PROMISE`08) • Key idea – new target metric value = target metric value * (average source metric value / average target metric value) 55 – (figure: the target dataset is rescaled into a new target dataset whose metric averages match the source)
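A minimal sketch of the compensation formula stated above, applied column-wise to a target metric matrix; the metric values are illustrative assumptions.

import numpy as np

def compensate(target_X, source_X):
    # Rescale each target metric (column) by the ratio of source mean to target mean,
    # as in the compensation formula on the slide.
    target_X, source_X = np.asarray(target_X, float), np.asarray(source_X, float)
    return target_X * source_X.mean(axis=0) / target_X.mean(axis=0)

source = [[100, 5], [200, 9], [300, 10]]   # e.g., LOC and complexity per module
target = [[10, 1], [30, 3], [20, 2]]
print(compensate(target, source))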
  • 56. Metric Compensation (cont.) (Watanabe@PROMISE`08) 56 – (repeats the transfer-learning comparison table from slide 54, highlighting the Metric Compensation column)
  • 57. NN filter (Turhan@ESEJ`09) • Key idea • Nearest neighbor filter – Select the 10 nearest source instances of each target instance 57 – (figure: each target instance picks source instances that "look like" it — "Hey, you look like me! Could you be my model?" — forming a new, filtered source set)
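A minimal sketch of the NN-filter idea: for every target instance, keep its k nearest source instances and train only on the union of the kept instances. The random data is illustrative; k=10 follows the slide.

import numpy as np

def nn_filter(source_X, target_X, k=10):
    # For every target instance, keep its k nearest source instances (Euclidean distance);
    # the filtered training set is the union of all kept source instances.
    source_X, target_X = np.asarray(source_X, float), np.asarray(target_X, float)
    keep = set()
    for t in target_X:
        d = np.linalg.norm(source_X - t, axis=1)
        keep.update(np.argsort(d)[:k].tolist())
    return sorted(keep)                        # indices of the selected source instances

source = np.random.rand(100, 3)                # illustrative source metrics
target = np.random.rand(20, 3)                 # illustrative target metrics
print(len(nn_filter(source, target, k=10)), "source instances kept")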
  • 58. NN filter (cont.) (Turhan@ESEJ`09) 58 – (repeats the transfer-learning comparison table from slide 54, highlighting the NN Filter column)
  • 59. Transfer Naive Bayes (Ma@IST`12) • Key idea – Provide more weight to source instances that are similar to the target instances when building a Naive Bayes model 59 – (figure: source instances resembling the target ask to be "considered more important than other instances" while the model is built)
  • 60. Transfer Naive Bayes (cont.) (Ma@IST`12) • Transfer Naive Bayes – New prior probability – New conditional probability 60
  • 61. Transfer Naive Bayes (cont.) (Ma@IST`12) • How to find similar source instances for the target – a similarity score and a weight value 61 – Example (k = # of features, si = score of instance i), features F1–F4: max of target = (7, 3, 2, 5), min of target = (1, 2, 0, 1); src. inst 1 = (5, 4, 2, 2), score 3; src. inst 2 = (0, 2, 5, 9), score 1
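The worked example above can be reproduced with a small sketch: the similarity score of a source instance is the number of its feature values that fall inside the target's [min, max] range for each feature. (The weight derived from this score is defined in Ma@IST`12 and is not reproduced here.)

import numpy as np

def similarity_scores(source_X, target_X):
    # Score of a source instance = number of its feature values lying inside the
    # target's per-feature [min, max] range.
    source_X, target_X = np.asarray(source_X, float), np.asarray(target_X, float)
    lo, hi = target_X.min(axis=0), target_X.max(axis=0)
    return ((source_X >= lo) & (source_X <= hi)).sum(axis=1)

target = [[7, 3, 2, 5], [1, 2, 0, 1]]     # any target whose column-wise max/min match the slide
source = [[5, 4, 2, 2], [0, 2, 5, 9]]     # src. inst 1 and src. inst 2 from the table
print(similarity_scores(source, target))  # -> [3 1], matching the example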
  • 62. Transfer Naive Bayes (cont.) (Ma@IST`12) 62 – (repeats the transfer-learning comparison table from slide 54, highlighting the TNB column)
  • 63. TCA+ (Nam@ICSE`13) • Key idea – TCA (Transfer Component Analysis) 63 – (figure: source and target distributions differ — "Oops, we are different! Let's meet in another world!" — so both are mapped into a new feature space as New Source and New Target)
  • 64. Transfer Component Analysis (cont.) • Feature extraction approach – Dimensionality reduction – Projection • Map original data into a lower-dimensional feature space 64 – (figure: instances in a 2-dimensional feature space projected onto a 1-dimensional feature space)
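TCA itself is not a stock library routine, so as an illustration of projection only, the sketch below uses PCA from scikit-learn to map 2-dimensional instances onto a 1-dimensional space; unlike TCA, PCA does not try to make source and target distributions similar. The data is illustrative.

import numpy as np
from sklearn.decomposition import PCA

# Projection example only: PCA maps 2-D instances onto a 1-D feature space.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))                  # illustrative 2-dimensional instances
X_1d = PCA(n_components=1).fit_transform(X)   # projected 1-dimensional representation
print(X.shape, "->", X_1d.shape)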
  • 65. TCA (cont.) 65 – (scatter plot of source domain data and target domain data before any transformation) Pan et al.@TNN`10, Domain Adaptation via Transfer Component Analysis
  • 66. TCA (cont.) 66 – (the same data after TCA: source and target are projected into a space where their distributions are similar) Pan et al.@TNN`10, Domain Adaptation via Transfer Component Analysis
  • 67. TCA+ (Nam@ICSE`13) 67 – (figure comparing TCA and TCA+: with TCA alone the mapped New Source and New Target are "still a bit different"; TCA+ first normalizes source and target together — "Normalize us together!" — and then applies TCA)
  • 68. Normalization Options • NoN: No normalization applied • N1: Min-max normalization (max=1, min=0) • N2: Z-score normalization (mean=0, std=1) • N3: Z-score normalization only using source mean and standard deviation • N4: Z-score normalization only using target mean and standard deviation 68
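A minimal sketch of options NoN and N1–N4 applied to a source and a target metric matrix; the matrices are illustrative assumptions.

import numpy as np

def normalize(source, target, option):
    # Apply one of the options above to both datasets; N3/N4 borrow mean and std
    # from the source or the target only.
    s, t = np.asarray(source, float), np.asarray(target, float)
    if option == "NoN":
        return s, t
    if option == "N1":                         # min-max to [0, 1], per dataset
        mm = lambda a: (a - a.min(axis=0)) / (a.max(axis=0) - a.min(axis=0))
        return mm(s), mm(t)
    if option == "N2":                         # z-score with each dataset's own stats
        z = lambda a: (a - a.mean(axis=0)) / a.std(axis=0)
        return z(s), z(t)
    ref = s if option == "N3" else t           # N3: source stats, N4: target stats
    mu, sd = ref.mean(axis=0), ref.std(axis=0)
    return (s - mu) / sd, (t - mu) / sd

src = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
tgt = np.array([[2.0, 5.0], [4.0, 15.0]])
print(normalize(src, tgt, "N3")[1])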
  • 69. Preliminary Results using TCA 69 – (bar chart of F-measure for Baseline, NoN, N1, N2, N3, and N4 on Project A → Project B and Project B → Project A; *Baseline = cross-project defect prediction without TCA and normalization) – Prediction performance of TCA varies according to different normalization options!
  • 70. TCA+: Decision Rules • Find a suitable normalization for TCA • Steps – #1: Characterize a dataset – #2: Measure similarity between source and target datasets – #3: Decision rules 70
  • 71. TCA+: #1. Characterize a Dataset 71 – (figure: for dataset A and dataset B, compute the Euclidean distance d_ij between every pair of instances) – DIST = { d_ij | 1 ≤ i < j ≤ n }
  • 72. TCA+: #2. Measure Similarity between Source and Target • Minimum (min) and maximum (max) values of DIST • Mean and standard deviation (std) of DIST • The number of instances 72
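A minimal sketch combining steps #1 and #2: build DIST for a dataset and summarize it with the statistics listed above. The two random datasets are illustrative.

import numpy as np
from itertools import combinations

def characterize(X):
    # DIST = { d_ij | 1 <= i < j <= n }, plus the summary statistics used by TCA+.
    X = np.asarray(X, float)
    dist = np.array([np.linalg.norm(X[i] - X[j])
                     for i, j in combinations(range(len(X)), 2)])
    return {"min": dist.min(), "max": dist.max(),
            "mean": dist.mean(), "std": dist.std(), "n": len(X)}

A = np.random.rand(30, 4)   # illustrative source dataset
B = np.random.rand(50, 4)   # illustrative target dataset
print(characterize(A))
print(characterize(B))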
  • 73. TCA+: #3. Decision Rules • Rule #1 – Mean and std are the same → NoN • Rule #2 – Max and min are different → N1 (max=1, min=0) • Rule #3, #4 – Std and # of instances are different → N3 or N4 (normalize with src/tgt mean=0, std=1) • Rule #5 – Default → N2 (mean=0, std=1) 73
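The exact similarity tests and thresholds behind these rules are defined in Nam@ICSE`13; the sketch below only mirrors the rule order, with an assumed tolerance for deciding whether two statistics are "the same" and an assumed tie-break between N3 and N4 based on which dataset has more instances.

def choose_normalization(src, tgt, tol=0.1):
    # `src` and `tgt` are the DIST statistics from the previous step. The `same`
    # test and `tol` are assumptions, not the paper's exact criteria.
    def same(a, b):
        return abs(a - b) <= tol * max(abs(a), abs(b), 1e-9)
    if same(src["mean"], tgt["mean"]) and same(src["std"], tgt["std"]):
        return "NoN"                                              # Rule #1
    if not same(src["max"], tgt["max"]) and not same(src["min"], tgt["min"]):
        return "N1"                                               # Rule #2
    if not same(src["std"], tgt["std"]) and src["n"] != tgt["n"]:
        return "N3" if src["n"] > tgt["n"] else "N4"              # Rules #3 and #4 (assumed tie-break)
    return "N2"                                                   # Rule #5 (default)

print(choose_normalization({"min": 0.1, "max": 5.0, "mean": 2.0, "std": 1.0, "n": 30},
                           {"min": 0.4, "max": 9.0, "mean": 3.5, "std": 2.2, "n": 50}))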
  • 74. TCA+ (cont.) (Nam@ICSE`13) 74 – (repeats the transfer-learning comparison table from slide 54, highlighting the TCA+ column)
  • 75. Current CPDP using TL • Advantages – Comparable prediction performance to within-prediction models – Benefit from the state-of-the-art TL approaches • Limitation – Performance of some cross-prediction pairs is still poor. (Negative Transfer) 75
  • 76. Defect Prediction Approaches – (roadmap figure, 1970s–2010s, rows labeled Metrics / Models / Others: LOC, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Cyclomatic Metric, Halstead Metrics, CK Metrics, Just-In-Time Prediction Model, Cross-Project Prediction, Practical Model and Applications, Universal Model, Process Metrics, Cross-Project Feasibility, History Metrics, Other Metrics, Semi-supervised/active)
  • 77. Feasibility Evaluation for CPDP • Solution for negative transfer – Decision tree using project characteristic metrics (Zimmermann@FSE`09) • E.g. programming language, # developers, etc. 77
  • 78. Follow-up Studies • "An investigation on the feasibility of cross-project defect prediction." (He@ASEJ`12) – Decision tree using distributional characteristics of a dataset (e.g., mean, skewness, peakedness, etc.) 78
  • 79. Feasibility for CPDP • Challenges on current studies – Decision trees were not evaluated properly. • Just fitting model – Low target prediction coverage • 5 out of 34 target projects were feasible for cross-predictions (He@ASEJ`12) 79
  • 80. Next Steps of Defect Prediction – (roadmap figure, 1980s–2020s, rows labeled Metrics / Models / Others: Cross-Prediction Feasibility Model, Prediction Model (Regression), Prediction Model (Classification), CK Metrics, Just-In-Time Prediction Model, Cross-Project Prediction, Practical Model and Applications, Universal Model, Process Metrics, Cross-Project Feasibility, History Metrics, Other Metrics, Semi-supervised/active)
  • 81. Defect Prediction Approaches – (roadmap figure, 1970s–2010s, rows labeled Metrics / Models / Others: Semi-supervised/active, LOC, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Cyclomatic Metric, Halstead Metrics, CK Metrics, History Metrics, Just-In-Time Prediction Model, Cross-Project Prediction, Other Metrics, Practical Model and Applications, Universal Model, Process Metrics, Cross-Project Feasibility, Personalized Model)
  • 82. Cross-prediction Model • Common challenge – Current cross-prediction models are limited to datasets with the same number of metrics – Not applicable on projects with different feature spaces (different domains) • NASA Dataset: Halstead, LOC • Apache Dataset: LOC, Cyclomatic, CK metrics 82
  • 83. Next Steps of Defect Prediction – (roadmap figure, 1980s–2020s, rows labeled Metrics / Models / Others: Prediction Model (Regression), Prediction Model (Classification), CK Metrics, Just-In-Time Prediction Model, Cross-Project Prediction, Practical Model and Applications, Universal Model, Process Metrics, Cross-Project Feasibility, Cross-Domain Prediction, History Metrics, Other Metrics, Noise Reduction, Semi-supervised/active, Personalized Model)
  • 85. Defect Prediction Approaches – (roadmap figure, 1970s–2010s, rows labeled Metrics / Models / Others: LOC, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Cyclomatic Metric, Halstead Metrics, CK Metrics, History Metrics, Just-In-Time Prediction Model, Cross-Project Prediction, Other Metrics, Practical Model and Applications, Universal Model, Process Metrics, Cross-Project Feasibility, Data Privacy, Noise Reduction, Semi-supervised/active, Personalized Model)
  • 86. Other Topics • Privacy issue on defect datasets – MORPH (Peters@ICSE`12) • Mutate defect datasets while keeping prediction accuracy • Can accelerate cross-project defect prediction with industrial datasets • Personalized defect prediction model (Jiang@ASE`13) – "Different developers have different coding styles, commit frequencies, and experience levels, all of which cause different defect patterns." – Results • Average F-measure: 0.62 (personalized models) vs. 0.59 (non-personalized models) 86
  • 87. Outline • Background • Software Defect Prediction Approaches – Simple metric and defect estimation models – Complexity metrics and Fitting models – Prediction models – Just-In-Time Prediction Models – Practical Prediction Models and Applications – History Metrics from Software Repositories – Cross-Project Defect Prediction and Feasibility • Summary and Challenging Issues 87
  • 88. Defect Prediction Approaches – (roadmap figure, 1970s–2010s, rows labeled Metrics / Models / Others: LOC, Simple Model, Fitting Model, Prediction Model (Regression), Prediction Model (Classification), Cyclomatic Metric, Halstead Metrics, CK Metrics, History Metrics, Just-In-Time Prediction Model, Cross-Project Prediction, Other Metrics, Practical Model and Applications, Data Privacy, Universal Model, Process Metrics, Cross-Project Feasibility, Noise Reduction, Semi-supervised/active, Personalized Model)
  • 89. Next Steps of Defect Prediction – (roadmap figure, 1980s–2020s, rows labeled Metrics / Models / Others: Online Learning JIT Model, Actionable Defect Prediction, Cross-Prediction Feasibility Model, Prediction Model (Regression), Prediction Model (Classification), CK Metrics, History Metrics, Just-In-Time Prediction Model, Cross-Project Prediction, Other Metrics, Practical Model and Applications, Universal Model, Process Metrics, Cross-Project Feasibility, Cross-Domain Prediction, Fine-grained Prediction, Data Privacy, Noise Reduction, Semi-supervised/active, Personalized Model)
  • 91. 91
  • 92. Evaluation Measures (classification) • Measures for binary classification – Confusion matrix (rows = actual class, columns = predicted class) 92 • Actual Buggy, predicted Buggy: True Positive (TP) • Actual Buggy, predicted Clean: False Negative (FN) • Actual Clean, predicted Buggy: False Positive (FP) • Actual Clean, predicted Clean: True Negative (TN)
  • 93. Evaluation Measures (classification) • False positive rate (FPR, PF) = FP/(TN+FP) • Accuracy = (TP+TN)/(TP+FP+TN+FN) • Precision = TP/(TP+FP) • Recall = TP/(TP+FN) • F-measure = (2*Precision*Recall)/(Precision+Recall) 93
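A minimal sketch computing the measures above from confusion-matrix counts; the counts are illustrative.

def classification_measures(tp, fp, tn, fn):
    # Direct translation of the formulas on the slide.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "fpr": fp / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f_measure": 2 * precision * recall / (precision + recall),
    }

print(classification_measures(tp=30, fp=10, tn=50, fn=10))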
  • 94. Evaluation Measures (classification) • AUC (Area Under the Receiver Operating Characteristic Curve) 94 – (ROC curve figure: x-axis = false positive rate, y-axis = true positive rate, both from 0 to 1)
  • 95. Evaluation Measures (classification) • AUCEC (Area Under Cost Effectiveness Curve) 95 – (same cost-effectiveness curve figure as slide 33: percent of LOC inspected vs. percent of bugs found, models M1 and M2) Rahman@FSE`11, Bugcache for inspections: Hit or miss?
  • 96. Evaluation Measures (Regression) • Target – Metric values vs. the number of bugs – Actual vs. predicted number of bugs • Correlation coefficient – Spearman / Pearson / R² • Mean squared error 96
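A minimal sketch of these regression measures on invented actual/predicted bug counts, using scipy.stats for the correlation coefficients.

import numpy as np
from scipy.stats import pearsonr, spearmanr

actual = np.array([0, 3, 1, 5, 2, 4])        # actual # of bugs per module (illustrative)
predicted = np.array([1, 2, 1, 4, 2, 5])     # predicted # of bugs
rho, _ = spearmanr(actual, predicted)
r, _ = pearsonr(actual, predicted)
print("Spearman:", rho)
print("Pearson:", r, " R^2:", r ** 2)
print("MSE:", np.mean((actual - predicted) ** 2))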
  • 97. CK metrics (same table as slide 36) 97 – WMC: Weighted Methods per Class (# of methods) – DIT: Depth of Inheritance Tree (# of ancestor classes) – NOC: Number of Children – CBO: Coupling Between Objects (# of coupled classes) – RFC: Response For a Class (WMC + # of methods called by the class) – LCOM: Lack of Cohesion in Methods (# of "connected components")

Editor's Notes

  1. Cross-project change classification Feasibility evaluation on cross-project defect prediction
  2. Predicting software quality: Akiyama's model is the earliest prediction model; it predicts the number of defects by using the size of software, such as LOC and # of subroutine calls. IFIP = International Federation for Information Processing. Testing an entire system is not feasible (Menzies`07), and inspecting source code is costly as well (Rahman`11).
  3. Complex software itself, a complex development process, and even developers solving complex problems can introduce bugs into software.
  4. Depending on the software, high recall might be more important than precision, and vice versa.
  5. Cross-project change classification Feasibility evaluation on cross-project defect prediction
  6. V = N * log_2 n, where n = total # of distinct operators and operands and N = total # of operators and operands. Correlation analysis using linear regression.
  7. V = N * log_2 n, where n = total # of distinct operators and operands and N = total # of operators and operands. Correlation analysis using linear regression.
  8. Results are from a Command and Control Communication System implemented in Ada. Different thresholds for the discriminative probability were considered.
  9. Diffusion of change: how many files/modules/subsystems were touched together? Developer experience: # of previous changes by the same developer, weighted by the contributions of the set of developers.
  10. Diffusion of change: how many files/modules/subsystems were touched together? Developer experience: # of previous changes by the same developer, weighted by the contributions of the set of developers.
  11. When a bug is found but its entity is not in the BugCache, this is a cache miss. The cache is then updated with source code entities based on locality: files/functions changed together with defects, recently added files/functions, and recently changed files/functions are loaded into the cache. When the cache is full, entries are replaced using a Least Recently Used (LRU) policy (a common cache policy in operating systems), weighted by the # of previous defects.
  12. Compute complexity metrics of the old and new revisions of files, and then compute the delta between the old and the new.
  13. Compute complexity metrics of the old and new revisions of files, and then compute the delta between the old and the new.
  14. 10 metrics are from Mockus`10 Fukushima: cross-prediction performance can be improved
  15. As there was a transition from fitting models to prediction models, we need another transition from JIT prediction models to online-learning-based JIT prediction models.
  16. Are defect prediction models practical in industry?
  17. As there was a transition from fitting models to prediction models, we need another transition from JIT prediction models to online-learning-based JIT prediction models.
  18. Engstrom: test cases selected by defect prediction results found more defects.
  19. WMC: A class with more member methods than its peers is considered to be more complex and therefore more error prone. DIT: # of ancestor classes. NOC: the number of direct descendants (subclasses) for each class. CBO: RFC: LCOM: the number of "connected components" in a class. Cohesion – putting related methods together in one class gives high cohesion. Coupling –
  20. Relative code change churn: e.g., churned LOC (the cumulative number of deleted and added lines between a base version and a new version of a source file) divided by total LOC. Change: e.g., # of revisions, # of authors editing a file. Change Entropy: quantifies the complexity of changes using entropy theory (how many times a file changed in the same period). Code metric churn: churned metric value, collected on a biweekly basis. Code Entropy: how many lines of code changed in the same period? Popularity: source code files discussed a lot in emails. Ownership: % of commits by a developer for a source code file. MIM: how long the source code file is edited.
  21. 10% recall improvement for Zimmermann’s approach (All vs. CM) 20% decrease of AIC, 200% increase of D^2 (Taba`10)
  22. Tested on 5 external projects Average Within F-measure: 0.42 Average Universal F-measure: 0.39 Average Within AUC: 0.72 Average Universal AUC: 0.72
  23. In traditional machine learning, we build a learning system by using instances from the same domain. In our case, the same domain means the same project; this is the same as within-project defect prediction in this research. Transfer learning, in contrast, reuses knowledge from another domain that has enough data. Cross prediction simply reuses the learning system of a source project for a target project, whereas transfer learning extracts only the knowledge that will actually help the target domain. So transfer learning algorithms play the role of smart knowledge transfer for the target domain, which is more than just simple cross prediction. How to transfer knowledge from a source is what a transfer learning algorithm defines!
  24. Mass of a source instance = sM; mass of the test data = kmM; kmM^2 is constant.
  25. i = index of an instance, k = # of features, s_i = similarity score of instance i, m = # of instances, M = mass of one feature in one instance. Source instance: s_i * M; m * k * M; mass of a source instance = sM; mass of the test data = kmM; kmM^2 is constant.
  26. Why are the cross results different between NN filter and TNB? TNB does not apply feature selection, and TNB did not report within-project results.
  27. In machine learning, there is a feature extraction approach to reduce the feature space of a data set. Feature extraction is achieved by a technique called projection, which maps the original data into a low-dimensional feature space. Here is an example of the 2-dimensional feature space of a data set with four labeled instances; after projection, the four instances are mapped into a one-dimensional space. The representative technique is PCA, which only reduces feature space dimensionality. Transfer component analysis (TCA), however, tries to find a new feature space, via projection, where the distributions of the source and target data sets are similar. I'd like to show how PCA differs from TCA with an example.
  28. Here is an example showing how PCA and TCA work. In a two-dimensional space there are source and target data sets, and we can see their distributions are clearly different. If we apply PCA and TCA, we get the following results in a one-dimensional space.
  29. Probability density function / probability mass function. In PCA, instances are projected into a one-dimensional space, but the distributions of source and target are still different. In TCA, all instances are also projected into a one-dimensional space, where the distributions of source and target are similar. Positive and negative instances of both training and test domains have discriminative power, as shown in this figure. You can check the detailed equations of this algorithm in this paper [add labels]
  30. Based on these normalization techniques, we defined several normalization options for defect prediction data sets. N1 is min-max normalization, which makes the maximum and minimum values 1 and 0, respectively. N2 is z-score normalization, which makes the mean and standard deviation 0 and 1, respectively. We assume that some data sets may not have enough statistical information, so we defined variations of z-score normalization that normalize both source and target data sets: N3 uses only the mean and standard deviation of the source data (when the target data does not have enough statistical information, for example because of a lack of instances), and N4 uses only the target information for normalizing both source and target data sets.
  31. These are the preliminary results of some prediction combinations. Baseline means cross-project prediction without TCA and normalization. In Safe to Apache, all TCA results, with or without normalization, are better than the baseline. However, in Apache to Safe, N1 and N3 did not outperform the baseline. This could be observed in other prediction combinations as well, so we conclude that the prediction performance of TCA varies according to different normalization options.
  32. TCA+ provides decision rules to select a suitable normalization option. For the decision rules, we first characterize both the source and target data sets to identify their difference. In the second step, we measure the similarity between the source and target data sets. With that degree of similarity, we created the decision rules!
  33. Then, how can we characterize a data set? Here are two data sets. Intuitively, data set A's distribution is sparser than data set B's. To quantify this difference, we compute the Euclidean distance of all pairs of instances in each data set and define the DIST set of those distances. Likewise, we can get the DIST set of data set B.
  34. To measure similarity, we compute statistical parameters from the DIST set, such as the minimum, maximum, mean, standard deviation, and the # of instances. With this information, we created the decision rules.
  35. These are the decision rules. If the mean and std are the same, we assume that the distributions of source and target are the same, so we apply no normalization. For Rule 2, if the max and min values are different, we use N1 (min-max normalization). For Rules 3 and 4, we consider the std and the # of instances: if the target information is not sufficient, we use the source mean and std to normalize both datasets (N3); N4 uses the target statistics instead. For Rule 5, if no other rule applies, we use the N2 option, which makes the mean and std 0 and 1, respectively.
  36. This decision tree shows precision in advance.
  37. Success criteria: Precision > 0.5 and Recall > 0.7.
  38. As there was a transition from fitting models to prediction models, we need another transition from JIT prediction models to online-learning-based JIT prediction models.
  39. Assumed projects in the same group have similar distributions. Tested on 5 external projects. Average Within F-measure: 0.42; Average Universal F-measure: 0.39; Average Within AUC: 0.72; Average Universal AUC: 0.72.
  40. As there was a transition from fitting models to prediction models, we need another transition from JIT prediction models to online-learning-based JIT prediction models.
  41. Cross-project change classification Feasibility evaluation on cross-project defect prediction
  42. As there was a transition from fitting models to prediction models, we need another transition from JIT prediction models to online-learning-based JIT prediction models.
  43. WMC: A class with more member methods than its peers is considered to be more complex and therefore more error prone. DIT: # of ancestor classes. NOC: the number of direct descendants (subclasses) for each class. CBO: RFC: LCOM: the number of "connected components" in a class. Cohesion – putting related methods together in one class gives high cohesion. Coupling –