Searching for Configurations in Clone Evaluation: A Replication Study
C. Ragkhitwetsagul, M. Paixao, M. Adham, S. Busari, J. Krinke, J. H. Drake
CENTRE FOR RESEARCH ON EVOLUTION, SEARCH AND TESTING
DEPARTMENT OF COMPUTER SCIENCE
UNIVERSITY COLLEGE LONDON
Searching for Configurations in Clone Evaluation: A Replication Study — C. Ragkhitwetsagul, M. Paixao, M. Adham, S. Busari, J. Krinke, J. H. Drake
Code Clone
Clone Detectors
if (x==0) then y=y+1;
if (check==0) then count=count+1;
$p ($p==0) $p $p=$p+1;
$p ($p==0) $p $p=$p+1;
if_s
if ( cond_e ) then assign_e
if_s
if ( cond_e ) then assign_e
Clone detectors: Deckard, CCFinder, Simian, NiCad
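The normalised form on this slide can be reproduced with a few lines of Python. This is only a sketch of the idea, assuming every identifier-like token (keywords included, as on the slide) is replaced by the placeholder `$p`; the function name `normalise` is hypothetical and none of the four tools is claimed to work exactly this way.

```python
import re

# An identifier-like token: a letter or underscore followed by word characters.
# Keywords such as "if" and "then" match too, mirroring the slide.
IDENTIFIER = re.compile(r"[A-Za-z_]\w*")


def normalise(statement: str) -> str:
    """Replace every identifier-like token with the placeholder $p."""
    return IDENTIFIER.sub("$p", statement)


first = normalise("if (x==0) then y=y+1;")
second = normalise("if (check==0) then count=count+1;")

print(first)
print(first == second)  # the two statements become identical after normalisation
```

Once both statements map to the same string, a detector that compares normalised text or tokens can report them as a clone pair even though the variable names differ.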
Oracle Problem in Code Clone
Because no ground truth can be established, we do not know
whether code is actually cloned.
Agreement
Parameter Tuning
EvaClone
T. Wang, M. Harman, Y. Jia, and J. Krinke. Searching for Better
Configurations: A Rigorous Approach to Clone Evaluation. In FSE'13.
6 clone detectors: PMD, iClones, ConQAT, Simian, NiCad, CCFinder
8 software projects: weltab, cook, snns, psql, javadoc, ant, jdtcore, swing
15 years
Maximising Agreement
[Figure: maximise the agreement among the clone detectors, labelled C, D, N and S]
EvaClone (cont.)
EvaClone favors recall over precision, and more candidates will be reported.
Replication Study
Fitness Function
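$$\text{fitness} = \frac{1 \times |L_1| + 2 \times |L_2| + 3 \times |L_3| + 4 \times |L_4|}{4 \times |\text{all clone lines}|}$$
(The slide draws each numerator term as an icon; $L_i$ is taken here to be the set of lines reported as cloned by exactly $i$ of the four tools.)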
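The fitness on this slide can be sketched in Python under one assumption: each cloned line is weighted by the number of tools that report it, and the total is divided by 4 times the number of all lines reported by any tool. The function name `agreement_fitness` and the `(file, line)` representation are hypothetical, chosen only for illustration.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

Line = Tuple[str, int]  # (file path, line number)


def agreement_fitness(reports: Iterable[Set[Line]]) -> float:
    """Agreement of n tools: sum over i of i * x_i, divided by n * (all clone lines).

    x_i is the number of lines reported as cloned by exactly i tools.
    """
    reports = list(reports)
    n = len(reports)
    # votes[line] = number of tools that report this line as cloned
    votes = Counter(line for report in reports for line in report)
    if n == 0 or not votes:
        return 0.0
    # x[i] = number of lines reported by exactly i tools
    x = Counter(votes.values())
    numerator = sum(i * count for i, count in x.items())
    return numerator / (n * len(votes))


# Toy reports from four tools (made-up data, not from the study).
c = {("A.java", 1), ("A.java", 2), ("A.java", 3)}
d = {("A.java", 1), ("A.java", 2)}
n_ = {("A.java", 1)}
s = {("A.java", 1), ("B.java", 7)}

print(agreement_fitness([c, d, n_, s]))
```

In the toy data one line is reported by all four tools, one by two tools and two by a single tool, so the numerator is 4 + 2 + 1 + 1 = 8 and the denominator is 4 × 4 = 16. The value reaches 1.0 only when every tool reports exactly the same set of lines.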
Replication Study (cont.)
Clone detectors: Deckard, CCFinder, Simian, NiCad (25 parameters)
Population size: 100
No. of generations: 100
Crossover: 0.8
Mutation: 0.1
Elitism: 0.25
2 × 10^12
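To make the settings above concrete, here is a minimal genetic-algorithm sketch that uses them as defaults. It is not the EvaClone implementation: the selection scheme (binary tournament), single-point crossover, per-gene mutation and the toy fitness are all assumptions made for illustration, and the slide does not say whether the mutation rate applies per gene or per individual.

```python
import random
from typing import Callable, List, Sequence

Genome = List[int]


def evolve(domains: Sequence[Sequence[int]],
           fitness: Callable[[Genome], float],
           pop_size: int = 100,
           generations: int = 100,
           crossover_rate: float = 0.8,
           mutation_rate: float = 0.1,
           elitism: float = 0.25,
           seed: int = 0) -> Genome:
    """Return the fittest genome found; gene i is drawn from domains[i]."""
    rng = random.Random(seed)
    population = [[rng.choice(d) for d in domains] for _ in range(pop_size)]
    n_elite = max(1, int(elitism * pop_size))

    def tournament() -> Genome:
        # Binary tournament: the fitter of two random individuals wins.
        a, b = rng.choice(population), rng.choice(population)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        # Elitism: the top fraction survives unchanged.
        next_gen = [g[:] for g in population[:n_elite]]
        while len(next_gen) < pop_size:
            child = tournament()[:]
            if len(child) > 1 and rng.random() < crossover_rate:
                # Single-point crossover with a second parent.
                cut = rng.randrange(1, len(child))
                child[cut:] = tournament()[cut:]
            for i, domain in enumerate(domains):
                # Per-gene mutation: resample the gene from its domain.
                if rng.random() < mutation_rate:
                    child[i] = rng.choice(domain)
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness)


# Toy problem (hypothetical): three integer parameters, best at TARGET.
TARGET = [3, 7, 1]


def toy_fitness(genome: Genome) -> float:
    return -sum(abs(g - t) for g, t in zip(genome, TARGET))


best = evolve([range(10)] * 3, toy_fitness)
print(best, toy_fitness(best))
```

In a real run each fitness evaluation would execute all four clone detectors with the candidate configuration, so results would need caching; the toy fitness here only stands in for that step.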
Ver.      0.9    1.0    1.1    1.2    1.3    1.4    1.5    1.6    1.7    1.8    1.9    1.10   2.0.0  2.0.44
SLOC (k)  5.5    6.7    6.78   6.82   7.2    7.6    8.4    8.9    10.1   12.4   17.9   22.8   23.6   25.3
%Inc      N/A    21%    2%     1%     6%     5%     11%    7%     13%    23%    44%    28%    3%     8%
Note: two complete libraries (cglib and asm) are embedded in releases 1.5 to 1.9; they were removed before the analysis.
RQ1: Optimised Agreement
How do the default parameters perform in terms of
clone agreement on each Mockito release compared
to the optimised ones?
[Figure: fitness value (0.30 to 0.50) per Mockito release (0.9 to 2.0.44) for three series: Default, EvaClone Highest, EvaClone Lowest]
Comparison of the optimised tools' agreement (the highest and the lowest in 20 runs) with the default agreement over 14 Mockito releases
RQ2: Stability of Optimised Parameters
Are there noticeable differences in the values of
optimised parameters over releases?
Columns 0.9 to 2.0.44 hold the optimised values per Mockito release.
Tool      Parameter   DF    0.9   1.0   1.1   1.2   1.3   1.4   1.5   1.6   1.7   1.8   1.9   1.10  2.0.0  2.0.44
CCFinder  MinToken    50    10    70    70    70    80    80    80    80    10    10    10    10    10     10
CCFinder  TKS         12    10    16    18    19    18    18    19    20    14    17    10    10    10     10
Deckard   MinToken    30    30    50    50    50    50    50    50    50    50    50    50    50    50     50
Deckard   Stride      5     inf   8     8     8     8     8     8     8     16    5     inf   inf   inf    inf
Deckard   Similarity  0.9   0.9   1.0   1.0   1.0   1.0   1.0   1.0   1.0   0.95  1.0   0.9   0.9   0.9    0.9
NiCad     MinLine     6     5     7     7     7     6     6     6     7     6     5     5     5     5      5
NiCad     MaxLine     1K    200   100   100   400   400   200   200   200   200   100   100   100   200    200
NiCad     UPI         0.3   0.3   0.0   0.1   0.0   0.0   0.1   0.1   0.0   0.3   0.1   0.3   0.3   0.3    0.3
NiCad     Blind       0     1     0     0     0     0     0     0     1     1     1     1     1     1      1
NiCad     Abstract    0     4     6     6     6     6     5     5     6     6     2     4     4     4      4
RQ2: Stability of Optimised Parameters
Columns 0.9 to 2.0.44 hold the optimised values per Mockito release.
Tool    Parameter              DF    0.9   1.0   1.1   1.2   1.3   1.4   1.5   1.6   1.7   1.8   1.9   1.10  2.0.0  2.0.44
Simian  ignoreCurlyBraces      0     0     0     0     0     0     0     0     0     0     1     0     0     0      0
Simian  ignoreIdentifiers      0     1     0     0     0     0     0     0     0     1     1     1     1     1      1
Simian  ignoreIdentifierCase   0     ✱     ✱     ✱     ✱     ✱     ✱     ✱     ✱     ✱     ✱     ✱     ✱     ✱      ✱
Simian  ignoreStrings          0     1     0     0     0     0     0     0     0     1     0     ✱     ✱     ✱      ✱
Simian  ignoreStringCase       1     ✱     1     1     0     0     0     0     0     ✱     0     ✱     ✱     ✱      ✱
Simian  ignoreNumbers          0     1     0     1     0     1     1     0     1     1     0     ✱     ✱     ✱      ✱
Simian  ignoreCharacters       0     0     0     1     0     0     0     1     0     0     1     ✱     ✱     ✱      ✱
Simian  ignoreCharacterCase    1     0     0     ✱     1     1     0     ✱     1     1     ✱     ✱     ✱     ✱      ✱
Simian  ignoreLiterals         0     0     0     0     0     0     0     0     0     0     0     1     1     1      1
Simian  ignoreSubtypeNames     0     1     0     0     0     0     0     0     0     0     0     0     0     1      1
Simian  ignoreModifiers        1     1     1     0     1     0     0     0     0     0     0     1     1     1      1
Simian  ignoreVariableNames    0     1     0     0     0     0     0     0     0     1     1     0     0     0      1
Simian  balanceParentheses     0     0     1     1     1     1     1     1     1     0     0     0     0     0      0
Simian  balanceSquareBrackets  0     1     0     0     0     1     1     0     1     1     1     1     1     1      0
Simian  MinLine                6     5     6     6     6     6     6     6     6     7     7     5     5     5      5
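One rough way to quantify the stability asked about in RQ2 is to count, for each parameter, how many distinct optimised values occur and how often the value changes from one release to the next. The sketch below does this for four numeric parameters whose values are copied from the RQ2 slides, assuming the slide values are grouped per release; the two measures and the function names are illustrative choices, not metrics used by the authors.

```python
from typing import Dict, List

RELEASES = ["0.9", "1.0", "1.1", "1.2", "1.3", "1.4", "1.5",
            "1.6", "1.7", "1.8", "1.9", "1.10", "2.0.0", "2.0.44"]

# Optimised values per release, in the order of RELEASES.
optimised: Dict[str, List[int]] = {
    "CCFinder.MinToken": [10, 70, 70, 70, 80, 80, 80, 80, 10, 10, 10, 10, 10, 10],
    "CCFinder.TKS":      [10, 16, 18, 19, 18, 18, 19, 20, 14, 17, 10, 10, 10, 10],
    "NiCad.MinLine":     [5, 7, 7, 7, 6, 6, 6, 7, 6, 5, 5, 5, 5, 5],
    "Simian.MinLine":    [5, 6, 6, 6, 6, 6, 6, 6, 7, 7, 5, 5, 5, 5],
}


def distinct_values(values: List[int]) -> int:
    """Number of different optimised values seen across releases."""
    return len(set(values))


def changes(values: List[int]) -> int:
    """Number of consecutive release pairs where the value differs."""
    return sum(1 for a, b in zip(values, values[1:]) if a != b)


for name, values in optimised.items():
    assert len(values) == len(RELEASES)
    print(f"{name}: {distinct_values(values)} distinct values, "
          f"{changes(values)} changes over {len(RELEASES)} releases")
```

A parameter that was perfectly stable would show one distinct value and zero changes; none of the four parameters above does.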
RQ3: Clones over Releases
How many clones in Mockito are reported with the
highest agreement over releases?
[Figure: two panels, Default and EvaClone]
Maximising Agreement
[Figure: maximise the agreement among the clone detectors, labelled C, D, N and S]
Open Challenge
A better fitness function for EvaClone is needed.
It must not rely only on the number of cloned lines, but also include other aspects:
- How often is a line found to be cloned to other places?
- Precision vs. recall?
- Location of clones
[Figure: fitness value per Mockito release for Default, EvaClone Highest and EvaClone Lowest, as in RQ1]
Optimised parameters vs. default parameters
[Table: default and optimised parameter values for CCFinder, Deckard and NiCad, as in RQ2]
Optimised parameters are not stable over releases
[Figure: two panels, Default and EvaClone, as in RQ3]
The fitness function needs improvement

Searching for Configurations in Clone Evaluation: A Replication Study [SSBSE'16]
