An {Execution-Semantic, Content-and-Context}-Based Code-Clone {Detection, Analysis}
Toshihiro Kamiya
Future University Hakodate
kamiya@fun.ac.jp
Toshihiro Kamiya: An Execution-Semantic and Content-and-Context-Based Code-Clone Detection and Analysis,
Proceedings of the 9th IEEE International Workshop on Software Clones (IWSC'15), pp. 1-7 (2015).
TOC
● Problem/Motivation
● Outline of proposed method
● Example
● Algorithm of clone detection
● Visualization
● Implementation
● Preliminary experiment
The problems / Motivation
● In functional PLs, developers can define their own control structures.
– Analyzing only pre-defined control statements is no longer sufficient to represent code patterns.
– E.g., if (C) A; else B; ⇔ myIf(C, lambdaA, lambdaB);
→ inter-procedural analysis is needed
● Dynamic dispatch makes inter-procedural analysis difficult.
– Esp. in functional + OO + dynamically typed PLs
(no explicit type declarations → hard to analyze dispatches statically)
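The equivalence between a built-in `if` and a user-defined control structure can be sketched in Python as follows. The names `my_if`, `sign_builtin`, and `sign_custom` are hypothetical and only illustrate the `myIf(C, lambdaA, lambdaB)` pattern above.

```python
def my_if(cond, then_fn, else_fn):
    """User-defined control structure: calls exactly one of the two branches."""
    return then_fn() if cond else else_fn()


def sign_builtin(x):
    # Uses the language's pre-defined control statement.
    if x >= 0:
        return "non-negative"
    else:
        return "negative"


def sign_custom(x):
    # Same behavior, but the branching is hidden inside my_if,
    # so an analysis that only looks at `if` statements misses it.
    return my_if(x >= 0, lambda: "non-negative", lambda: "negative")
```

A detector that inspects only `if` statements sees no branching in `sign_custom`; the branch becomes visible only by following the call into `my_if`.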
Idea
Detect clones from an execution trace!
● In a trace, dispatches and control structures have already been expanded (resolved).
● Detected clones are inter-procedural, type-3 clones.
Outline of proposed method
● Execution trace
→ Call tree
→ Contents and Context (for each node)
[Figure: call tree of the example program, with the contents and the context of one node marked. Node labels: main(), os.listdir(), print_extensions_w_for_stmt(), print_extensions_w_map_func(), os.path.splitext(), print, str.join(), get_extensions(), map(), lambda() at line 8. Other labels in the figure: Contents, Context, Clone detection, Clone analysis]
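Example code
[Figure: source code of the example program, not recoverable from the transcript, annotated as follows]
● These two functions are a semantic clone.
– The same functionality: finds the extensions of the given files and prints them out.
– One of them uses a helper function.
● Shared items and differences
– Distinct loops: for vs. map
– In one function, all shared items are contained in the function itself.
– In the other, the shared items are spread across functions.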
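The slide's source code is not preserved in this transcript. The following is a hypothetical reconstruction that is only consistent with the function names appearing in the call tree; the bodies, the `filenames` parameter, and the use of `"\n".join` are assumptions. In the original, `main()` appears to obtain the file names from `os.listdir()`; here they are passed in so the sketch is deterministic.

```python
import os.path


def print_extensions_w_for_stmt(filenames):
    # for-statement version: all shared items (splitext, print)
    # are used directly inside this one function.
    for name in filenames:
        ext = os.path.splitext(name)[1]
        print(ext)


def get_extensions(filenames):
    # Helper function: map + lambda instead of an explicit loop.
    return list(map(lambda name: os.path.splitext(name)[1], filenames))


def print_extensions_w_map_func(filenames):
    # map version: splitext is reached only through the helper.
    print("\n".join(get_extensions(filenames)))


def main(filenames):
    print_extensions_w_for_stmt(filenames)
    print_extensions_w_map_func(filenames)
```

Both printing functions emit one extension per line, but only the first one calls `os.path.splitext()` directly.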
Detection steps
Input: a call tree (← execution trace ← target
program)
1. Extracts contents and context of each node
2. Identifies sets of contents-sharing nodes
3. Removes redundant nodes (filtering with
contexts)
Input
…
call __main__//<module> runpy//_run_code 69
:
load_const __main__//<module> 0
load_const __main__//<module> 12
load_const __main__//<module> 21
load_const __main__//<module> 30
load_const __main__//<module> 39
call __main__//main __main__//<module> 63
:
call __main__//print_extensions_w_for_stmt __main__//main 24
: <list>
call posixpath//splitext __main__//print_extensions_w_for_stmt 25
: 'about.txt'
call genericpath//_splitext posixpath//splitext 18
: 'about.txt' '/' None '.'
load_const genericpath//_splitext 0
return genericpath//_splitext 139
: * 'about' '.txt'
return posixpath//splitext 21
: * 'about' '.txt'
call pygoat.hook/Out/write __main__//print_extensions_w_for_stmt 32
: '.txt'
return pygoat.hook/Out/write 15
call pygoat.hook/Out/write __main__//print_extensions_w_for_stmt 33
: '\n'
return pygoat.hook/Out/write 15
call posixpath//splitext __main__//print_extensions_w_for_stmt 25
: 'pygoat.data'
call genericpath//_splitext posixpath//splitext 18
: 'pygoat.data' '/' None '.'
load_const genericpath//_splitext 0
return genericpath//_splitext 139
: * 'pygoat' '.data'
return posixpath//splitext 21
: * 'pygoat' '.data'
call pygoat.hook/Out/write __main__//print_extensions_w_for_stmt 32
: '.data'
return pygoat.hook/Out/write 15
call pygoat.hook/Out/write __main__//print_extensions_w_for_stmt 33
: '\n'
return pygoat.hook/Out/write 15
call posixpath//splitext __main__//print_extensions_w_for_stmt 25
: 'greeting.md'
call genericpath//_splitext posixpath//splitext 18
: 'greeting.md' '/' None '.'
load_const genericpath//_splitext 0
return genericpath//_splitext 139
: * 'greeting' '.md'
return posixpath//splitext 21
: * 'greeting' '.md'
call pygoat.hook/Out/write __main__//print_extensions_w_for_stmt 32
: '.md'
return pygoat.hook/Out/write 15
call pygoat.hook/Out/write __main__//print_extensions_w_for_stmt 33
Program → Execution trace → Call tree
[Figure: the call tree built from the execution trace. Node labels: main(), os.listdir(), print_extensions_w_for_stmt(), print_extensions_w_map_func(), os.path.splitext(), print, str.join(), get_extensions(), map(), lambda() at line 8]
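Turning such a trace into a call tree can be sketched as below. This assumes, from the sample alone, that a `call` record has the form `call <callee> <caller> <number>`, that a `return` record closes the most recent open call, and that lines starting with `:` carry arguments or return values. `Node` and `build_call_tree` are hypothetical names, not the tool's API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)


def build_call_tree(trace_lines):
    """Build a call tree from 'call' / 'return' records using a stack."""
    root = Node("<root>")
    stack = [root]
    for line in trace_lines:
        tokens = line.split()
        if not tokens or tokens[0] == ":":
            continue  # blank line, or an argument / return-value line
        if tokens[0] == "call":
            node = Node(tokens[1])          # tokens[1] is the callee
            stack[-1].children.append(node)
            stack.append(node)
        elif tokens[0] == "return" and len(stack) > 1:
            stack.pop()                      # close the innermost open call
        # Other records (e.g. load_const) are ignored in this sketch.
    return root
```

Nesting is taken from the order of records rather than from the caller field, which keeps the sketch short but relies on every call having a matching return.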
Step 1.
1. Extracts contents and context of each node
[Figure: the call tree, with each node annotated with its context (the labels of its callers) and its contents (the labels of the functions it calls, directly or indirectly)]
main()
  contents: get_extensions(), map(), lambda() at line 8, os.listdir(), os.path.splitext(), print, print_extensions_w_for_stmt(), print_extensions_w_map_func(), str.join()
print_extensions_w_for_stmt()
  context: main()
  contents: os.path.splitext(), print
print_extensions_w_map_func()
  context: main()
  contents: get_extensions(), map(), lambda() at line 8, os.path.splitext(), print, str.join()
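Step 1 can be sketched as a tree walk that collects ancestor labels (context) and descendant labels (contents). The tuple representation of the tree, the function name, and the exact edges of `call_tree` are assumptions made for illustration; `max_depth=5` mirrors the default depth limit mentioned in the implementation notes. Nodes with the same label are merged.

```python
def contents_and_context(tree, max_depth=5):
    """Return {label: (context, contents)} for every node of a call tree.

    `tree` is a (label, children) pair; children is a list of such pairs.
    """
    result = {}

    def descendants(node, depth):
        # Labels reachable from `node` within `depth` call levels.
        found = set()
        if depth > 0:
            for child in node[1]:
                found.add(child[0])
                found |= descendants(child, depth - 1)
        return found

    def visit(node, ancestors):
        label, children = node
        context, contents = result.setdefault(label, (set(), set()))
        context.update(ancestors)
        contents.update(descendants(node, max_depth))
        for child in children:
            visit(child, ancestors + [label])

    visit(tree, [])
    return result


# Assumed shape of the example's call tree (edges are a guess).
call_tree = ("main()", [
    ("os.listdir()", []),
    ("print_extensions_w_for_stmt()", [
        ("os.path.splitext()", []),
        ("print", []),
    ]),
    ("print_extensions_w_map_func()", [
        ("get_extensions()", [
            ("map()", [
                ("lambda() at line 8", [("os.path.splitext()", [])]),
            ]),
        ]),
        ("str.join()", []),
        ("print", []),
    ]),
])
info = contents_and_context(call_tree)
```

Here `info["print_extensions_w_for_stmt()"]` holds the context and contents sets of that node.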
Step 2.
2. Identifies sets of contents-sharing nodes
Among the nodes from Step 1, main(), print_extensions_w_for_stmt(), and print_extensions_w_map_func() all have os.path.splitext() and print in their contents, so they form a set of contents-sharing nodes.
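The prototype uses Apriori for this step. As a stand-in, the following brute-force sketch enumerates groups of nodes and intersects their contents; it is exponential in the number of nodes and is only meant to show what a "set of contents-sharing nodes" is. The function name and the `min_shared` threshold are assumptions.

```python
from itertools import combinations


def contents_sharing_sets(contents, min_shared=2):
    """Map each maximal group of nodes to the content items they all share."""
    labels = sorted(contents)
    groups = {}
    for size in range(len(labels), 1, -1):  # larger groups first
        for group in combinations(labels, size):
            shared = set.intersection(*(contents[n] for n in group))
            if len(shared) < min_shared:
                continue
            # Skip a group already covered by a larger group
            # that shares exactly the same items.
            if any(set(group) <= set(g) and shared == s
                   for g, s in groups.items()):
                continue
            groups[group] = shared
    return groups


contents = {
    "main()": {
        "get_extensions()", "map()", "lambda() at line 8", "os.listdir()",
        "os.path.splitext()", "print", "print_extensions_w_for_stmt()",
        "print_extensions_w_map_func()", "str.join()",
    },
    "print_extensions_w_for_stmt()": {"os.path.splitext()", "print"},
    "print_extensions_w_map_func()": {
        "get_extensions()", "map()", "lambda() at line 8",
        "os.path.splitext()", "print", "str.join()",
    },
}
groups = contents_sharing_sets(contents)
```

Besides the three-node group, this also reports the pair main() and print_extensions_w_map_func(), which shares a larger item set; filtering with contexts in the next step deals with such groups.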
Step 3.
3. Removes redundant nodes (filtering with contexts)
main() is included in the context of all the other nodes in the set
⇒ main() is redundant and is removed from the set.
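One reading of this rule, sketched under the assumption that a node is redundant exactly when it appears in the context of every other node of its set (`remove_redundant` is a hypothetical name):

```python
def remove_redundant(group, context):
    """Drop nodes that appear in the context of every other node of the group."""
    kept = []
    for node in group:
        others = [n for n in group if n != node]
        if others and all(node in context[n] for n in others):
            continue  # a common caller of all other nodes: redundant
        kept.append(node)
    return kept


context = {
    "main()": set(),
    "print_extensions_w_for_stmt()": {"main()"},
    "print_extensions_w_map_func()": {"main()"},
}
clone_class = remove_redundant(
    ["main()", "print_extensions_w_for_stmt()", "print_extensions_w_map_func()"],
    context,
)
```

A set that shrinks to a single node after this filtering no longer describes a clone and can be discarded.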
Detection result
A clone class:
{ print_extensions_w_map_func(),
print_extensions_w_for_stmt() }
Shared items:
{ os.path.splitext(), print }
print_extensions_w_for_stmt()
  context: main()
  contents: os.path.splitext(), print
print_extensions_w_map_func()
  context: main()
  contents: get_extensions(), map(), lambda() at line 8, os.path.splitext(), print, str.join()
The result is shown as a call graph dagified (merged) by label (DAG = directed acyclic graph).
[Figure: DAG with the regions Context and Contents marked. Node labels: main(), print_extensions_w_for_stmt(), print_extensions_w_map_func(), get_extensions(), print, map(), lambda() at line 8, os.path.splitext()]
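Merging by label can be sketched as collecting (caller, callee) label pairs, so that every label becomes a single graph node. The tuple tree representation and the name `dagify` are assumptions; recursion in the target program would produce cycles, which this sketch does not handle.

```python
def dagify(tree):
    """Merge call-tree nodes with the same label.

    `tree` is a (label, children) pair. Returns the set of
    (caller_label, callee_label) edges of the merged graph.
    """
    edges = set()

    def visit(node):
        label, children = node
        for child in children:
            edges.add((label, child[0]))
            visit(child)

    visit(tree)
    return edges


# "print" occurs twice in the tree but becomes one node with two callers.
tree = ("main()", [
    ("print_extensions_w_for_stmt()", [("print", [])]),
    ("print_extensions_w_map_func()", [("print", [])]),
])
edges = dagify(tree)
```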
Content-and-context analysis for triaging
● Clone class (a), shared items (b), distinct contents (or gap) (c)
● The distinct contents (c) share the same set of (sub-)contents (d) → (c) is another clone class.
● If (c) is merged before (a), (c) will no longer be a gap of (a).
[Figure: call graph with the regions (a), (b), (c), and (d) marked; detected from markdown2's code (described later)]
Tool prototype
Target program + Inputs / Test cases
→ Execution (Python interpreter)
→ Execution trace extraction (via debugging / profiling APIs) → Execution trace
Detection
– Step 1: String balloon generation → String balloons
– Step 2: Frequent item-set mining (Apriori) → Similar sets of contents
– Step 3: Redundant context removal → Code clones
Analysis
– Visualization
– Metrics calculation
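The prototype's own tracing hook is not shown in this transcript. As a minimal sketch of how call and return events could be collected through one of Python's profiling APIs, the following uses `sys.setprofile`; `record_trace`, `helper`, and `target` are hypothetical names, and the event tuples are a simplification of the trace format shown earlier.

```python
import sys


def record_trace(func, *args):
    """Run func(*args) and record call/return events via sys.setprofile."""
    events = []

    def profiler(frame, event, arg):
        if event == "call":
            events.append(("call", frame.f_code.co_name))
        elif event == "return":
            events.append(("return", frame.f_code.co_name))
        elif event == "c_call":
            # For C functions, `arg` is the function object being called.
            events.append(("call", getattr(arg, "__name__", repr(arg))))
        elif event in ("c_return", "c_exception"):
            events.append(("return", getattr(arg, "__name__", repr(arg))))

    sys.setprofile(profiler)
    try:
        result = func(*args)
    finally:
        sys.setprofile(None)
    # Installing and removing the hook may itself be recorded; drop that noise.
    events = [e for e in events if e[1] != "setprofile"]
    return result, events


def helper(text):
    return len(text)


def target():
    return helper("abc")


result, events = record_trace(target)
```

Because the events are recorded at run time, calls made through lambdas or dynamic dispatch appear already resolved to concrete functions.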
● Input: Python source code
● Uses a frequent item-set mining algorithm / implementation
– Apriori (www.borgelt.net/apriori.html)
● Heuristics / optimizations
– Max. depth of contents from a target node (default 5)
– Max. number of content items of a candidate node (default 25)
  ● Filters out nodes with large contents, i.e., nodes near the root of the call tree
– Removal of basic, primitive functions
– ...
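The candidate-size heuristic can be sketched as a filter over the per-node contents. The function name, the `primitives` parameter, and the choice to remove primitive functions before counting items are assumptions; which functions the prototype treats as primitive is not stated here.

```python
def candidate_nodes(contents, max_items=25, primitives=frozenset()):
    """Keep only nodes whose contents are non-empty and small enough."""
    candidates = {}
    for label, items in contents.items():
        items = set(items) - primitives   # drop basic, primitive functions
        if 0 < len(items) <= max_items:   # nodes near the root are too large
            candidates[label] = items
    return candidates


contents = {
    "root()": {f"f{i}()" for i in range(30)},
    "small()": {"f1()", "f2()", "len"},
    "only_primitives()": {"len"},
}
candidates = candidate_nodes(contents, max_items=25, primitives=frozenset({"len"}))
```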
Content-and-context clone on call graph
Preliminary experiment
Clone detection was run for each value of the parameter “Max. number of content items of a candidate node”: 10, 15, …, 30.

Target product | Collection of execution sequence       | # function calls | # unique labels
markdown2      | Running 144 unit tests                 | 227K             | 1128
wxPython       | Invoking a sample program “pySketch”   | 483K             | 1058
Results
[Figures: two slides of result plots, not recoverable from the transcript]
Exponential in the number of contents
Too “peaky” for practical use
Summary
● Code-clone detection from dynamic information, i.e., an execution trace
– Aimed at functional / dynamically typed PLs
● Context-and-content analysis for triage
● Algorithm, implementation, heuristics
● Preliminary experiment
– Targets: markdown2 and wxPython
– Peaky, sensitive to the parameter “Max. number of content items of a candidate node” → needs refinement
Omitted here; refer to the paper:
● Threats to validity
● Future plan
Eukaryotic Transcription Presentation.pptx
 
Randomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNERandomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNE
 

An Execution-Semantic and Content-and-Context-Based Code-Clone Detection and Analysis

  • 1. An {Execution-Semantic, Content-and-Context}-Based Code-Clone {Detection,Analysis}
      Toshihiro Kamiya, Future University Hakodate, kamiya@fun.ac.jp
      Toshihiro Kamiya: An Execution-Semantic and Content-and-Context-Based Code-Clone Detection and Analysis, Proceedings of the 9th IEEE International Workshop on Software Clones (IWSC'15), pp. 1-7 (2015).
  • 2. TOC
      ● Problem/Motivation
      ● Outline of proposed method
      ● Example
      ● Algorithm of clone detection
      ● Visualization
      ● Implementation
      ● Preliminary experiment
  • 3. The problems / Motivation
      ● In functional PLs, developers can define their own control structures.
          – Analyzing only pre-defined control statements is no longer sufficient to represent code patterns.
          – E.g., if (C) A; else B; ⇔ myIf(C, lambdaA, lambdaB); → inter-procedural analysis
      ● Dynamic dispatching makes inter-procedural analysis difficult.
          – Esp. in functional + OO + dynamically typed PLs (no explicit type declarations → hard to analyze dispatches statically)
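The `if` vs. `myIf` equivalence on this slide can be illustrated in Python. This is only a minimal sketch: `my_if`, `sign_builtin`, and `sign_custom` are hypothetical names, not code from the talk. It shows why an analysis limited to built-in control statements would miss the branching in the second function.

```python
# Illustrative only: `my_if` is a hypothetical user-defined control structure.
def my_if(condition, then_branch, else_branch):
    """Call one of two zero-argument callables depending on `condition`."""
    return then_branch() if condition else else_branch()


def sign_builtin(x):
    # Branching is expressed with the language's pre-defined `if` statement.
    if x >= 0:
        return "non-negative"
    else:
        return "negative"


def sign_custom(x):
    # Same behavior, but the branching is hidden inside a call to my_if,
    # so it only becomes visible through inter-procedural analysis.
    return my_if(x >= 0, lambda: "non-negative", lambda: "negative")
```

Both functions return the same value for every input, yet only `sign_builtin` contains an `if` statement in its own body.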
  • 4. Idea
      Detect clones from an execution trace!
      ● Dispatches and control structures have been expanded (resolved).
      ● Detected clones are inter-procedural, type 3 clones.
  • 5. Outline of proposed method
      ● Execution trace → Call tree → Contents and Context (for each node)
      ● [Figure: call tree with nodes main(), os.listdir(), print_extensions_w_for_stmt(), print_extensions_w_map_func(), os.path.splitext(), print, str.join(), get_extensions(), map(), lambda() at line 8; labels: contents, context, Clone detection, Clone analysis]
  • 7. These two functions are... A helper function
  • 8. ...a semantic clone. The same functionality: finds extensions of given files and prints them out
  • 10. and differences
      ● Distinct loops: for vs. map
      ● In one function, all shared items are contained in the function itself.
      ● In the other, the shared items are spread across several functions.
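The source code of the two example functions is not part of this transcript. The following is a hypothetical reconstruction, inferred only from the function names in the call tree and from the description "finds extensions of given files and prints them out". It is written in Python 3 syntax, and the bodies and parameter names are assumptions.

```python
import os


def print_extensions_w_for_stmt(filenames):
    # Variant 1: an explicit for statement; splitext and print are called
    # directly from this function.
    for name in filenames:
        _, ext = os.path.splitext(name)
        print(ext)


def get_extensions(filenames):
    # Helper function (assumed body): the loop is expressed with map and a lambda.
    return map(lambda name: os.path.splitext(name)[1], filenames)


def print_extensions_w_map_func(filenames):
    # Variant 2: the same output, but splitext is reached through
    # get_extensions -> map -> lambda.
    print("\n".join(get_extensions(filenames)))


def main(directory="."):
    filenames = sorted(os.listdir(directory))
    print_extensions_w_for_stmt(filenames)
    print_extensions_w_map_func(filenames)
```

Under these assumptions both variants print one extension per line, which is what makes them semantic clones despite the different loop constructs.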
  • 11. Detection steps
      Input: a call tree (← execution trace ← target program)
      1. Extracts contents and context of each node
      2. Identifies sets of contents-sharing nodes
      3. Removes redundant nodes (filtering with contexts)
  • 12. Input
      Program → Execution trace → Call tree
      Execution trace (excerpt):
          …
          call __main__//<module> runpy//_run_code 69 :
          load_const __main__//<module> 0
          load_const __main__//<module> 12
          load_const __main__//<module> 21
          load_const __main__//<module> 30
          load_const __main__//<module> 39
          call __main__//main __main__//<module> 63 :
          call __main__//print_extensions_w_for_stmt __main__//main 24 : <list>
          call posixpath//splitext __main__//print_extensions_w_for_stmt 25 : 'about.txt'
          call genericpath//_splitext posixpath//splitext 18 : 'about.txt' '/' None '.'
          load_const genericpath//_splitext 0
          return genericpath//_splitext 139 : * 'about' '.txt'
          return posixpath//splitext 21 : * 'about' '.txt'
          call pygoat.hook/Out/write __main__//print_extensions_w_for_stmt 32 : '.txt'
          return pygoat.hook/Out/write 15
          call pygoat.hook/Out/write __main__//print_extensions_w_for_stmt 33 : '\n'
          return pygoat.hook/Out/write 15
          call posixpath//splitext __main__//print_extensions_w_for_stmt 25 : 'pygoat.data'
          call genericpath//_splitext posixpath//splitext 18 : 'pygoat.data' '/' None '.'
          load_const genericpath//_splitext 0
          return genericpath//_splitext 139 : * 'pygoat' '.data'
          return posixpath//splitext 21 : * 'pygoat' '.data'
          call pygoat.hook/Out/write __main__//print_extensions_w_for_stmt 32 : '.data'
          return pygoat.hook/Out/write 15
          call pygoat.hook/Out/write __main__//print_extensions_w_for_stmt 33 : '\n'
          return pygoat.hook/Out/write 15
          call posixpath//splitext __main__//print_extensions_w_for_stmt 25 : 'greeting.md'
          call genericpath//_splitext posixpath//splitext 18 : 'greeting.md' '/' None '.'
          load_const genericpath//_splitext 0
          return genericpath//_splitext 139 : * 'greeting' '.md'
          return posixpath//splitext 21 : * 'greeting' '.md'
          call pygoat.hook/Out/write __main__//print_extensions_w_for_stmt 32 : '.md'
          return pygoat.hook/Out/write 15
          call pygoat.hook/Out/write __main__//print_extensions_w_for_stmt 33
      [Figure: call tree with nodes main(), os.listdir(), print_extensions_w_for_stmt(), print_extensions_w_map_func(), os.path.splitext(), print, str.join(), get_extensions(), map(), lambda() at line 8]
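One way the trace could be turned into a call tree is sketched below. It assumes that each `call` line starts with the word `call` followed by the callee label, that each `return` line closes the most recent open call, and that other events such as `load_const` leave the call stack unchanged. `Node`, `build_call_tree`, and `render` are hypothetical names; the prototype's actual parser is not shown in the slides.

```python
class Node:
    """A call-tree node labeled with the callee name."""

    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def build_call_tree(trace_lines):
    root = Node("<root>")
    current = root
    for line in trace_lines:
        fields = line.split()
        if not fields:
            continue
        event = fields[0]
        if event == "call":
            # Assumed layout: call <callee> <caller> <line number> : <args>
            current = Node(fields[1], current)
        elif event == "return" and current.parent is not None:
            current = current.parent
        # Other events (e.g. load_const) do not change the call stack.
    return root


def render(node, depth=0):
    """Return the tree as indented text lines, one per node."""
    lines = ["  " * depth + node.label]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# A few lines taken from the trace excerpt on this slide.
TRACE = """\
call __main__//main __main__//<module> 63 :
call __main__//print_extensions_w_for_stmt __main__//main 24 : <list>
call posixpath//splitext __main__//print_extensions_w_for_stmt 25 : 'about.txt'
call genericpath//_splitext posixpath//splitext 18 : 'about.txt' '/' None '.'
load_const genericpath//_splitext 0
return genericpath//_splitext 139 : * 'about' '.txt'
return posixpath//splitext 21 : * 'about' '.txt'
call pygoat.hook/Out/write __main__//print_extensions_w_for_stmt 32 : '.txt'
return pygoat.hook/Out/write 15
""".splitlines()
```

Calling `print("\n".join(render(build_call_tree(TRACE))))` shows the nesting: `splitext` and `write` become siblings under `print_extensions_w_for_stmt`.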
  • 13. Step 1. Extracts contents and context of each node
      [Figure: call tree, as on slide 5]
      ● main()
          – Context: (none)
          – Contents: get_extensions(), map(), lambda() at line 8, os.listdir(), os.path.splitext(), print, print_extensions_w_for_stmt(), print_extensions_w_map_func(), str.join()
      ● print_extensions_w_for_stmt()
          – Context: main()
          – Contents: os.path.splitext(), print
      ● print_extensions_w_map_func()
          – Context: main()
          – Contents: get_extensions(), map(), lambda() at line 8, os.path.splitext(), print, str.join()
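Step 1 can be sketched as a single tree walk, taking the contents of a node to be the labels of all its descendants and the context to be the labels of all its ancestors. The nested-tuple tree below is an assumption: its parent-child edges are inferred from the node names and the listed contents, with the label `os.path.splitext()` as it appears in the call tree. Nodes with the same label are merged in the result.

```python
# Assumed call tree as (label, children) tuples; edges are inferred, not given.
CALL_TREE = ("main()", [
    ("os.listdir()", []),
    ("print_extensions_w_for_stmt()", [
        ("os.path.splitext()", []),
        ("print", []),
    ]),
    ("print_extensions_w_map_func()", [
        ("get_extensions()", [
            ("map()", [
                ("lambda() at line 8", [
                    ("os.path.splitext()", []),
                ]),
            ]),
        ]),
        ("str.join()", []),
        ("print", []),
    ]),
])


def contents_and_context(tree):
    """Return {label: (context, contents)}, merging nodes that share a label."""
    table = {}

    def visit(node, ancestors):
        label, children = node
        contents = set()
        for child in children:
            contents.add(child[0])
            contents |= visit(child, ancestors + [label])
        context_set, contents_set = table.setdefault(label, (set(), set()))
        context_set.update(ancestors)
        contents_set.update(contents)
        return contents

    visit(tree, [])
    return table
```

With this tree, `contents_and_context(CALL_TREE)["print_extensions_w_for_stmt()"]` gives the context `{"main()"}` and the contents `{"os.path.splitext()", "print"}`.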
  • 14. Step 2. Identifies sets of contents-sharing nodes
      [Table: contents and context of main(), print_extensions_w_for_stmt(), and print_extensions_w_map_func(), as on slide 13]
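Step 2 can be illustrated with a naive exhaustive search over groups of nodes. This is a stand-in chosen for readability, not the Apriori-based frequent item-set mining used by the prototype, and it is exponential in the number of nodes. `contents_sharing_sets` and its `min_shared` threshold are hypothetical.

```python
from itertools import combinations

# Contents of the three example nodes (label os.path.splitext() as in the call tree).
CONTENTS = {
    "main()": {
        "get_extensions()", "map()", "lambda() at line 8", "os.listdir()",
        "os.path.splitext()", "print", "print_extensions_w_for_stmt()",
        "print_extensions_w_map_func()", "str.join()",
    },
    "print_extensions_w_for_stmt()": {"os.path.splitext()", "print"},
    "print_extensions_w_map_func()": {
        "get_extensions()", "map()", "lambda() at line 8",
        "os.path.splitext()", "print", "str.join()",
    },
}


def contents_sharing_sets(contents, min_shared=2):
    """Map each shared item set to the largest group of nodes sharing it."""
    result = {}
    labels = sorted(contents)
    for size in range(2, len(labels) + 1):
        for group in combinations(labels, size):
            shared = frozenset(set.intersection(*(contents[n] for n in group)))
            if len(shared) < min_shared:
                continue
            # Keep only the largest group found for this exact item set.
            if len(group) > len(result.get(shared, ())):
                result[shared] = set(group)
    return result
```

For the example, the item set `{os.path.splitext(), print}` is shared by all three nodes, including `main()`, which is why a filtering step is still needed afterwards.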
  • 15. Step 3. Removes redundant nodes (filtering with contexts)
      [Table: contents and context of main(), print_extensions_w_for_stmt(), and print_extensions_w_map_func(), as on slide 13]
      ● Included in the context of all the other nodes in the set ⇒ redundant
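A minimal sketch of the context filter follows. It reads the rule as "a node is redundant if it appears in the context of every other node in the set", which is an interpretation of the slide rather than a confirmed definition. `remove_redundant` is a hypothetical name.

```python
# Contexts of the three example nodes, as listed for Step 1.
CONTEXT = {
    "main()": set(),
    "print_extensions_w_for_stmt()": {"main()"},
    "print_extensions_w_map_func()": {"main()"},
}


def remove_redundant(nodes, context):
    """Drop every node that is in the context of all other nodes in the set."""
    kept = set()
    for node in nodes:
        others = nodes - {node}
        if others and all(node in context[other] for other in others):
            continue  # an ancestor of everything else in the set: redundant
        kept.add(node)
    return kept
```

Applied to the set of all three example nodes, this drops `main()` and leaves the two `print_extensions_*` functions.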
  • 16. Detection result
      ● A clone class: { print_extensions_w_map_func(), print_extensions_w_for_stmt() }
      ● Shared items: { os.path.splitext(), print }
      ● print_extensions_w_for_stmt()
          – Context: main()
          – Contents: os.path.splitext(), print
      ● print_extensions_w_map_func()
          – Context: main()
          – Contents: get_extensions(), map(), lambda() at line 8, os.path.splitext(), print, str.join()
  • 17. Detection result
      ● A clone class: { print_extensions_w_map_func(), print_extensions_w_for_stmt() }
      ● Shared items: { os.path.splitext(), print }
      ● Dagified (merged) by label (DAG = directed acyclic graph)
      ● [Figure: DAG with nodes main(), print_extensions_w_for_stmt(), print_extensions_w_map_func(), get_extensions(), print, map(), lambda() at line 8, os.path.splitext(); regions labeled Context and Contents]
  • 18. Content-and-context analysis for triaging
      ● Clone class (a), shared items (b), distinct contents (or gap) (c)
      ● The distinct contents (c) share the same set of (sub-)contents (d) → (c) is another clone class.
      ● If (c) is merged before (a), (c) will no longer be a gap of (a).
      ● [Figure: parts (a), (b), (c), (d), detected from markdown2's code (described later)]
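The distinct contents (gap) of each clone can be sketched as a set difference between the contents of a node and the shared items of its clone class. The sketch reuses the earlier example functions instead of the markdown2 clone on this slide. `gaps` is a hypothetical helper, and `os.path.splitext()` is the label as it appears in the call tree.

```python
# Contents of the two nodes in the example clone class.
CLONE_CONTENTS = {
    "print_extensions_w_for_stmt()": {"os.path.splitext()", "print"},
    "print_extensions_w_map_func()": {
        "get_extensions()", "map()", "lambda() at line 8",
        "os.path.splitext()", "print", "str.join()",
    },
}
SHARED_ITEMS = {"os.path.splitext()", "print"}


def gaps(contents, shared):
    """Return, per clone, the content items not among the shared items."""
    return {node: items - shared for node, items in contents.items()}
```

Here the for-statement variant has an empty gap, while the map variant's gap holds the helper, `map()`, the lambda, and `str.join()`.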
  • 19. Tool prototype
      [Figure: tool pipeline. Target program and inputs / test cases → execution (Python interpreter, debugging / profiling APIs) → execution trace → execution trace extraction → string balloon generation → string balloons → frequent item-set mining (Apriori) → similar sets of contents → redundant context removal → code clones. Detection is labeled Step 1, Step 2, Step 3; analysis covers visualization and metrics calculation.]
      ● Input: Python source code
      ● Uses a frequent item-set mining algorithm / implementation
          – Apriori (www.borgelt.net/apriori.html)
      ● Heuristics / optimizations
          – Max. depth of contents from a target node (default 5)
          – Max. number of content items of a candidate node (default 25)
              ● Filters out the nodes with large contents, i.e., nodes near the root of the call tree
          – Removal of basic, primitive functions
          – ...
      ● Content-and-context clone on call graph
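The two numeric heuristics can be sketched as a depth-limited collection of contents followed by a size check. The defaults 5 and 25 come from the slide. How the prototype counts depth is not stated, so this sketch assumes depth 1 means direct callees only. `contents_up_to_depth` and `is_candidate` are hypothetical names.

```python
def contents_up_to_depth(node, max_depth=5):
    """Collect labels of descendants at most `max_depth` levels below `node`.

    `node` is a (label, children) tuple; depth 1 means direct callees only
    (an assumption about how depth is counted).
    """
    _, children = node
    if max_depth <= 0:
        return set()
    items = set()
    for child in children:
        items.add(child[0])
        items |= contents_up_to_depth(child, max_depth - 1)
    return items


def is_candidate(node, max_depth=5, max_items=25):
    """Reject nodes whose (depth-limited) contents exceed `max_items`."""
    return len(contents_up_to_depth(node, max_depth)) <= max_items


# Toy tree: f() calls g() and print; g() calls h(); h() calls i().
TOY_TREE = ("f()", [("g()", [("h()", [("i()", [])])]), ("print", [])])
```

Nodes near the root of a call tree tend to accumulate many content items, so the size check removes them from the candidate set before mining.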
  • 20. Preliminary experiment
      ● Run for each value of the parameter “Max. number of content items of a candidate node”: 10, 15, …, 30.
      Target product | Collection of exe. seq.              | # function calls | # unique labels
      markdown2      | Running 144 unit tests               | 227K             | 1128
      wxPython       | Invoking a sample program “pySketch” | 483K             | 1058
  • 22. Results
      ● Exponential in the number of contents
      ● Too “peaky” for practical use
  • 23. Summary
      ● Code-clone detection from dynamic information (an execution trace)
          – Aimed at functional / dynamically typed PLs
      ● Context-and-content analysis for triage
      ● Algorithm, implementation, heuristics
      ● Preliminary experiment
          – Targets: markdown2 and wxPython
          – Peaky, sensitive to the parameter “Max. number of content items of a candidate node” → Needs refinements
      Omitted here; refer to the paper:
      ● Threats to validity
      ● Future plan