Introducing Parameter Sensitivity to
Dynamic Code-Clone Analysis Methods
Toshihiro Kamiya
Interdisciplinary Graduate School of Sci. & Eng., Shimane Univ.
kamiya@cis.shimane-u.ac.jp
10th Int'l Workshop on Software Clones, Osaka
March 15, 2016
Outline
● What is a dynamic code-clone analysis?
  – Detection
  – Visualization
  – Samples
● Parameter sensitivity
  – Possible alternative techniques
[Position Paper] Toshihiro Kamiya, Introducing Parameter Sensitivity to Dynamic
Code-Clone Analysis Methods, Proc. 10th International Workshop on Software Clones
(IWSC 2016), pp. 19-20, 2016.
Dynamic code-clone analysis
● Definition:
  – Use dynamic information:
    ● To detect code clones
    ● To visualize such code clones
● Aims/applications:
  – Detect code clones between a code fragment and its restructured (refactored) version
    ● Observe the evolution of code clones in clone management
  – Find code clones with similarity in deep semantics (or behavior)
Detection method
● Detection steps
  1. Collect execution trace(s) by running the target program(s)
  2. Find sub-sequences of similar method invocations
  3. Map such sub-sequences to code fragments
The details are described in:
Toshihiro Kamiya, "An Execution-Semantic and Content-and-Context-Based Code-Clone Detection and Analysis," IWSC 2015, pp. 1-7 (Mar. 6, 2015).
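Step 1 can be sketched for a Python target as follows. This is a minimal illustration using the standard `sys.setprofile` hook, not the tool used in the talk; the helper name `collect_trace` and the `(depth, module, function)` record layout are assumptions made for this example.

```python
import sys


def collect_trace(func, *args, **kwargs):
    """Run func and record one (depth, module, function) entry per Python-level call."""
    trace = []
    depth = 0

    def profiler(frame, event, arg):
        nonlocal depth
        if event == "call":
            module = frame.f_globals.get("__name__", "?")
            trace.append((depth, module, frame.f_code.co_name))
            depth += 1
        elif event == "return":
            depth -= 1
        # c_call / c_return events (built-in functions) are ignored in this sketch

    sys.setprofile(profiler)
    try:
        func(*args, **kwargs)
    finally:
        sys.setprofile(None)
    return trace
```

The recorded depth is enough to rebuild a call tree later. Calls into C-implemented functions are skipped here, which a real tracer would probably want to include.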
Detection method (implementation)
An implementation of step "2. Find sub-sequences of similar method invocations"
Revised from the IWSC 2015 version.
● Just AN implementation. Other data structures/algorithms could be used.
2-1. Generate a call tree from the execution trace.
2-2. For each node of the call tree, generate an SB (string balloon) data structure, which includes:
  ● A target node
  ● Context (location): path from the root to the target node
  ● Contents: set of nodes called by the target (both directly and indirectly)
2-3. Find sets of SBs having similar contents.
  ● With a frequent item-set mining algorithm (hyper cubic decomposition [Uno03])
[Uno03] T. Uno, et al., An Efficient Algorithm for Enumerating Closed Patterns in Transaction Databases, Discovery Science, LNCS 3245, pp. 16-31, 2003.
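Steps 2-1 to 2-3 can be sketched roughly as below. The names `Node`, `build_call_tree`, `string_balloons` and `find_similar` are hypothetical, the trace is assumed to be a list of `(depth, label)` pairs in call order, and step 2-3 is replaced by a naive pairwise set intersection rather than the frequent item-set mining algorithm of [Uno03].

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)


def build_call_tree(trace):
    """2-1: turn (depth, label) entries, in call order, into a tree under a dummy root."""
    root = Node("<root>")
    stack = [root]                  # stack[-1] is the caller of the next entry
    for depth, label in trace:
        node = Node(label)
        del stack[depth + 1:]       # unwind to the caller at this depth
        stack[-1].children.append(node)
        stack.append(node)
    return root


def string_balloons(root):
    """2-2: one (context, target, contents) triple per node of the call tree."""
    balloons = []

    def visit(node, path):
        contents = set()
        for child in node.children:
            contents.add(child.label)
            contents |= visit(child, path + (node.label,))
        balloons.append((path, node.label, frozenset(contents)))
        return contents

    for child in root.children:
        visit(child, ())
    return balloons


def find_similar(balloons, min_shared=2):
    """2-3 (simplified): pairs of balloons sharing at least min_shared callees."""
    pairs = []
    for a, b in combinations(balloons, 2):
        path_a, path_b = a[0] + (a[1],), b[0] + (b[1],)
        shorter = min(len(path_a), len(path_b))
        if path_a[:shorter] == path_b[:shorter]:
            continue                # skip ancestor/descendant (or identical) label paths
        shared = a[2] & b[2]
        if len(shared) >= min_shared:
            pairs.append((a[1], b[1], shared))
    return pairs
```

The pairwise loop is quadratic in the number of nodes, so it only serves to show what "similar contents" means; an item-set miner is what makes this step scale.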
Visualization method
● Code fragments (of a clone class)
  → "root" nodes of sub-graphs in the call graph
● Similarity
  → Methods called commonly in the sub-graphs
● Differences
  → Methods called solely in one sub-graph
[Figure: example call graph with nodes main(), print_extensions_w_for_stmt(), print_extensions_w_map_func(), get_extensions(), print, map(), lambda() at line 8, os.path.splitext()]
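The split into "similarity" and "differences" amounts to set operations on the contents of each sub-graph. The sketch below uses the node names from the figure, but which root calls which method is an assumption made for illustration, as is the helper name `compare_subgraphs`.

```python
def compare_subgraphs(contents_by_root):
    """Return (methods common to every sub-graph, methods unique to each root)."""
    common = set.intersection(*contents_by_root.values())
    sole = {}
    for root, contents in contents_by_root.items():
        others = set().union(
            *(c for r, c in contents_by_root.items() if r != root)
        )
        sole[root] = contents - others
    return common, sole


# Assumed call contents for the two roots shown in the figure.
contents = {
    "print_extensions_w_for_stmt": {"get_extensions", "os.path.splitext", "print"},
    "print_extensions_w_map_func": {"get_extensions", "os.path.splitext", "print",
                                    "map", "lambda"},
}
common, sole = compare_subgraphs(contents)
```

Under these assumed contents, `common` holds the methods that would be drawn as shared nodes, and `sole` holds the ones drawn under a single root.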
A sample code clone – code fragments
Applied to two CLI HTTP-client tools
  – prog 1: https://github.com/chrislongo/HttpShell
  – prog 2: https://pypi.python.org/pypi/httpie
Each takes a URL as input and outputs HTML text.
Calling the same function:
pygments.highlight()
Similar?
- Yes.
But why?
A sample code clone – call graph
[Figure: call graph of the two code fragments. Labeled nodes: 1../AnsiLogger/print_data, 2../ColorFormatter/format_body, pygments//highlight, pygments//lex, pygments//format, 1.pygments.lexers//guess_lexer, 2../ColorFormatter/get_lexer, 1.pygments.formatters.terminal/TerminalFormatter/__init__, pygments.lexers//_load_lexers, pygments.lexer//__call__, pygments.lexer//streamer, pygments.util//get_bool_opt, re//_compile, StringIO/StringIO/write]
Because these method calls of guess_lexer() and get_lexer() have common contents.
● But this example is the best one from the experiment.
● Not always so lucky in general practice ...
A bad example from detection result
● Code fragments calling utility functions are sometimes detected as a code clone ☹
  – Code fragments of a clone class
    ● cli.py (an entry point) from prog 2
    ● _get_proxy_info() from prog 1
    ● should_bypass_proxy() from prog 2
  – They call regular-expression and associative-array functions, i.e. utility functions
  – Results in a false positive: cli.py and the others
    (True positive: _get_proxy_info() and should_bypass_proxy())
An idea: Parameter sensitivity
● The execution trace also includes the argument values of each method invocation
● Add argument value(s) to node labels
  – re//_compile.'[^A-Za-z0-9.]+' or
  – re//_compile.'[^-]+' in place of re//_compile
  to distinguish these calls of utility functions.
● Need to introduce value semantics (may be challenging)
  – '[0-9]' == '\d' (when interpreted as a regular expression)
  – 0xff == 255
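A parameter-sensitive label can be sketched as below, assuming the trace records each argument as source-like text. The function names `normalize` and `sensitive_label` are hypothetical. Only integer literals are normalized; deciding that two regular expressions are equivalent is left open.

```python
def normalize(arg_text):
    """Map textual integer literals with the same value to one canonical form."""
    try:
        return str(int(arg_text, 0))   # base 0 accepts forms such as 0xff, 0o377, 255
    except ValueError:
        return arg_text                # anything else (e.g. a regex) is kept as written


def sensitive_label(name, arg_texts):
    """Append normalized argument values to a node label such as 're//_compile'."""
    if not arg_texts:
        return name
    return name + "." + ",".join(normalize(a) for a in arg_texts)


a = sensitive_label("re//_compile", ["'[^A-Za-z0-9.]+'"])
b = sensitive_label("re//_compile", ["'[^-]+'"])
```

With such labels the two `re//_compile` calls become different items for the mining step, so sharing them no longer counts as common contents.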
Alternative techniques
● Threshold on the ratio of shared nodes
  – Yet another parameter for clone detection ☹
    ● Depends on stack depth?
● Pre-defined, manual classification of “Utility” functions ☹
  – What about target code that includes new (unknown) libraries?
● Considering the order of method invocations
  – Such as the Smith-Waterman algorithm (applied to static clone detection in [Murakami13])
  – Yet another parameter of the tool ☹
    ● Depends on the length of code fragments?
[Murakami13] H. Murakami, K. Hotta, Y. Higo, H. Igaki, Gapped Code Clone Detection with Lightweight Source Code Analysis, ICPC 2013, pp. 93-102, 2013.
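Two of these alternatives can be sketched in a few lines. The first function computes a ratio of shared nodes (here the Jaccard index, which is one possible choice of ratio). The second computes a Smith-Waterman local-alignment score over two sequences of method labels. The function names and the scoring values (match 2, mismatch -1, gap -1) are assumptions for illustration, not settings taken from [Murakami13].

```python
def shared_ratio(contents_a, contents_b):
    """Ratio of shared callees to all callees of two fragments (Jaccard index)."""
    union = contents_a | contents_b
    if not union:
        return 0.0
    return len(contents_a & contents_b) / len(union)


def smith_waterman(seq_a, seq_b, match=2, mismatch=-1, gap=-1):
    """Best local-alignment score between two sequences of method labels."""
    best = 0
    prev = [0] * (len(seq_b) + 1)
    for x in seq_a:
        cur = [0]
        for j, y in enumerate(seq_b, start=1):
            score = max(
                0,
                prev[j - 1] + (match if x == y else mismatch),  # align x with y
                prev[j] + gap,                                  # gap in seq_b
                cur[j - 1] + gap,                               # gap in seq_a
            )
            cur.append(score)
            best = max(best, score)
        prev = cur
    return best
```

Both functions return a number that still has to be compared against a cut-off, which is the "yet another parameter" concern raised above.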
Summary
● A dynamic code-clone detection method
  – Based on frequent item-set mining of method invocations
● Utility functions (methods) cause false positives.
● Possible solutions/open questions
  – parameter sensitivity,
  – threshold on the ratio of shared nodes,
  – manual classification of “Utility” functions,
  – order of method invocations
Another bad example
● format_headers() of prog 2
● print_data() of prog 1
March 15, 2016 10th Int'l Workshop on Software Clones, Osaka 23

More Related Content

What's hot

Storage class in c
Storage class in cStorage class in c
Storage class in ckash95
 
[C++ korea] effective modern c++ study item 3 understand decltype +이동우
[C++ korea] effective modern c++ study   item 3 understand decltype +이동우[C++ korea] effective modern c++ study   item 3 understand decltype +이동우
[C++ korea] effective modern c++ study item 3 understand decltype +이동우Seok-joon Yun
 
11 lec 11 storage class
11 lec 11 storage class11 lec 11 storage class
11 lec 11 storage classkapil078
 
Storage Classes and Functions
Storage Classes and FunctionsStorage Classes and Functions
Storage Classes and FunctionsJake Bond
 
Introduction of flex
Introduction of flexIntroduction of flex
Introduction of flexvip_du
 
Storage Class in C Progrmming
Storage Class in C Progrmming Storage Class in C Progrmming
Storage Class in C Progrmming Kamal Acharya
 
storage class
storage classstorage class
storage classstudent
 
Storage Class Specifiers in C++
Storage Class Specifiers in C++Storage Class Specifiers in C++
Storage Class Specifiers in C++Reddhi Basu
 
TDD in C - Recently Used List Kata
TDD in C - Recently Used List KataTDD in C - Recently Used List Kata
TDD in C - Recently Used List KataOlve Maudal
 
A Survey Of Aspect Mining Approaches
A Survey Of Aspect Mining ApproachesA Survey Of Aspect Mining Approaches
A Survey Of Aspect Mining Approacheskim.mens
 
Model Comparison for Delta-Compression
Model Comparison for Delta-CompressionModel Comparison for Delta-Compression
Model Comparison for Delta-CompressionMarkus Scheidgen
 

What's hot (20)

Storage classes
Storage classesStorage classes
Storage classes
 
Storage class in c
Storage class in cStorage class in c
Storage class in c
 
Storage classes in C
Storage classes in CStorage classes in C
Storage classes in C
 
Storage class in C Language
Storage class in C LanguageStorage class in C Language
Storage class in C Language
 
[C++ korea] effective modern c++ study item 3 understand decltype +이동우
[C++ korea] effective modern c++ study   item 3 understand decltype +이동우[C++ korea] effective modern c++ study   item 3 understand decltype +이동우
[C++ korea] effective modern c++ study item 3 understand decltype +이동우
 
Java 8
Java 8Java 8
Java 8
 
Storage class
Storage classStorage class
Storage class
 
11 lec 11 storage class
11 lec 11 storage class11 lec 11 storage class
11 lec 11 storage class
 
RAII and ScopeGuard
RAII and ScopeGuardRAII and ScopeGuard
RAII and ScopeGuard
 
Storage Classes and Functions
Storage Classes and FunctionsStorage Classes and Functions
Storage Classes and Functions
 
Introduction of flex
Introduction of flexIntroduction of flex
Introduction of flex
 
Storage classes
Storage classesStorage classes
Storage classes
 
Storage Class in C Progrmming
Storage Class in C Progrmming Storage Class in C Progrmming
Storage Class in C Progrmming
 
storage class
storage classstorage class
storage class
 
C q 3
C q 3C q 3
C q 3
 
Storage Class Specifiers in C++
Storage Class Specifiers in C++Storage Class Specifiers in C++
Storage Class Specifiers in C++
 
TDD in C - Recently Used List Kata
TDD in C - Recently Used List KataTDD in C - Recently Used List Kata
TDD in C - Recently Used List Kata
 
Virtual Functions
Virtual FunctionsVirtual Functions
Virtual Functions
 
A Survey Of Aspect Mining Approaches
A Survey Of Aspect Mining ApproachesA Survey Of Aspect Mining Approaches
A Survey Of Aspect Mining Approaches
 
Model Comparison for Delta-Compression
Model Comparison for Delta-CompressionModel Comparison for Delta-Compression
Model Comparison for Delta-Compression
 

Similar to Introducing Parameter Sensitivity to Dynamic Code-Clone Analysis Methods

Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...
Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...
Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...IRJET Journal
 
Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...
Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...
Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...IRJET Journal
 
The Use of Static Code Analysis When Teaching or Developing Open-Source Software
The Use of Static Code Analysis When Teaching or Developing Open-Source SoftwareThe Use of Static Code Analysis When Teaching or Developing Open-Source Software
The Use of Static Code Analysis When Teaching or Developing Open-Source SoftwareAndrey Karpov
 
(6) c sharp introduction_advanced_features_part_i
(6) c sharp introduction_advanced_features_part_i(6) c sharp introduction_advanced_features_part_i
(6) c sharp introduction_advanced_features_part_iNico Ludwig
 
Vol 16 No 2 - July-December 2016
Vol 16 No 2 - July-December 2016Vol 16 No 2 - July-December 2016
Vol 16 No 2 - July-December 2016ijcsbi
 
Assignment1 B 0
Assignment1 B 0Assignment1 B 0
Assignment1 B 0Mahmoud
 
A Project Based Lab Report On AMUZING JOKE
A Project Based Lab Report On AMUZING JOKEA Project Based Lab Report On AMUZING JOKE
A Project Based Lab Report On AMUZING JOKEDaniel Wachtel
 
CORRELATING FEATURES AND CODE BY DYNAMIC AND SEMANTIC ANALYSIS
CORRELATING FEATURES AND CODE BY DYNAMIC AND SEMANTIC ANALYSISCORRELATING FEATURES AND CODE BY DYNAMIC AND SEMANTIC ANALYSIS
CORRELATING FEATURES AND CODE BY DYNAMIC AND SEMANTIC ANALYSISijseajournal
 
compressed.tracemonkey-pldi-09.pdf
compressed.tracemonkey-pldi-09.pdfcompressed.tracemonkey-pldi-09.pdf
compressed.tracemonkey-pldi-09.pdfGinaMartinezTacuchi
 
compressed.tracemonkey-pldi-09.pdf
compressed.tracemonkey-pldi-09.pdfcompressed.tracemonkey-pldi-09.pdf
compressed.tracemonkey-pldi-09.pdfMaherEmad1
 
Learning to Generate Pseudo-code from Source Code using Statistical Machine T...
Learning to Generate Pseudo-code from Source Code using Statistical Machine T...Learning to Generate Pseudo-code from Source Code using Statistical Machine T...
Learning to Generate Pseudo-code from Source Code using Statistical Machine T...Yusuke Oda
 
Mapreduce Algorithms
Mapreduce AlgorithmsMapreduce Algorithms
Mapreduce AlgorithmsAmund Tveit
 
Compiler_Project_Srikanth_Vanama
Compiler_Project_Srikanth_VanamaCompiler_Project_Srikanth_Vanama
Compiler_Project_Srikanth_VanamaSrikanth Vanama
 
Matlab and Python: Basic Operations
Matlab and Python: Basic OperationsMatlab and Python: Basic Operations
Matlab and Python: Basic OperationsWai Nwe Tun
 
New c sharp3_features_(linq)_part_iv
New c sharp3_features_(linq)_part_ivNew c sharp3_features_(linq)_part_iv
New c sharp3_features_(linq)_part_ivNico Ludwig
 
Introduction to Data Science and Analytics
Introduction to Data Science and AnalyticsIntroduction to Data Science and Analytics
Introduction to Data Science and AnalyticsSrinath Perera
 

Similar to Introducing Parameter Sensitivity to Dynamic Code-Clone Analysis Methods (20)

Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...
Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...
Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...
 
Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...
Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...
Study on Different Code-Clone Detection Techniques & Approaches to MitigateCo...
 
The Use of Static Code Analysis When Teaching or Developing Open-Source Software
The Use of Static Code Analysis When Teaching or Developing Open-Source SoftwareThe Use of Static Code Analysis When Teaching or Developing Open-Source Software
The Use of Static Code Analysis When Teaching or Developing Open-Source Software
 
Ase02 dmp.ppt
Ase02 dmp.pptAse02 dmp.ppt
Ase02 dmp.ppt
 
(6) c sharp introduction_advanced_features_part_i
(6) c sharp introduction_advanced_features_part_i(6) c sharp introduction_advanced_features_part_i
(6) c sharp introduction_advanced_features_part_i
 
Vol 16 No 2 - July-December 2016
Vol 16 No 2 - July-December 2016Vol 16 No 2 - July-December 2016
Vol 16 No 2 - July-December 2016
 
Assignment1 B 0
Assignment1 B 0Assignment1 B 0
Assignment1 B 0
 
Memory models in c#
Memory models in c#Memory models in c#
Memory models in c#
 
A Project Based Lab Report On AMUZING JOKE
A Project Based Lab Report On AMUZING JOKEA Project Based Lab Report On AMUZING JOKE
A Project Based Lab Report On AMUZING JOKE
 
CORRELATING FEATURES AND CODE BY DYNAMIC AND SEMANTIC ANALYSIS
CORRELATING FEATURES AND CODE BY DYNAMIC AND SEMANTIC ANALYSISCORRELATING FEATURES AND CODE BY DYNAMIC AND SEMANTIC ANALYSIS
CORRELATING FEATURES AND CODE BY DYNAMIC AND SEMANTIC ANALYSIS
 
compressed.tracemonkey-pldi-09.pdf
compressed.tracemonkey-pldi-09.pdfcompressed.tracemonkey-pldi-09.pdf
compressed.tracemonkey-pldi-09.pdf
 
compressed.tracemonkey-pldi-09.pdf
compressed.tracemonkey-pldi-09.pdfcompressed.tracemonkey-pldi-09.pdf
compressed.tracemonkey-pldi-09.pdf
 
Introduction to Parallelization ans performance optimization
Introduction to Parallelization ans performance optimizationIntroduction to Parallelization ans performance optimization
Introduction to Parallelization ans performance optimization
 
Introduction to Parallelization ans performance optimization
Introduction to Parallelization ans performance optimizationIntroduction to Parallelization ans performance optimization
Introduction to Parallelization ans performance optimization
 
Learning to Generate Pseudo-code from Source Code using Statistical Machine T...
Learning to Generate Pseudo-code from Source Code using Statistical Machine T...Learning to Generate Pseudo-code from Source Code using Statistical Machine T...
Learning to Generate Pseudo-code from Source Code using Statistical Machine T...
 
Mapreduce Algorithms
Mapreduce AlgorithmsMapreduce Algorithms
Mapreduce Algorithms
 
Compiler_Project_Srikanth_Vanama
Compiler_Project_Srikanth_VanamaCompiler_Project_Srikanth_Vanama
Compiler_Project_Srikanth_Vanama
 
Matlab and Python: Basic Operations
Matlab and Python: Basic OperationsMatlab and Python: Basic Operations
Matlab and Python: Basic Operations
 
New c sharp3_features_(linq)_part_iv
New c sharp3_features_(linq)_part_ivNew c sharp3_features_(linq)_part_iv
New c sharp3_features_(linq)_part_iv
 
Introduction to Data Science and Analytics
Introduction to Data Science and AnalyticsIntroduction to Data Science and Analytics
Introduction to Data Science and Analytics
 

More from Kamiya Toshihiro

ソースコード推薦あるいは修正の情報源としての質問掲示板とソースコードレポジトリの比較
ソースコード推薦あるいは修正の情報源としての質問掲示板とソースコードレポジトリの比較ソースコード推薦あるいは修正の情報源としての質問掲示板とソースコードレポジトリの比較
ソースコード推薦あるいは修正の情報源としての質問掲示板とソースコードレポジトリの比較Kamiya Toshihiro
 
Code Difference Visualization by a Call Tree
Code Difference Visualization by a Call TreeCode Difference Visualization by a Call Tree
Code Difference Visualization by a Call TreeKamiya Toshihiro
 
実行トレース間のデータの差異に基づくデータフロー解析手法の提案
実行トレース間のデータの差異に基づくデータフロー解析手法の提案実行トレース間のデータの差異に基づくデータフロー解析手法の提案
実行トレース間のデータの差異に基づくデータフロー解析手法の提案Kamiya Toshihiro
 
コードクローン研究 ふりかえり ~ストロング・スタイルで行こう~
コードクローン研究 ふりかえり ~ストロング・スタイルで行こう~コードクローン研究 ふりかえり ~ストロング・スタイルで行こう~
コードクローン研究 ふりかえり ~ストロング・スタイルで行こう~Kamiya Toshihiro
 
逆戻りデバッグ補助のための嵌入的スパイの試作
逆戻りデバッグ補助のための嵌入的スパイの試作逆戻りデバッグ補助のための嵌入的スパイの試作
逆戻りデバッグ補助のための嵌入的スパイの試作Kamiya Toshihiro
 
任意粒度機能モデルコードクローン検出手法のリファクタリング理解への適用の試み
任意粒度機能モデルコードクローン検出手法のリファクタリング理解への適用の試み任意粒度機能モデルコードクローン検出手法のリファクタリング理解への適用の試み
任意粒度機能モデルコードクローン検出手法のリファクタリング理解への適用の試みKamiya Toshihiro
 
任意粒度機能モデルに基づく動的型付けプログラミング言語向けソースコード検索手法の提案
任意粒度機能モデルに基づく動的型付けプログラミング言語向けソースコード検索手法の提案任意粒度機能モデルに基づく動的型付けプログラミング言語向けソースコード検索手法の提案
任意粒度機能モデルに基づく動的型付けプログラミング言語向けソースコード検索手法の提案Kamiya Toshihiro
 
Web アプリケーションの UI 機能テストの ための HTML 構造パターンの抽出手法
Web アプリケーションの UI 機能テストの ための HTML 構造パターンの抽出手法Web アプリケーションの UI 機能テストの ための HTML 構造パターンの抽出手法
Web アプリケーションの UI 機能テストの ための HTML 構造パターンの抽出手法Kamiya Toshihiro
 
WebアプリケーションのUI機能テストのためのHTML構造パターンの提案
WebアプリケーションのUI機能テストのためのHTML構造パターンの提案WebアプリケーションのUI機能テストのためのHTML構造パターンの提案
WebアプリケーションのUI機能テストのためのHTML構造パターンの提案Kamiya Toshihiro
 
An Algorithm for Keyword Search on an Execution Path
An Algorithm for Keyword Search on an Execution PathAn Algorithm for Keyword Search on an Execution Path
An Algorithm for Keyword Search on an Execution PathKamiya Toshihiro
 
And/Or/Callグラフの提案とソースコード検索への応用
And/Or/Callグラフの提案とソースコード検索への応用And/Or/Callグラフの提案とソースコード検索への応用
And/Or/Callグラフの提案とソースコード検索への応用Kamiya Toshihiro
 
PBLへのアジャイル開発手法導入の試み
PBLへのアジャイル開発手法導入の試みPBLへのアジャイル開発手法導入の試み
PBLへのアジャイル開発手法導入の試みKamiya Toshihiro
 
任意粒度機能モデルに基づくコードクローン検出手法の大規模プログラムの適用に向けた改善
任意粒度機能モデルに基づくコードクローン検出手法の大規模プログラムの適用に向けた改善任意粒度機能モデルに基づくコードクローン検出手法の大規模プログラムの適用に向けた改善
任意粒度機能モデルに基づくコードクローン検出手法の大規模プログラムの適用に向けた改善Kamiya Toshihiro
 
任意粒度機能モデルに基づくバイトコードからのコードクローン検出手法
任意粒度機能モデルに基づくバイトコードからのコードクローン検出手法任意粒度機能モデルに基づくバイトコードからのコードクローン検出手法
任意粒度機能モデルに基づくバイトコードからのコードクローン検出手法Kamiya Toshihiro
 

More from Kamiya Toshihiro (14)

ソースコード推薦あるいは修正の情報源としての質問掲示板とソースコードレポジトリの比較
ソースコード推薦あるいは修正の情報源としての質問掲示板とソースコードレポジトリの比較ソースコード推薦あるいは修正の情報源としての質問掲示板とソースコードレポジトリの比較
ソースコード推薦あるいは修正の情報源としての質問掲示板とソースコードレポジトリの比較
 
Code Difference Visualization by a Call Tree
Code Difference Visualization by a Call TreeCode Difference Visualization by a Call Tree
Code Difference Visualization by a Call Tree
 
実行トレース間のデータの差異に基づくデータフロー解析手法の提案
実行トレース間のデータの差異に基づくデータフロー解析手法の提案実行トレース間のデータの差異に基づくデータフロー解析手法の提案
実行トレース間のデータの差異に基づくデータフロー解析手法の提案
 
コードクローン研究 ふりかえり ~ストロング・スタイルで行こう~
コードクローン研究 ふりかえり ~ストロング・スタイルで行こう~コードクローン研究 ふりかえり ~ストロング・スタイルで行こう~
コードクローン研究 ふりかえり ~ストロング・スタイルで行こう~
 
逆戻りデバッグ補助のための嵌入的スパイの試作
逆戻りデバッグ補助のための嵌入的スパイの試作逆戻りデバッグ補助のための嵌入的スパイの試作
逆戻りデバッグ補助のための嵌入的スパイの試作
 
任意粒度機能モデルコードクローン検出手法のリファクタリング理解への適用の試み
任意粒度機能モデルコードクローン検出手法のリファクタリング理解への適用の試み任意粒度機能モデルコードクローン検出手法のリファクタリング理解への適用の試み
任意粒度機能モデルコードクローン検出手法のリファクタリング理解への適用の試み
 
任意粒度機能モデルに基づく動的型付けプログラミング言語向けソースコード検索手法の提案
任意粒度機能モデルに基づく動的型付けプログラミング言語向けソースコード検索手法の提案任意粒度機能モデルに基づく動的型付けプログラミング言語向けソースコード検索手法の提案
任意粒度機能モデルに基づく動的型付けプログラミング言語向けソースコード検索手法の提案
 
Web アプリケーションの UI 機能テストの ための HTML 構造パターンの抽出手法
Web アプリケーションの UI 機能テストの ための HTML 構造パターンの抽出手法Web アプリケーションの UI 機能テストの ための HTML 構造パターンの抽出手法
Web アプリケーションの UI 機能テストの ための HTML 構造パターンの抽出手法
 
WebアプリケーションのUI機能テストのためのHTML構造パターンの提案
WebアプリケーションのUI機能テストのためのHTML構造パターンの提案WebアプリケーションのUI機能テストのためのHTML構造パターンの提案
WebアプリケーションのUI機能テストのためのHTML構造パターンの提案
 
An Algorithm for Keyword Search on an Execution Path
An Algorithm for Keyword Search on an Execution PathAn Algorithm for Keyword Search on an Execution Path
An Algorithm for Keyword Search on an Execution Path
 
And/Or/Callグラフの提案とソースコード検索への応用
And/Or/Callグラフの提案とソースコード検索への応用And/Or/Callグラフの提案とソースコード検索への応用
And/Or/Callグラフの提案とソースコード検索への応用
 
PBLへのアジャイル開発手法導入の試み
PBLへのアジャイル開発手法導入の試みPBLへのアジャイル開発手法導入の試み
PBLへのアジャイル開発手法導入の試み
 
任意粒度機能モデルに基づくコードクローン検出手法の大規模プログラムの適用に向けた改善
任意粒度機能モデルに基づくコードクローン検出手法の大規模プログラムの適用に向けた改善任意粒度機能モデルに基づくコードクローン検出手法の大規模プログラムの適用に向けた改善
任意粒度機能モデルに基づくコードクローン検出手法の大規模プログラムの適用に向けた改善
 
任意粒度機能モデルに基づくバイトコードからのコードクローン検出手法
任意粒度機能モデルに基づくバイトコードからのコードクローン検出手法任意粒度機能モデルに基づくバイトコードからのコードクローン検出手法
任意粒度機能モデルに基づくバイトコードからのコードクローン検出手法
 

Recently uploaded

11. Properties of Liquid Fuels in Energy Engineering.pdf
11. Properties of Liquid Fuels in Energy Engineering.pdf11. Properties of Liquid Fuels in Energy Engineering.pdf
11. Properties of Liquid Fuels in Energy Engineering.pdfHafizMudaserAhmad
 
Research Methodology for Engineering pdf
Research Methodology for Engineering pdfResearch Methodology for Engineering pdf
Research Methodology for Engineering pdfCaalaaAbdulkerim
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catcherssdickerson1
 
Earthing details of Electrical Substation
Earthing details of Electrical SubstationEarthing details of Electrical Substation
Earthing details of Electrical Substationstephanwindworld
 
Gravity concentration_MI20612MI_________
Gravity concentration_MI20612MI_________Gravity concentration_MI20612MI_________
Gravity concentration_MI20612MI_________Romil Mishra
 
Levelling - Rise and fall - Height of instrument method
Levelling - Rise and fall - Height of instrument methodLevelling - Rise and fall - Height of instrument method
Levelling - Rise and fall - Height of instrument methodManicka Mamallan Andavar
 
OOP concepts -in-Python programming language
OOP concepts -in-Python programming languageOOP concepts -in-Python programming language
OOP concepts -in-Python programming languageSmritiSharma901052
 
Virtual memory management in Operating System
Virtual memory management in Operating SystemVirtual memory management in Operating System
Virtual memory management in Operating SystemRashmi Bhat
 
Katarzyna Lipka-Sidor - BIM School Course
Katarzyna Lipka-Sidor - BIM School CourseKatarzyna Lipka-Sidor - BIM School Course
Katarzyna Lipka-Sidor - BIM School Coursebim.edu.pl
 
FUNCTIONAL AND NON FUNCTIONAL REQUIREMENT
FUNCTIONAL AND NON FUNCTIONAL REQUIREMENTFUNCTIONAL AND NON FUNCTIONAL REQUIREMENT
FUNCTIONAL AND NON FUNCTIONAL REQUIREMENTSneha Padhiar
 
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdfDEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdfAkritiPradhan2
 
Cost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based questionCost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based questionSneha Padhiar
 
Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...
Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...
Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...Erbil Polytechnic University
 
"Exploring the Essential Functions and Design Considerations of Spillways in ...
"Exploring the Essential Functions and Design Considerations of Spillways in ..."Exploring the Essential Functions and Design Considerations of Spillways in ...
"Exploring the Essential Functions and Design Considerations of Spillways in ...Erbil Polytechnic University
 
CS 3251 Programming in c all unit notes pdf
CS 3251 Programming in c all unit notes pdfCS 3251 Programming in c all unit notes pdf
CS 3251 Programming in c all unit notes pdfBalamuruganV28
 
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...Sumanth A
 
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.pptROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.pptJohnWilliam111370
 
Prach: A Feature-Rich Platform Empowering the Autism Community
Prach: A Feature-Rich Platform Empowering the Autism CommunityPrach: A Feature-Rich Platform Empowering the Autism Community
Prach: A Feature-Rich Platform Empowering the Autism Communityprachaibot
 
Python Programming for basic beginners.pptx
Python Programming for basic beginners.pptxPython Programming for basic beginners.pptx
Python Programming for basic beginners.pptxmohitesoham12
 
signals in triangulation .. ...Surveying
signals in triangulation .. ...Surveyingsignals in triangulation .. ...Surveying
signals in triangulation .. ...Surveyingsapna80328
 

Recently uploaded (20)

11. Properties of Liquid Fuels in Energy Engineering.pdf
11. Properties of Liquid Fuels in Energy Engineering.pdf11. Properties of Liquid Fuels in Energy Engineering.pdf
11. Properties of Liquid Fuels in Energy Engineering.pdf
 
Research Methodology for Engineering pdf
Research Methodology for Engineering pdfResearch Methodology for Engineering pdf
Research Methodology for Engineering pdf
 
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor CatchersTechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
TechTAC® CFD Report Summary: A Comparison of Two Types of Tubing Anchor Catchers
 
Earthing details of Electrical Substation
Earthing details of Electrical SubstationEarthing details of Electrical Substation
Earthing details of Electrical Substation
 
Gravity concentration_MI20612MI_________
Gravity concentration_MI20612MI_________Gravity concentration_MI20612MI_________
Gravity concentration_MI20612MI_________
 
Levelling - Rise and fall - Height of instrument method
Levelling - Rise and fall - Height of instrument methodLevelling - Rise and fall - Height of instrument method
Levelling - Rise and fall - Height of instrument method
 
OOP concepts -in-Python programming language
OOP concepts -in-Python programming languageOOP concepts -in-Python programming language
OOP concepts -in-Python programming language
 
Virtual memory management in Operating System
Virtual memory management in Operating SystemVirtual memory management in Operating System
Virtual memory management in Operating System
 
Katarzyna Lipka-Sidor - BIM School Course
Katarzyna Lipka-Sidor - BIM School CourseKatarzyna Lipka-Sidor - BIM School Course
Katarzyna Lipka-Sidor - BIM School Course
 
FUNCTIONAL AND NON FUNCTIONAL REQUIREMENT
FUNCTIONAL AND NON FUNCTIONAL REQUIREMENTFUNCTIONAL AND NON FUNCTIONAL REQUIREMENT
FUNCTIONAL AND NON FUNCTIONAL REQUIREMENT
 
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdfDEVICE DRIVERS AND INTERRUPTS  SERVICE MECHANISM.pdf
DEVICE DRIVERS AND INTERRUPTS SERVICE MECHANISM.pdf
 
Cost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based questionCost estimation approach: FP to COCOMO scenario based question
Cost estimation approach: FP to COCOMO scenario based question
 
Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...
Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...
Comparative study of High-rise Building Using ETABS,SAP200 and SAFE., SAFE an...
 
"Exploring the Essential Functions and Design Considerations of Spillways in ...
"Exploring the Essential Functions and Design Considerations of Spillways in ..."Exploring the Essential Functions and Design Considerations of Spillways in ...
"Exploring the Essential Functions and Design Considerations of Spillways in ...
 
CS 3251 Programming in c all unit notes pdf
CS 3251 Programming in c all unit notes pdfCS 3251 Programming in c all unit notes pdf
CS 3251 Programming in c all unit notes pdf
 
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
Robotics-Asimov's Laws, Mechanical Subsystems, Robot Kinematics, Robot Dynami...
 
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.pptROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
ROBOETHICS-CCS345 ETHICS AND ARTIFICIAL INTELLIGENCE.ppt
 
Prach: A Feature-Rich Platform Empowering the Autism Community
Prach: A Feature-Rich Platform Empowering the Autism CommunityPrach: A Feature-Rich Platform Empowering the Autism Community
Prach: A Feature-Rich Platform Empowering the Autism Community
 
Python Programming for basic beginners.pptx
Python Programming for basic beginners.pptxPython Programming for basic beginners.pptx
Python Programming for basic beginners.pptx
 
signals in triangulation .. ...Surveying
signals in triangulation .. ...Surveyingsignals in triangulation .. ...Surveying
signals in triangulation .. ...Surveying
 

Introducing Parameter Sensitivity to Dynamic Code-Clone Analysis Methods

  • 1. Introducing Parameter Sensitivity to Dynamic Code-Clone Analysis Methods Toshihiro Kamiya Interdisciplinary Graduate School of Sci. & Eng., Shimane Univ. kamiya@cis.shimane-u.ac.jp 10th Int'l Workshop on Software Clones, Osaka
  • 2. March 15, 2016 10th Int'l Workshop on Software Clones, Osaka 2 Outline ● What is a dynamic code-clone analysis? – Detection – Visualization – Samples ● Parameter sensitivity – Possible alternative techniques [Position Paper] Toshihiro Kamiya, Introducing Parameter Sensitivity to Dynamic Code-Clone Analysis Methods, Proc. 10th International Workshop on Software Clones (IWSC 2016), pp. 19-20, 2016.
  • 3. Dynamic code-clone analysis ● Definition: – Use dynamic information: ● To detect code clones ● To visualize such code clones ● Aims/applications: – Detect code clones between a code fragment and its restructured (refactored) one ● Observe evolution of code clones in clone management – Find code clones w/ similarity in deep semantics (or behavior)
  • 4. Detection method ● Detection Steps 1. Collect execution trace(s) by running target program(s) 2. Find sub-sequences of similar method invocations 3. Map such sub-sequences into code fragments The details are described in Toshihiro Kamiya, "An Execution-Semantic and Content-and-Context-Based Code-Clone Detection and Analysis," IWSC 2015, pp. 1-7 (Mar. 6, 2015).
  • 5. Detection method (implementation) An implementation of step “2. Find sub-sequences of similar method invocations” ● Just one possible implementation; other data structures/algorithms could be used 2-1. Generate a call tree from the execution trace. 2-2. For each node of the call tree, generate an SB data structure. – String balloon incl. ● A target node ● Context (Location): path from root to the target node ● Contents: Set of nodes called by the target (both direct and indirect) 2-3. Find sets of SBs having similar contents. ● With a frequent item-set mining algorithm (hyper cubic decomposition [Uno03]) [Uno03] T. Uno, et al., An Efficient Algorithm for Enumerating Closed Patterns in Transaction Databases, Discovery Science, LNCS 3245, pp. 16-31, 2003. Revised from IWSC15's
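Steps 2-1 to 2-3 can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the trace format (a list of `(depth, method_name)` pairs), the dictionary layout, and the names `build_balloons` and `similar_pairs` are all assumptions, and the brute-force pairwise comparison stands in for the frequent item-set mining algorithm of [Uno03].

```python
from itertools import combinations

def build_balloons(trace):
    """Build one SB-like record per call-tree node.

    trace: list of (depth, method_name) in call order; depth 0 is the root.
    (Hypothetical trace format, chosen for this sketch.)
    """
    balloons = []
    stack = []  # records of the calls currently on the call stack
    for index, (depth, name) in enumerate(trace):
        del stack[depth:]  # calls deeper than this one have returned
        node = {
            "id": index,
            "target": name,
            "context": tuple(b["target"] for b in stack),  # path from root
            "ancestor_ids": {b["id"] for b in stack},
            "contents": set(),  # filled in as descendants are seen
        }
        for ancestor in stack:  # direct and indirect callers
            ancestor["contents"].add(name)
        stack.append(node)
        balloons.append(node)
    return balloons

def similar_pairs(balloons, min_shared=2):
    """Pairs of unrelated nodes whose contents share >= min_shared methods."""
    pairs = []
    for a, b in combinations(balloons, 2):
        if a["id"] in b["ancestor_ids"]:
            continue  # a caller trivially contains its callee's contents
        shared = a["contents"] & b["contents"]
        if len(shared) >= min_shared:
            pairs.append((a["target"], b["target"], shared))
    return pairs

# Illustrative trace (invented for this sketch).
trace = [
    (0, "main"),
    (1, "print_ext_for"),
    (2, "os.path.splitext"),
    (2, "print"),
    (1, "print_ext_map"),
    (2, "map"),
    (3, "os.path.splitext"),
    (2, "print"),
]
pairs = similar_pairs(build_balloons(trace))
```

On this toy trace the only reported pair is `print_ext_for` and `print_ext_map`, which share `os.path.splitext` and `print`. The quadratic pairwise loop is only practical for small traces, which is one reason a mining algorithm would be used instead.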
  • 6. Visualization method ● Code fragments (of a clone class) → “root” nodes of sub-graphs in call graph ● Similarity → Methods called commonly in the sub-graphs ● Differences → Methods called solely in a sub-graph [Figure: call graph with nodes main(), print_extensions_w_for_stmt(), print_extensions_w_map_func(), get_extensions(), print, map(), lambda() at line 8, os.path.splitext()]
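The split between "called commonly" and "called solely" amounts to set operations over the contents of each sub-graph. A small sketch follows, assuming the contents are already available as sets; the function name `common_and_sole` is hypothetical, and which methods each fragment actually calls is a guess based on the node names in the figure.

```python
def common_and_sole(contents):
    """Split called methods into shared ones and per-fragment ones.

    contents: dict mapping each clone fragment (sub-graph root) to the
    set of methods it calls, directly or indirectly.
    """
    common = set.intersection(*map(set, contents.values()))
    sole = {}
    for root, called in contents.items():
        others = set().union(*(c for r, c in contents.items() if r != root))
        sole[root] = set(called) - others
    return common, sole

# Assumed call sets; the slide does not spell out the edges.
contents = {
    "print_extensions_w_for_stmt": {"get_extensions", "os.path.splitext", "print"},
    "print_extensions_w_map_func": {
        "get_extensions", "os.path.splitext", "print", "map", "lambda at line 8",
    },
}
common, sole = common_and_sole(contents)
```

Here `common` would be drawn as the shared part of the graph, and each entry of `sole` as the part that belongs to a single fragment.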
  • 7. A sample code clone – code fragments Applied to two CLI HTTP-client tools – prog 1: https://github.com/chrislongo/HttpShell – prog 2: https://pypi.python.org/pypi/httpie Each takes a URL as input and outputs HTML text.
  • 8. A sample code clone – code fragments
  • 9. A sample code clone – code fragments Calling the same function: pygments.highlight()
  • 10. A sample code clone – code fragments Similar? - Yes. But why?
  • 11. A sample code clone – call graph Because these method calls of guess_lexer() and get_lexer() ... have common contents. [Figure: call graph with nodes 1../AnsiLogger/print_data, 2../ColorFormatter/format_body, 2../ColorFormatter/get_lexer, 1.pygments.lexers//guess_lexer, 1.pygments.formatters.terminal/TerminalFormatter/__init__, pygments//highlight, pygments//format, pygments//lex, pygments.lexer//streamer, pygments.lexer//__call__, pygments.lexers//_load_lexers, pygments.util//get_bool_opt, StringIO/StringIO/write, re//_compile, ...]
  • 12. ● But this example is the best one from an experiment. ● Not always so lucky in general practice ...
  • 16. A bad example from detection result ● Code fragments calling utility functions are sometimes detected as a code clone ☹ – Code fragments of a clone class ● cli.py (an entry point) from prog 2 ● _get_proxy_info() from prog 1 ● should_bypass_proxy() from prog 2 – All call regular-expression and associative-array functions, i.e. utility functions – Results in a false positive: cli.py and others (True positive: _get_proxy_info() and should_bypass_proxy())
  • 17. An idea: Parameter sensitivity ● Execution trace also includes argument values of each method invocation ● Add argument value(s) to node labels – re//_compile.'[^A-Za-z0-9.]+' or – re//_compile.'[^-]+' in place of re//_compile to distinguish these calls of utility functions. ● Need to introduce value semantics (may be challenging) – '[0-9]' == '\d' (when interpreted as regular exp.) – 0xff == 255
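Parameter-sensitive labels with a very small amount of value semantics could look like the sketch below. It assumes that argument values are recorded in the trace as Python literal text; the helper names `normalize` and `label` are hypothetical. Only literal normalization is attempted (so that `0xff` and `255` coincide); deciding whether two regular expressions are equivalent is not handled and remains the hard part.

```python
import ast

def normalize(value):
    """Return a canonical text form of an argument where cheaply possible.

    value: the argument as recorded in the trace, assumed to be Python
    literal source text (an assumption of this sketch).
    """
    try:
        return repr(ast.literal_eval(value))  # '0xff' and '255' both give '255'
    except (ValueError, SyntaxError, TypeError):
        return value  # not a literal: keep the recorded text unchanged

def label(method, args=()):
    """Node label made of the method name plus its normalized arguments."""
    return ".".join([method, *map(normalize, args)])

plain = label("re//_compile")
first = label("re//_compile", ["'[^A-Za-z0-9.]+'"])
second = label("re//_compile", ["'[^-]+'"])
```

With such labels the two `re//_compile` calls become distinct items, so two fragments no longer look similar merely because both compile some regular expression.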
  • 18. Alternative techniques ● Threshold about ratio of shared nodes – Yet another parameter on clone detection ☹ ● Depends on stack depth? ● Pre-defined, manual classification of “Utility” functions ☹ – Problematic when target code includes new (unknown) libraries ● Considering order of method invocations – Such as Smith-Waterman algorithm (applied to static clone detection in [Murakami13]) – Yet another parameter of tool ☹ ● Depends on length of code fragments? – [Murakami13] H. Murakami, K. Hotta, Y. Higo, H. Igaki, Gapped Code Clone Detection with Lightweight Source Code Analysis, ICPC 2013, pp. 93-102, 2013.
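Two of these alternatives are easy to sketch. The slide does not define the "ratio of shared nodes", so the Jaccard ratio below is an assumption, and the Smith-Waterman scoring parameters (`match`, `mismatch`, `gap`) are arbitrary illustrative values, which is exactly the "yet another parameter" concern raised above.

```python
def shared_ratio(a, b):
    """Jaccard ratio of two content sets (one possible definition)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def smith_waterman(seq_a, seq_b, match=2, mismatch=-1, gap=-1):
    """Best local-alignment score between two method-invocation sequences."""
    best = 0
    prev = [0] * (len(seq_b) + 1)  # previous row of the scoring matrix
    for x in seq_a:
        curr = [0]
        for j, y in enumerate(seq_b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best

# Invented invocation sequences for illustration.
calls_a = ["re//_compile", "dict//get", "pygments//highlight"]
calls_b = ["os//getenv", "re//_compile", "dict//get", "pygments//highlight"]
score = smith_waterman(calls_a, calls_b)
ratio = shared_ratio(set(calls_a), set(calls_b))
```

Unlike the set-based ratio, the alignment score rewards fragments that call the same methods in the same order, so two fragments that merely touch the same utility functions in different orders score lower.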
  • 19. Summary ● A dynamic code-clone detection – Based on frequent item-set mining of method invocations ● Utility functions (methods) cause false positives. ● Possible solutions/open questions – parameter sensitivity, – threshold about ratio of shared nodes, – manual classification of “Utility” functions, – order of method invocations
  • 21. Another bad example