SlideShare a Scribd company logo
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
A Fast Algorithm to Locate
Concepts in Execution Traces
Soumaya Medini, Philippe Galinier, Massimiliano Di
Penta,
Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol
SOCCER Lab. & Ptidej Team, ’Ecole Polytechnique de Montr´eal,
Qu´ebec, Canada
University of Sannio, Italy
September 1, 2011
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Outline
Introduction
Motivation and Problem Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
2 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Introduction
Concept location is an important task during program
comprehension.
A typical scenario in which concept location takes part
a failure has been observed in a software system under
certain execution conditions
such conditions are hard to reproduce
the execution trace was saved during such failure.
Result: An execution trace containing a failure is saved but
we can not re-execute the same scenario.
3 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Motivation and Problem Statement
Concept location process is recently defined as trace
segmentation problem.
The textual content of methods in trace is used to split
it into segments that participate in the implementation
of concepts.
Few works attempt to automatically identify concepts in a
single execution trace.
4 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Motivation and Problem Statement
Asadi et al. [Asadi et al., 2010] presented the contribution
to identify concepts in execution trace by finding cohesive
and decoupled fragments of the trace.
Limitations
Not scalable (thousands of methods).
Based on methaheuristic search: each run may produce
a different concept assignment.
Does not assign a label to the identified segments.
Purpose: To propose a novel approach to overcome these
limitations.
5 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Background
Steps
Step 1 : System instrumentation and trace collection.
Step 2 : Pruning and compressing traces.
Step 3 : Textual analysis of Method source code.
Step 4 : Search-based concept location.
6 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Background
Step 1 : System instrumentation and trace collection
System instrumented using MODEC : tool to extract
and model sequence diagrams.
Trace Collection : Label manually a parts of the code
during execution of the instrumented system.
Trace
Trace represented as an ordered list of invoked methods with
labels.
7 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Background
Step 2 : Pruning and compressing traces
Pruning: Remove too frequent methods having
invocation greater than Q3 + 2 x IQR, with Q3 is 75%.
Compression: Remove repetitions of method invocations
using Run Length Encoding (RLE) algorithm.
8 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Background
Step 3 : Textual analysis of Method source code
Textual analysis of source code:
Extract terms from source code : identifiers and
comments.
Split terms by Camel-Case.
Remove programming language keywords and english
stop words.
Perform stemming.
Index using the tf-idf indexing mechanisms.
Apply Latent Semantic Indexing (LSI) to reduce the
term-document matrix.
9 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Background
Step 4 : Search-based concept location
Approach built upon dynamic programming algorithm to:
Compute the exact split of the execution trace into
segments.
Improve scalability.
10 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Our approach
Dynamic programming : based on overlapping sub-problems
by dividing a problem into sub-problems recursively solved.
The solution of the problem is obtained by combining the
solutions of the sub-problems.
We compute the quality of the segmentation of a trace split
into K segments using the fitness function.
COHl =
end(l)−1
i=begin(l)
end(l)
j=i+1 σ(mi ,mj )
(end(l)−begin(l)+1)·(end(l)−begin(l))/2
(1)
COUl =
end(l)
i=begin(l)
l
j=1,j<begin(l) or j>end(l)σ(mi ,mj )
(N−(end(l)−begin(l)+1))·(end(l)−begin(l)+1)
(2)
fitness =
1
K
·
K
i=1
COHi
COUi + 1
(3)
11 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Our approach
Figure: Trace splitting
Possibilies
new segment is added or
the method is attached to the last solution segment.
Coh(i)= coh(i-1)+ delta
Cou(i)= cou(i-1)- delta
delta = sim(m1,m3)+sim(m1,m4)
12 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Empirical study
RQ1 : How do the performances of the GA and DP
approaches compare in terms of fitness values,
convergence times, and numbers of segments?
RQ2 : How do the GA and DP approaches perform in
terms of overlaps be- tween the automatic segmentation
and the manually-built oracle, i.e., recall?
RQ3 : How do the precision values of the GA and DP
approaches compare when splitting execution traces?
13 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Empirical study
Table: Data of the empirical study.
Systems Scenarios
OriginalSize
CleanedSizes
CompressedSizes
ArgoUML v0.18.1
(1)Start, Create note, Stop 34,746 821 588
(2) Start, Create class,
Create note, Stop
64,947 1,066 764
JHotDraw v1.5
(1) Start, Draw rectangle,
Stop
6,668 447 240
(2) Start, Add text, Draw
rectangle, Stop
13,841 753 361
(3) Start, Draw rectangle,
Cut rectangle, Stop
11,215 1,206 414
(4) Start, Spawn window,
Draw circle, Stop
16,366 670 433
14 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Empirical study
RQ1 Results
Compare the GA approach proposed by Asadi et al.
[Asadi et al., 2010] with our approach.
Table: Number of segments, values of fitness
function/segmentation score, and times required by the GA and
DP approaches.
System Scenario
# of Segments Fitness Time (s)
GA DP GA DP GA DP
ArgoUML
(1) 24 13 0.54 0.58 7,080 2.13
(2) 73 19 0.52 0.60 10,800 4.33
JHotDraw
(1) 17 21 0.39 0.67 2,040 0.13
(2) 21 21 0.38 0.69 1,260 0.64
(3) 56 20 0.46 0.72 1,200 0.86
(4) 63 26 0.34 0.69 240 1.00
The DP approach performs significantly better than the GA
approach.
15 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Empirical study
RQ1 and RQ2 Results
Table: Jaccard overlaps and precision values between segments
identified by the GA and DP approaches.
System Scenario Feature
Jaccard Precision
GA DP GA DP
ArgoUML
(1) Create Note 0.33 0.87 1.00 0.99
(2) Create Class 0.26 0.53 1.00 1.00
(2) Create Note 0.34 0.56 1.00 1.00
JHotDraw
(1) Draw Rectangle 0.90 0.75 0.90 1.00
(2) Add Text 0.31 0.33 0.36 0.39
(2) Draw Rectangle 0.62 0.52 0.62 1.00
(3) Draw Rectangle 0.74 0.24 0.79 0.24
(3) Cut Rectangle 0.22 0.31 1.00 1.00
(4) Draw Circle 0.82 0.82 0.82 1.00
(4) Spawn window 0.42 0.44 1.00 1.00
16 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Conclusion
We reformulate the trace segmentation problem as a
dynamic programming (DP) problem
we showed that we can benefit from the overlapping
sub-problems
we obtained a dramatic gain in performance reusing
computed scores of intervals and segmentation scores
The DP approach reuses pre-computed cohesion and
coupling values among subsequent segments of an
execution trace, which is not possible using GA.
Results showed that the DP approach significantly
out-performed the GA approach in terms of:
the times required to produce the segmentations.
the scalability.
17 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Future works
Validate the scalability of the DP trace segmentation
approach using more larger traces.
Use more traces obtained from different systems, to
verify the generality of our findings.
Complement the approach with segment labeling to
make the produced segments more suitable for
program-comprehension activities.
18 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
THANKS !
19 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Asadi, F., Penta, M. D., Antoniol, G., and Gu´eh´eneuc,
Y.-G. (2010).
A heuristic-based approach to identify concepts in
execution traces.
In Proceedings of the European Conference on Software
Maintenance and Reengineering, pages 31–40. IEEE
Computer Society Press.
Gethers, M. and Poshyvanyk, D. (2010).
Using relational topic models to capture coupling among
classes in object-oriented software systems.
In Proceedings of the 2010 IEEE International
Conference on Software Maintenance, pages 1–10. IEEE
Computer Society.
Liu, Y., Poshyvanyk, D., Ferenc, R., Gyim˜A¸sthy, T., and
Chrisochoides, N. (2009).
Modeling class cohesion as mixtures of latent topics.
In International Conference on Software Maintenance,
pages 233–242.
19 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Lukins, S. K., Kraft, N. A., and Etzkorn, L. H. (2010).
Bug localization using latent dirichlet allocation.
Information and Software Technology, 52(9):972 – 990.
19 / 19

More Related Content

Similar to Ssbse11b.ppt

IRJET- Design an Approach for Prediction of Human Activity Recognition us...
IRJET-  	  Design an Approach for Prediction of Human Activity Recognition us...IRJET-  	  Design an Approach for Prediction of Human Activity Recognition us...
IRJET- Design an Approach for Prediction of Human Activity Recognition us...
IRJET Journal
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
butest
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
butest
 
Intelligent Video Surveillance System using Deep Learning
Intelligent Video Surveillance System using Deep LearningIntelligent Video Surveillance System using Deep Learning
Intelligent Video Surveillance System using Deep Learning
IRJET Journal
 
Better neuroimaging data processing: driven by evidence, open communities, an...
Better neuroimaging data processing: driven by evidence, open communities, an...Better neuroimaging data processing: driven by evidence, open communities, an...
Better neuroimaging data processing: driven by evidence, open communities, an...
Gael Varoquaux
 
Event-Handling Based Smart Video Surveillance System
Event-Handling Based Smart Video Surveillance SystemEvent-Handling Based Smart Video Surveillance System
Event-Handling Based Smart Video Surveillance System
CSCJournals
 
jpl_final_presentation_miguelsimaov2
jpl_final_presentation_miguelsimaov2jpl_final_presentation_miguelsimaov2
jpl_final_presentation_miguelsimaov2
Miguel Simão
 
Bibliometric Analysis on Computer Vision based Anomaly Detection using Deep L...
Bibliometric Analysis on Computer Vision based Anomaly Detection using Deep L...Bibliometric Analysis on Computer Vision based Anomaly Detection using Deep L...
Bibliometric Analysis on Computer Vision based Anomaly Detection using Deep L...
IRJET Journal
 
Csmr12.ppt
Csmr12.pptCsmr12.ppt
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
butest
 
From Engineer to Alchemist, There and Back Again: An Alchemist Tale
From Engineer to Alchemist, There and Back Again: An Alchemist TaleFrom Engineer to Alchemist, There and Back Again: An Alchemist Tale
From Engineer to Alchemist, There and Back Again: An Alchemist Tale
Danilo Pianini
 
5.3.2nd WS. News from the COGAIN Association & DTU G.Interaction
5.3.2nd WS. News from the COGAIN Association & DTU G.Interaction5.3.2nd WS. News from the COGAIN Association & DTU G.Interaction
5.3.2nd WS. News from the COGAIN Association & DTU G.Interaction
ATIS4all, European Thematic Network, led by Technosite.
 
Dynamic Fractional Resource Scheduling Practical Issues and Future Directions...
Dynamic Fractional Resource Scheduling Practical Issues and Future Directions...Dynamic Fractional Resource Scheduling Practical Issues and Future Directions...
Dynamic Fractional Resource Scheduling Practical Issues and Future Directions...
Mark Stillwell
 
Big data expo - machine learning in the elastic stack
Big data expo - machine learning in the elastic stack Big data expo - machine learning in the elastic stack
Big data expo - machine learning in the elastic stack
BigDataExpo
 
Symposium 2019 : Gestion de projet en Intelligence Artificielle
Symposium 2019 : Gestion de projet en Intelligence ArtificielleSymposium 2019 : Gestion de projet en Intelligence Artificielle
Symposium 2019 : Gestion de projet en Intelligence Artificielle
PMI-Montréal
 
3.survey on wireless intelligent video surveillance system using moving objec...
3.survey on wireless intelligent video surveillance system using moving objec...3.survey on wireless intelligent video surveillance system using moving objec...
3.survey on wireless intelligent video surveillance system using moving objec...
Alexander Decker
 
11.0003www.iiste.org call for paper.survey on wireless intelligent video surv...
11.0003www.iiste.org call for paper.survey on wireless intelligent video surv...11.0003www.iiste.org call for paper.survey on wireless intelligent video surv...
11.0003www.iiste.org call for paper.survey on wireless intelligent video surv...
Alexander Decker
 
Don't Treat the Symptom, Find the Cause!.pptx
Don't Treat the Symptom, Find the Cause!.pptxDon't Treat the Symptom, Find the Cause!.pptx
Don't Treat the Symptom, Find the Cause!.pptx
Förderverein Technische Fakultät
 
Adaptive Machine Learning for Credit Card Fraud Detection
Adaptive Machine Learning for Credit Card Fraud DetectionAdaptive Machine Learning for Credit Card Fraud Detection
Adaptive Machine Learning for Credit Card Fraud Detection
Andrea Dal Pozzolo
 
Interactive Evolutionary Computation
Interactive Evolutionary ComputationInteractive Evolutionary Computation
Interactive Evolutionary Computation
ysemet
 

Similar to Ssbse11b.ppt (20)

IRJET- Design an Approach for Prediction of Human Activity Recognition us...
IRJET-  	  Design an Approach for Prediction of Human Activity Recognition us...IRJET-  	  Design an Approach for Prediction of Human Activity Recognition us...
IRJET- Design an Approach for Prediction of Human Activity Recognition us...
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
 
Intelligent Video Surveillance System using Deep Learning
Intelligent Video Surveillance System using Deep LearningIntelligent Video Surveillance System using Deep Learning
Intelligent Video Surveillance System using Deep Learning
 
Better neuroimaging data processing: driven by evidence, open communities, an...
Better neuroimaging data processing: driven by evidence, open communities, an...Better neuroimaging data processing: driven by evidence, open communities, an...
Better neuroimaging data processing: driven by evidence, open communities, an...
 
Event-Handling Based Smart Video Surveillance System
Event-Handling Based Smart Video Surveillance SystemEvent-Handling Based Smart Video Surveillance System
Event-Handling Based Smart Video Surveillance System
 
jpl_final_presentation_miguelsimaov2
jpl_final_presentation_miguelsimaov2jpl_final_presentation_miguelsimaov2
jpl_final_presentation_miguelsimaov2
 
Bibliometric Analysis on Computer Vision based Anomaly Detection using Deep L...
Bibliometric Analysis on Computer Vision based Anomaly Detection using Deep L...Bibliometric Analysis on Computer Vision based Anomaly Detection using Deep L...
Bibliometric Analysis on Computer Vision based Anomaly Detection using Deep L...
 
Csmr12.ppt
Csmr12.pptCsmr12.ppt
Csmr12.ppt
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
 
From Engineer to Alchemist, There and Back Again: An Alchemist Tale
From Engineer to Alchemist, There and Back Again: An Alchemist TaleFrom Engineer to Alchemist, There and Back Again: An Alchemist Tale
From Engineer to Alchemist, There and Back Again: An Alchemist Tale
 
5.3.2nd WS. News from the COGAIN Association & DTU G.Interaction
5.3.2nd WS. News from the COGAIN Association & DTU G.Interaction5.3.2nd WS. News from the COGAIN Association & DTU G.Interaction
5.3.2nd WS. News from the COGAIN Association & DTU G.Interaction
 
Dynamic Fractional Resource Scheduling Practical Issues and Future Directions...
Dynamic Fractional Resource Scheduling Practical Issues and Future Directions...Dynamic Fractional Resource Scheduling Practical Issues and Future Directions...
Dynamic Fractional Resource Scheduling Practical Issues and Future Directions...
 
Big data expo - machine learning in the elastic stack
Big data expo - machine learning in the elastic stack Big data expo - machine learning in the elastic stack
Big data expo - machine learning in the elastic stack
 
Symposium 2019 : Gestion de projet en Intelligence Artificielle
Symposium 2019 : Gestion de projet en Intelligence ArtificielleSymposium 2019 : Gestion de projet en Intelligence Artificielle
Symposium 2019 : Gestion de projet en Intelligence Artificielle
 
3.survey on wireless intelligent video surveillance system using moving objec...
3.survey on wireless intelligent video surveillance system using moving objec...3.survey on wireless intelligent video surveillance system using moving objec...
3.survey on wireless intelligent video surveillance system using moving objec...
 
11.0003www.iiste.org call for paper.survey on wireless intelligent video surv...
11.0003www.iiste.org call for paper.survey on wireless intelligent video surv...11.0003www.iiste.org call for paper.survey on wireless intelligent video surv...
11.0003www.iiste.org call for paper.survey on wireless intelligent video surv...
 
Don't Treat the Symptom, Find the Cause!.pptx
Don't Treat the Symptom, Find the Cause!.pptxDon't Treat the Symptom, Find the Cause!.pptx
Don't Treat the Symptom, Find the Cause!.pptx
 
Adaptive Machine Learning for Credit Card Fraud Detection
Adaptive Machine Learning for Credit Card Fraud DetectionAdaptive Machine Learning for Credit Card Fraud Detection
Adaptive Machine Learning for Credit Card Fraud Detection
 
Interactive Evolutionary Computation
Interactive Evolutionary ComputationInteractive Evolutionary Computation
Interactive Evolutionary Computation
 

More from Yann-Gaël Guéhéneuc

Some Pitfalls with Python and Their Possible Solutions v1.0
Some Pitfalls with Python and Their Possible Solutions v1.0Some Pitfalls with Python and Their Possible Solutions v1.0
Some Pitfalls with Python and Their Possible Solutions v1.0
Yann-Gaël Guéhéneuc
 
Advice for writing a NSERC Discovery grant application v0.5
Advice for writing a NSERC Discovery grant application v0.5Advice for writing a NSERC Discovery grant application v0.5
Advice for writing a NSERC Discovery grant application v0.5
Yann-Gaël Guéhéneuc
 
Ptidej Architecture, Design, and Implementation in Action v2.1
Ptidej Architecture, Design, and Implementation in Action v2.1Ptidej Architecture, Design, and Implementation in Action v2.1
Ptidej Architecture, Design, and Implementation in Action v2.1
Yann-Gaël Guéhéneuc
 
Evolution and Examples of Java Features, from Java 1.7 to Java 22
Evolution and Examples of Java Features, from Java 1.7 to Java 22Evolution and Examples of Java Features, from Java 1.7 to Java 22
Evolution and Examples of Java Features, from Java 1.7 to Java 22
Yann-Gaël Guéhéneuc
 
Consequences and Principles of Software Quality v0.3
Consequences and Principles of Software Quality v0.3Consequences and Principles of Software Quality v0.3
Consequences and Principles of Software Quality v0.3
Yann-Gaël Guéhéneuc
 
Some Pitfalls with Python and Their Possible Solutions v0.9
Some Pitfalls with Python and Their Possible Solutions v0.9Some Pitfalls with Python and Their Possible Solutions v0.9
Some Pitfalls with Python and Their Possible Solutions v0.9
Yann-Gaël Guéhéneuc
 
An Explanation of the Unicode, the Text Encoding Standard, Its Usages and Imp...
An Explanation of the Unicode, the Text Encoding Standard, Its Usages and Imp...An Explanation of the Unicode, the Text Encoding Standard, Its Usages and Imp...
An Explanation of the Unicode, the Text Encoding Standard, Its Usages and Imp...
Yann-Gaël Guéhéneuc
 
An Explanation of the Halting Problem and Its Consequences
An Explanation of the Halting Problem and Its ConsequencesAn Explanation of the Halting Problem and Its Consequences
An Explanation of the Halting Problem and Its Consequences
Yann-Gaël Guéhéneuc
 
Are CPUs VMs Like Any Others? v1.0
Are CPUs VMs Like Any Others? v1.0Are CPUs VMs Like Any Others? v1.0
Are CPUs VMs Like Any Others? v1.0
Yann-Gaël Guéhéneuc
 
Informaticien(ne)s célèbres (v1.0.2, 19/02/20)
Informaticien(ne)s célèbres (v1.0.2, 19/02/20)Informaticien(ne)s célèbres (v1.0.2, 19/02/20)
Informaticien(ne)s célèbres (v1.0.2, 19/02/20)
Yann-Gaël Guéhéneuc
 
Well-known Computer Scientists v1.0.2
Well-known Computer Scientists v1.0.2Well-known Computer Scientists v1.0.2
Well-known Computer Scientists v1.0.2
Yann-Gaël Guéhéneuc
 
On Java Generics, History, Use, Caveats v1.1
On Java Generics, History, Use, Caveats v1.1On Java Generics, History, Use, Caveats v1.1
On Java Generics, History, Use, Caveats v1.1
Yann-Gaël Guéhéneuc
 
On Reflection in OO Programming Languages v1.6
On Reflection in OO Programming Languages v1.6On Reflection in OO Programming Languages v1.6
On Reflection in OO Programming Languages v1.6
Yann-Gaël Guéhéneuc
 
ICSOC'21
ICSOC'21ICSOC'21
Vissoft21.ppt
Vissoft21.pptVissoft21.ppt
Vissoft21.ppt
Yann-Gaël Guéhéneuc
 
Service computation20.ppt
Service computation20.pptService computation20.ppt
Service computation20.ppt
Yann-Gaël Guéhéneuc
 
Serp4 iot20.ppt
Serp4 iot20.pptSerp4 iot20.ppt
Serp4 iot20.ppt
Yann-Gaël Guéhéneuc
 
Msr20.ppt
Msr20.pptMsr20.ppt
Iwesep19.ppt
Iwesep19.pptIwesep19.ppt
Icsoc20.ppt
Icsoc20.pptIcsoc20.ppt

More from Yann-Gaël Guéhéneuc (20)

Some Pitfalls with Python and Their Possible Solutions v1.0
Some Pitfalls with Python and Their Possible Solutions v1.0Some Pitfalls with Python and Their Possible Solutions v1.0
Some Pitfalls with Python and Their Possible Solutions v1.0
 
Advice for writing a NSERC Discovery grant application v0.5
Advice for writing a NSERC Discovery grant application v0.5Advice for writing a NSERC Discovery grant application v0.5
Advice for writing a NSERC Discovery grant application v0.5
 
Ptidej Architecture, Design, and Implementation in Action v2.1
Ptidej Architecture, Design, and Implementation in Action v2.1Ptidej Architecture, Design, and Implementation in Action v2.1
Ptidej Architecture, Design, and Implementation in Action v2.1
 
Evolution and Examples of Java Features, from Java 1.7 to Java 22
Evolution and Examples of Java Features, from Java 1.7 to Java 22Evolution and Examples of Java Features, from Java 1.7 to Java 22
Evolution and Examples of Java Features, from Java 1.7 to Java 22
 
Consequences and Principles of Software Quality v0.3
Consequences and Principles of Software Quality v0.3Consequences and Principles of Software Quality v0.3
Consequences and Principles of Software Quality v0.3
 
Some Pitfalls with Python and Their Possible Solutions v0.9
Some Pitfalls with Python and Their Possible Solutions v0.9Some Pitfalls with Python and Their Possible Solutions v0.9
Some Pitfalls with Python and Their Possible Solutions v0.9
 
An Explanation of the Unicode, the Text Encoding Standard, Its Usages and Imp...
An Explanation of the Unicode, the Text Encoding Standard, Its Usages and Imp...An Explanation of the Unicode, the Text Encoding Standard, Its Usages and Imp...
An Explanation of the Unicode, the Text Encoding Standard, Its Usages and Imp...
 
An Explanation of the Halting Problem and Its Consequences
An Explanation of the Halting Problem and Its ConsequencesAn Explanation of the Halting Problem and Its Consequences
An Explanation of the Halting Problem and Its Consequences
 
Are CPUs VMs Like Any Others? v1.0
Are CPUs VMs Like Any Others? v1.0Are CPUs VMs Like Any Others? v1.0
Are CPUs VMs Like Any Others? v1.0
 
Informaticien(ne)s célèbres (v1.0.2, 19/02/20)
Informaticien(ne)s célèbres (v1.0.2, 19/02/20)Informaticien(ne)s célèbres (v1.0.2, 19/02/20)
Informaticien(ne)s célèbres (v1.0.2, 19/02/20)
 
Well-known Computer Scientists v1.0.2
Well-known Computer Scientists v1.0.2Well-known Computer Scientists v1.0.2
Well-known Computer Scientists v1.0.2
 
On Java Generics, History, Use, Caveats v1.1
On Java Generics, History, Use, Caveats v1.1On Java Generics, History, Use, Caveats v1.1
On Java Generics, History, Use, Caveats v1.1
 
On Reflection in OO Programming Languages v1.6
On Reflection in OO Programming Languages v1.6On Reflection in OO Programming Languages v1.6
On Reflection in OO Programming Languages v1.6
 
ICSOC'21
ICSOC'21ICSOC'21
ICSOC'21
 
Vissoft21.ppt
Vissoft21.pptVissoft21.ppt
Vissoft21.ppt
 
Service computation20.ppt
Service computation20.pptService computation20.ppt
Service computation20.ppt
 
Serp4 iot20.ppt
Serp4 iot20.pptSerp4 iot20.ppt
Serp4 iot20.ppt
 
Msr20.ppt
Msr20.pptMsr20.ppt
Msr20.ppt
 
Iwesep19.ppt
Iwesep19.pptIwesep19.ppt
Iwesep19.ppt
 
Icsoc20.ppt
Icsoc20.pptIcsoc20.ppt
Icsoc20.ppt
 

Recently uploaded

INTRODUCTION TO AI CLASSICAL THEORY TARGETED EXAMPLES
INTRODUCTION TO AI CLASSICAL THEORY TARGETED EXAMPLESINTRODUCTION TO AI CLASSICAL THEORY TARGETED EXAMPLES
INTRODUCTION TO AI CLASSICAL THEORY TARGETED EXAMPLES
anfaltahir1010
 
Enums On Steroids - let's look at sealed classes !
Enums On Steroids - let's look at sealed classes !Enums On Steroids - let's look at sealed classes !
Enums On Steroids - let's look at sealed classes !
Marcin Chrost
 
All you need to know about Spring Boot and GraalVM
All you need to know about Spring Boot and GraalVMAll you need to know about Spring Boot and GraalVM
All you need to know about Spring Boot and GraalVM
Alina Yurenko
 
Webinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for EmbeddedWebinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for Embedded
ICS
 
The Key to Digital Success_ A Comprehensive Guide to Continuous Testing Integ...
The Key to Digital Success_ A Comprehensive Guide to Continuous Testing Integ...The Key to Digital Success_ A Comprehensive Guide to Continuous Testing Integ...
The Key to Digital Success_ A Comprehensive Guide to Continuous Testing Integ...
kalichargn70th171
 
ACE - Team 24 Wrapup event at ahmedabad.
ACE - Team 24 Wrapup event at ahmedabad.ACE - Team 24 Wrapup event at ahmedabad.
ACE - Team 24 Wrapup event at ahmedabad.
Maitrey Patel
 
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
Bert Jan Schrijver
 
Using Query Store in Azure PostgreSQL to Understand Query Performance
Using Query Store in Azure PostgreSQL to Understand Query PerformanceUsing Query Store in Azure PostgreSQL to Understand Query Performance
Using Query Store in Azure PostgreSQL to Understand Query Performance
Grant Fritchey
 
Kubernetes at Scale: Going Multi-Cluster with Istio
Kubernetes at Scale:  Going Multi-Cluster  with IstioKubernetes at Scale:  Going Multi-Cluster  with Istio
Kubernetes at Scale: Going Multi-Cluster with Istio
Severalnines
 
Unveiling the Advantages of Agile Software Development.pdf
Unveiling the Advantages of Agile Software Development.pdfUnveiling the Advantages of Agile Software Development.pdf
Unveiling the Advantages of Agile Software Development.pdf
brainerhub1
 
一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
dakas1
 
DECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSIS
DECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSISDECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSIS
DECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSIS
Tier1 app
 
Mobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona InfotechMobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona Infotech
Drona Infotech
 
Malibou Pitch Deck For Its €3M Seed Round
Malibou Pitch Deck For Its €3M Seed RoundMalibou Pitch Deck For Its €3M Seed Round
Malibou Pitch Deck For Its €3M Seed Round
sjcobrien
 
14 th Edition of International conference on computer vision
14 th Edition of International conference on computer vision14 th Edition of International conference on computer vision
14 th Edition of International conference on computer vision
ShulagnaSarkar2
 
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
gapen1
 
Enhanced Screen Flows UI/UX using SLDS with Tom Kitt
Enhanced Screen Flows UI/UX using SLDS with Tom KittEnhanced Screen Flows UI/UX using SLDS with Tom Kitt
Enhanced Screen Flows UI/UX using SLDS with Tom Kitt
Peter Caitens
 
Microservice Teams - How the cloud changes the way we work
Microservice Teams - How the cloud changes the way we workMicroservice Teams - How the cloud changes the way we work
Microservice Teams - How the cloud changes the way we work
Sven Peters
 
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdfBaha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid
 
Measures in SQL (SIGMOD 2024, Santiago, Chile)
Measures in SQL (SIGMOD 2024, Santiago, Chile)Measures in SQL (SIGMOD 2024, Santiago, Chile)
Measures in SQL (SIGMOD 2024, Santiago, Chile)
Julian Hyde
 

Recently uploaded (20)

INTRODUCTION TO AI CLASSICAL THEORY TARGETED EXAMPLES
INTRODUCTION TO AI CLASSICAL THEORY TARGETED EXAMPLESINTRODUCTION TO AI CLASSICAL THEORY TARGETED EXAMPLES
INTRODUCTION TO AI CLASSICAL THEORY TARGETED EXAMPLES
 
Enums On Steroids - let's look at sealed classes !
Enums On Steroids - let's look at sealed classes !Enums On Steroids - let's look at sealed classes !
Enums On Steroids - let's look at sealed classes !
 
All you need to know about Spring Boot and GraalVM
All you need to know about Spring Boot and GraalVMAll you need to know about Spring Boot and GraalVM
All you need to know about Spring Boot and GraalVM
 
Webinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for EmbeddedWebinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for Embedded
 
The Key to Digital Success_ A Comprehensive Guide to Continuous Testing Integ...
The Key to Digital Success_ A Comprehensive Guide to Continuous Testing Integ...The Key to Digital Success_ A Comprehensive Guide to Continuous Testing Integ...
The Key to Digital Success_ A Comprehensive Guide to Continuous Testing Integ...
 
ACE - Team 24 Wrapup event at ahmedabad.
ACE - Team 24 Wrapup event at ahmedabad.ACE - Team 24 Wrapup event at ahmedabad.
ACE - Team 24 Wrapup event at ahmedabad.
 
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
 
Using Query Store in Azure PostgreSQL to Understand Query Performance
Using Query Store in Azure PostgreSQL to Understand Query PerformanceUsing Query Store in Azure PostgreSQL to Understand Query Performance
Using Query Store in Azure PostgreSQL to Understand Query Performance
 
Kubernetes at Scale: Going Multi-Cluster with Istio
Kubernetes at Scale:  Going Multi-Cluster  with IstioKubernetes at Scale:  Going Multi-Cluster  with Istio
Kubernetes at Scale: Going Multi-Cluster with Istio
 
Unveiling the Advantages of Agile Software Development.pdf
Unveiling the Advantages of Agile Software Development.pdfUnveiling the Advantages of Agile Software Development.pdf
Unveiling the Advantages of Agile Software Development.pdf
 
一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
一比一原版(UMN毕业证)明尼苏达大学毕业证如何办理
 
DECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSIS
DECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSISDECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSIS
DECODING JAVA THREAD DUMPS: MASTER THE ART OF ANALYSIS
 
Mobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona InfotechMobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona Infotech
 
Malibou Pitch Deck For Its €3M Seed Round
Malibou Pitch Deck For Its €3M Seed RoundMalibou Pitch Deck For Its €3M Seed Round
Malibou Pitch Deck For Its €3M Seed Round
 
14 th Edition of International conference on computer vision
14 th Edition of International conference on computer vision14 th Edition of International conference on computer vision
14 th Edition of International conference on computer vision
 
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
如何办理(hull学位证书)英国赫尔大学毕业证硕士文凭原版一模一样
 
Enhanced Screen Flows UI/UX using SLDS with Tom Kitt
Enhanced Screen Flows UI/UX using SLDS with Tom KittEnhanced Screen Flows UI/UX using SLDS with Tom Kitt
Enhanced Screen Flows UI/UX using SLDS with Tom Kitt
 
Microservice Teams - How the cloud changes the way we work
Microservice Teams - How the cloud changes the way we workMicroservice Teams - How the cloud changes the way we work
Microservice Teams - How the cloud changes the way we work
 
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdfBaha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
 
Measures in SQL (SIGMOD 2024, Santiago, Chile)
Measures in SQL (SIGMOD 2024, Santiago, Chile)Measures in SQL (SIGMOD 2024, Santiago, Chile)
Measures in SQL (SIGMOD 2024, Santiago, Chile)
 

Ssbse11b.ppt

  • 1. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol SOCCER Lab. & Ptidej Team, ’Ecole Polytechnique de Montr´eal, Qu´ebec, Canada University of Sannio, Italy September 1, 2011
  • 2. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Outline Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References 2 / 19
  • 3. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Introduction Concept location is an important task during program comprehension. A typical scenario in which concept location takes part a failure has been observed in a software system under certain execution conditions such conditions are hard to reproduce the execution trace was saved during such failure. Result: An execution trace containing a failure is saved but we can not re-execute the same scenario. 3 / 19
  • 4. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Motivation and Problem Statement Concept location process is recently defined as trace segmentation problem. The textual content of methods in trace is used to split it into segments that participate in the implementation of concepts. Few works attempt to automatically identify concepts in a single execution trace. 4 / 19
  • 5. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Motivation and Problem Statement Asadi et al. [Asadi et al., 2010] presented the contribution to identify concepts in execution trace by finding cohesive and decoupled fragments of the trace. Limitations Not scalable (thousands of methods). Based on methaheuristic search: each run may produce a different concept assignment. Does not assign a label to the identified segments. Purpose: To propose a novel approach to overcome these limitations. 5 / 19
  • 6. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Background Steps Step 1 : System instrumentation and trace collection. Step 2 : Pruning and compressing traces. Step 3 : Textual analysis of Method source code. Step 4 : Search-based concept location. 6 / 19
  • 7. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Background Step 1 : System instrumentation and trace collection System instrumented using MODEC : tool to extract and model sequence diagrams. Trace Collection : Label manually a parts of the code during execution of the instrumented system. Trace Trace represented as an ordered list of invoked methods with labels. 7 / 19
  • 8. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Background Step 2 : Pruning and compressing traces Pruning: Remove too frequent methods having invocation greater than Q3 + 2 x IQR, with Q3 is 75%. Compression: Remove repetitions of method invocations using Run Length Encoding (RLE) algorithm. 8 / 19
  • 9. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Background Step 3 : Textual analysis of Method source code Textual analysis of source code: Extract terms from source code : identifiers and comments. Split terms by Camel-Case. Remove programming language keywords and english stop words. Perform stemming. Index using the tf-idf indexing mechanisms. Apply Latent Semantic Indexing (LSI) to reduce the term-document matrix. 9 / 19
  • 10. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Background Step 4 : Search-based concept location Approach built upon dynamic programming algorithm to: Compute the exact split of the execution trace into segments. Improve scalability. 10 / 19
  • 11. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Our approach Dynamic programming : based on overlapping sub-problems by dividing a problem into sub-problems recursively solved. The solution of the problem is obtained by combining the solutions of the sub-problems. We compute the quality of the segmentation of a trace split into K segments using the fitness function. COHl = end(l)−1 i=begin(l) end(l) j=i+1 σ(mi ,mj ) (end(l)−begin(l)+1)·(end(l)−begin(l))/2 (1) COUl = end(l) i=begin(l) l j=1,j<begin(l) or j>end(l)σ(mi ,mj ) (N−(end(l)−begin(l)+1))·(end(l)−begin(l)+1) (2) fitness = 1 K · K i=1 COHi COUi + 1 (3) 11 / 19
  • 12. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Our approach Figure: Trace splitting Possibilies new segment is added or the method is attached to the last solution segment. Coh(i)= coh(i-1)+ delta Cou(i)= cou(i-1)- delta delta = sim(m1,m3)+sim(m1,m4) 12 / 19
  • 13. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Empirical study RQ1 : How do the performances of the GA and DP approaches compare in terms of fitness values, convergence times, and numbers of segments? RQ2 : How do the GA and DP approaches perform in terms of overlaps be- tween the automatic segmentation and the manually-built oracle, i.e., recall? RQ3 : How do the precision values of the GA and DP approaches compare when splitting execution traces? 13 / 19
  • 14. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Empirical study Table: Data of the empirical study. Systems Scenarios OriginalSize CleanedSizes CompressedSizes ArgoUML v0.18.1 (1)Start, Create note, Stop 34,746 821 588 (2) Start, Create class, Create note, Stop 64,947 1,066 764 JHotDraw v1.5 (1) Start, Draw rectangle, Stop 6,668 447 240 (2) Start, Add text, Draw rectangle, Stop 13,841 753 361 (3) Start, Draw rectangle, Cut rectangle, Stop 11,215 1,206 414 (4) Start, Spawn window, Draw circle, Stop 16,366 670 433 14 / 19
  • 15. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Empirical study RQ1 Results Compare the GA approach proposed by Asadi et al. [Asadi et al., 2010] with our approach. Table: Number of segments, values of fitness function/segmentation score, and times required by the GA and DP approaches. System Scenario # of Segments Fitness Time (s) GA DP GA DP GA DP ArgoUML (1) 24 13 0.54 0.58 7,080 2.13 (2) 73 19 0.52 0.60 10,800 4.33 JHotDraw (1) 17 21 0.39 0.67 2,040 0.13 (2) 21 21 0.38 0.69 1,260 0.64 (3) 56 20 0.46 0.72 1,200 0.86 (4) 63 26 0.34 0.69 240 1.00 The DP approach performs significantly better than the GA approach. 15 / 19
  • 16. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Empirical study RQ1 and RQ2 Results Table: Jaccard overlaps and precision values between segments identified by the GA and DP approaches. System Scenario Feature Jaccard Precision GA DP GA DP ArgoUML (1) Create Note 0.33 0.87 1.00 0.99 (2) Create Class 0.26 0.53 1.00 1.00 (2) Create Note 0.34 0.56 1.00 1.00 JHotDraw (1) Draw Rectangle 0.90 0.75 0.90 1.00 (2) Add Text 0.31 0.33 0.36 0.39 (2) Draw Rectangle 0.62 0.52 0.62 1.00 (3) Draw Rectangle 0.74 0.24 0.79 0.24 (3) Cut Rectangle 0.22 0.31 1.00 1.00 (4) Draw Circle 0.82 0.82 0.82 1.00 (4) Spawn window 0.42 0.44 1.00 1.00 16 / 19
  • 17. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Conclusion We reformulate the trace segmentation problem as a dynamic programming (DP) problem we showed that we can benefit from the overlapping sub-problems we obtained a dramatic gain in performance reusing computed scores of intervals and segmentation scores The DP approach reuses pre-computed cohesion and coupling values among subsequent segments of an execution trace, which is not possible using GA. Results showed that the DP approach significantly out-performed the GA approach in terms of: the times required to produce the segmentations. the scalability. 17 / 19
  • 18. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Future works Validate the scalability of the DP trace segmentation approach using more larger traces. Use more traces obtained from different systems, to verify the generality of our findings. Complement the approach with segment labeling to make the produced segments more suitable for program-comprehension activities. 18 / 19
  • 19. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References THANKS ! 19 / 19
  • 20. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Asadi, F., Penta, M. D., Antoniol, G., and Gu´eh´eneuc, Y.-G. (2010). A heuristic-based approach to identify concepts in execution traces. In Proceedings of the European Conference on Software Maintenance and Reengineering, pages 31–40. IEEE Computer Society Press. Gethers, M. and Poshyvanyk, D. (2010). Using relational topic models to capture coupling among classes in object-oriented software systems. In Proceedings of the 2010 IEEE International Conference on Software Maintenance, pages 1–10. IEEE Computer Society. Liu, Y., Poshyvanyk, D., Ferenc, R., Gyim˜A¸sthy, T., and Chrisochoides, N. (2009). Modeling class cohesion as mixtures of latent topics. In International Conference on Software Maintenance, pages 233–242. 19 / 19
  • 21. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Lukins, S. K., Kraft, N. A., and Etzkorn, L. H. (2010). Bug localization using latent dirichlet allocation. Information and Software Technology, 52(9):972 – 990. 19 / 19