SlideShare a Scribd company logo
1 of 21
Download to read offline
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
A Fast Algorithm to Locate
Concepts in Execution Traces
Soumaya Medini, Philippe Galinier, Massimiliano Di
Penta,
Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol
SOCCER Lab. & Ptidej Team, ’Ecole Polytechnique de Montr´eal,
Qu´ebec, Canada
University of Sannio, Italy
September 1, 2011
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Outline
Introduction
Motivation and Problem Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
2 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Introduction
Concept location is an important task during program
comprehension.
A typical scenario in which concept location takes part
a failure has been observed in a software system under
certain execution conditions
such conditions are hard to reproduce
the execution trace was saved during such failure.
Result: An execution trace containing a failure is saved but
we can not re-execute the same scenario.
3 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Motivation and Problem Statement
Concept location process is recently defined as trace
segmentation problem.
The textual content of methods in trace is used to split
it into segments that participate in the implementation
of concepts.
Few works attempt to automatically identify concepts in a
single execution trace.
4 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Motivation and Problem Statement
Asadi et al. [Asadi et al., 2010] presented the contribution
to identify concepts in execution trace by finding cohesive
and decoupled fragments of the trace.
Limitations
Not scalable (thousands of methods).
Based on methaheuristic search: each run may produce
a different concept assignment.
Does not assign a label to the identified segments.
Purpose: To propose a novel approach to overcome these
limitations.
5 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Background
Steps
Step 1 : System instrumentation and trace collection.
Step 2 : Pruning and compressing traces.
Step 3 : Textual analysis of Method source code.
Step 4 : Search-based concept location.
6 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Background
Step 1 : System instrumentation and trace collection
System instrumented using MODEC : tool to extract
and model sequence diagrams.
Trace Collection : Label manually a parts of the code
during execution of the instrumented system.
Trace
Trace represented as an ordered list of invoked methods with
labels.
7 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Background
Step 2 : Pruning and compressing traces
Pruning: Remove too frequent methods having
invocation greater than Q3 + 2 x IQR, with Q3 is 75%.
Compression: Remove repetitions of method invocations
using Run Length Encoding (RLE) algorithm.
8 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Background
Step 3 : Textual analysis of Method source code
Textual analysis of source code:
Extract terms from source code : identifiers and
comments.
Split terms by Camel-Case.
Remove programming language keywords and english
stop words.
Perform stemming.
Index using the tf-idf indexing mechanisms.
Apply Latent Semantic Indexing (LSI) to reduce the
term-document matrix.
9 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Background
Step 4 : Search-based concept location
Approach built upon dynamic programming algorithm to:
Compute the exact split of the execution trace into
segments.
Improve scalability.
10 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Our approach
Dynamic programming : based on overlapping sub-problems
by dividing a problem into sub-problems recursively solved.
The solution of the problem is obtained by combining the
solutions of the sub-problems.
We compute the quality of the segmentation of a trace split
into K segments using the fitness function.
COHl =
end(l)−1
i=begin(l)
end(l)
j=i+1 σ(mi ,mj )
(end(l)−begin(l)+1)·(end(l)−begin(l))/2
(1)
COUl =
end(l)
i=begin(l)
l
j=1,j<begin(l) or j>end(l)σ(mi ,mj )
(N−(end(l)−begin(l)+1))·(end(l)−begin(l)+1)
(2)
fitness =
1
K
·
K
i=1
COHi
COUi + 1
(3)
11 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Our approach
Figure: Trace splitting
Possibilies
new segment is added or
the method is attached to the last solution segment.
Coh(i)= coh(i-1)+ delta
Cou(i)= cou(i-1)- delta
delta = sim(m1,m3)+sim(m1,m4)
12 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Empirical study
RQ1 : How do the performances of the GA and DP
approaches compare in terms of fitness values,
convergence times, and numbers of segments?
RQ2 : How do the GA and DP approaches perform in
terms of overlaps be- tween the automatic segmentation
and the manually-built oracle, i.e., recall?
RQ3 : How do the precision values of the GA and DP
approaches compare when splitting execution traces?
13 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Empirical study
Table: Data of the empirical study.
Systems Scenarios
OriginalSize
CleanedSizes
CompressedSizes
ArgoUML v0.18.1
(1)Start, Create note, Stop 34,746 821 588
(2) Start, Create class,
Create note, Stop
64,947 1,066 764
JHotDraw v1.5
(1) Start, Draw rectangle,
Stop
6,668 447 240
(2) Start, Add text, Draw
rectangle, Stop
13,841 753 361
(3) Start, Draw rectangle,
Cut rectangle, Stop
11,215 1,206 414
(4) Start, Spawn window,
Draw circle, Stop
16,366 670 433
14 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Empirical study
RQ1 Results
Compare the GA approach proposed by Asadi et al.
[Asadi et al., 2010] with our approach.
Table: Number of segments, values of fitness
function/segmentation score, and times required by the GA and
DP approaches.
System Scenario
# of Segments Fitness Time (s)
GA DP GA DP GA DP
ArgoUML
(1) 24 13 0.54 0.58 7,080 2.13
(2) 73 19 0.52 0.60 10,800 4.33
JHotDraw
(1) 17 21 0.39 0.67 2,040 0.13
(2) 21 21 0.38 0.69 1,260 0.64
(3) 56 20 0.46 0.72 1,200 0.86
(4) 63 26 0.34 0.69 240 1.00
The DP approach performs significantly better than the GA
approach.
15 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Empirical study
RQ1 and RQ2 Results
Table: Jaccard overlaps and precision values between segments
identified by the GA and DP approaches.
System Scenario Feature
Jaccard Precision
GA DP GA DP
ArgoUML
(1) Create Note 0.33 0.87 1.00 0.99
(2) Create Class 0.26 0.53 1.00 1.00
(2) Create Note 0.34 0.56 1.00 1.00
JHotDraw
(1) Draw Rectangle 0.90 0.75 0.90 1.00
(2) Add Text 0.31 0.33 0.36 0.39
(2) Draw Rectangle 0.62 0.52 0.62 1.00
(3) Draw Rectangle 0.74 0.24 0.79 0.24
(3) Cut Rectangle 0.22 0.31 1.00 1.00
(4) Draw Circle 0.82 0.82 0.82 1.00
(4) Spawn window 0.42 0.44 1.00 1.00
16 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Conclusion
We reformulate the trace segmentation problem as a
dynamic programming (DP) problem
we showed that we can benefit from the overlapping
sub-problems
we obtained a dramatic gain in performance reusing
computed scores of intervals and segmentation scores
The DP approach reuses pre-computed cohesion and
coupling values among subsequent segments of an
execution trace, which is not possible using GA.
Results showed that the DP approach significantly
out-performed the GA approach in terms of:
the times required to produce the segmentations.
the scalability.
17 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Future works
Validate the scalability of the DP trace segmentation
approach using more larger traces.
Use more traces obtained from different systems, to
verify the generality of our findings.
Complement the approach with segment labeling to
make the produced segments more suitable for
program-comprehension activities.
18 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
THANKS !
19 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Asadi, F., Penta, M. D., Antoniol, G., and Gu´eh´eneuc,
Y.-G. (2010).
A heuristic-based approach to identify concepts in
execution traces.
In Proceedings of the European Conference on Software
Maintenance and Reengineering, pages 31–40. IEEE
Computer Society Press.
Gethers, M. and Poshyvanyk, D. (2010).
Using relational topic models to capture coupling among
classes in object-oriented software systems.
In Proceedings of the 2010 IEEE International
Conference on Software Maintenance, pages 1–10. IEEE
Computer Society.
Liu, Y., Poshyvanyk, D., Ferenc, R., Gyim˜A¸sthy, T., and
Chrisochoides, N. (2009).
Modeling class cohesion as mixtures of latent topics.
In International Conference on Software Maintenance,
pages 233–242.
19 / 19
A Fast Algorithm
to Locate
Concepts in
Execution Traces
Soumaya Medini,
Philippe Galinier,
Massimiliano Di
Penta,
Yann-Ga¨el
Gu´eh´eneuc,
Giuliano Antoniol
Introduction
Motivation and
Problem
Statement
Background
Our approach
Empirical study
Conclusion
Future works
References
Lukins, S. K., Kraft, N. A., and Etzkorn, L. H. (2010).
Bug localization using latent dirichlet allocation.
Information and Software Technology, 52(9):972 – 990.
19 / 19

More Related Content

Similar to Ssbse11b.ppt

Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
butest
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
butest
 
jpl_final_presentation_miguelsimaov2
jpl_final_presentation_miguelsimaov2jpl_final_presentation_miguelsimaov2
jpl_final_presentation_miguelsimaov2
Miguel Simão
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
butest
 
Dynamic Fractional Resource Scheduling Practical Issues and Future Directions...
Dynamic Fractional Resource Scheduling Practical Issues and Future Directions...Dynamic Fractional Resource Scheduling Practical Issues and Future Directions...
Dynamic Fractional Resource Scheduling Practical Issues and Future Directions...
Mark Stillwell
 
11.0003www.iiste.org call for paper.survey on wireless intelligent video surv...
11.0003www.iiste.org call for paper.survey on wireless intelligent video surv...11.0003www.iiste.org call for paper.survey on wireless intelligent video surv...
11.0003www.iiste.org call for paper.survey on wireless intelligent video surv...
Alexander Decker
 
Don't Treat the Symptom, Find the Cause!.pptx
Don't Treat the Symptom, Find the Cause!.pptxDon't Treat the Symptom, Find the Cause!.pptx
Don't Treat the Symptom, Find the Cause!.pptx
Förderverein Technische Fakultät
 
Interactive Evolutionary Computation
Interactive Evolutionary ComputationInteractive Evolutionary Computation
Interactive Evolutionary Computation
ysemet
 

Similar to Ssbse11b.ppt (20)

Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
 
Intelligent Video Surveillance System using Deep Learning
Intelligent Video Surveillance System using Deep LearningIntelligent Video Surveillance System using Deep Learning
Intelligent Video Surveillance System using Deep Learning
 
Better neuroimaging data processing: driven by evidence, open communities, an...
Better neuroimaging data processing: driven by evidence, open communities, an...Better neuroimaging data processing: driven by evidence, open communities, an...
Better neuroimaging data processing: driven by evidence, open communities, an...
 
Event-Handling Based Smart Video Surveillance System
Event-Handling Based Smart Video Surveillance SystemEvent-Handling Based Smart Video Surveillance System
Event-Handling Based Smart Video Surveillance System
 
jpl_final_presentation_miguelsimaov2
jpl_final_presentation_miguelsimaov2jpl_final_presentation_miguelsimaov2
jpl_final_presentation_miguelsimaov2
 
Bibliometric Analysis on Computer Vision based Anomaly Detection using Deep L...
Bibliometric Analysis on Computer Vision based Anomaly Detection using Deep L...Bibliometric Analysis on Computer Vision based Anomaly Detection using Deep L...
Bibliometric Analysis on Computer Vision based Anomaly Detection using Deep L...
 
Csmr12.ppt
Csmr12.pptCsmr12.ppt
Csmr12.ppt
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
 
From Engineer to Alchemist, There and Back Again: An Alchemist Tale
From Engineer to Alchemist, There and Back Again: An Alchemist TaleFrom Engineer to Alchemist, There and Back Again: An Alchemist Tale
From Engineer to Alchemist, There and Back Again: An Alchemist Tale
 
5.3.2nd WS. News from the COGAIN Association & DTU G.Interaction
5.3.2nd WS. News from the COGAIN Association & DTU G.Interaction5.3.2nd WS. News from the COGAIN Association & DTU G.Interaction
5.3.2nd WS. News from the COGAIN Association & DTU G.Interaction
 
Dynamic Fractional Resource Scheduling Practical Issues and Future Directions...
Dynamic Fractional Resource Scheduling Practical Issues and Future Directions...Dynamic Fractional Resource Scheduling Practical Issues and Future Directions...
Dynamic Fractional Resource Scheduling Practical Issues and Future Directions...
 
Big data expo - machine learning in the elastic stack
Big data expo - machine learning in the elastic stack Big data expo - machine learning in the elastic stack
Big data expo - machine learning in the elastic stack
 
Symposium 2019 : Gestion de projet en Intelligence Artificielle
Symposium 2019 : Gestion de projet en Intelligence ArtificielleSymposium 2019 : Gestion de projet en Intelligence Artificielle
Symposium 2019 : Gestion de projet en Intelligence Artificielle
 
11.0003www.iiste.org call for paper.survey on wireless intelligent video surv...
11.0003www.iiste.org call for paper.survey on wireless intelligent video surv...11.0003www.iiste.org call for paper.survey on wireless intelligent video surv...
11.0003www.iiste.org call for paper.survey on wireless intelligent video surv...
 
3.survey on wireless intelligent video surveillance system using moving objec...
3.survey on wireless intelligent video surveillance system using moving objec...3.survey on wireless intelligent video surveillance system using moving objec...
3.survey on wireless intelligent video surveillance system using moving objec...
 
Don't Treat the Symptom, Find the Cause!.pptx
Don't Treat the Symptom, Find the Cause!.pptxDon't Treat the Symptom, Find the Cause!.pptx
Don't Treat the Symptom, Find the Cause!.pptx
 
Adaptive Machine Learning for Credit Card Fraud Detection
Adaptive Machine Learning for Credit Card Fraud DetectionAdaptive Machine Learning for Credit Card Fraud Detection
Adaptive Machine Learning for Credit Card Fraud Detection
 
Interactive Evolutionary Computation
Interactive Evolutionary ComputationInteractive Evolutionary Computation
Interactive Evolutionary Computation
 
IRJET- Anomaly Detection System in CCTV Derived Videos
IRJET- Anomaly Detection System in CCTV Derived VideosIRJET- Anomaly Detection System in CCTV Derived Videos
IRJET- Anomaly Detection System in CCTV Derived Videos
 

More from Yann-Gaël Guéhéneuc

Evolution and Examples of Java Features, from Java 1.7 to Java 22
Evolution and Examples of Java Features, from Java 1.7 to Java 22Evolution and Examples of Java Features, from Java 1.7 to Java 22
Evolution and Examples of Java Features, from Java 1.7 to Java 22
Yann-Gaël Guéhéneuc
 
Consequences and Principles of Software Quality v0.3
Consequences and Principles of Software Quality v0.3Consequences and Principles of Software Quality v0.3
Consequences and Principles of Software Quality v0.3
Yann-Gaël Guéhéneuc
 
On Reflection in OO Programming Languages v1.6
On Reflection in OO Programming Languages v1.6On Reflection in OO Programming Languages v1.6
On Reflection in OO Programming Languages v1.6
Yann-Gaël Guéhéneuc
 

More from Yann-Gaël Guéhéneuc (20)

Advice for writing a NSERC Discovery grant application v0.5
Advice for writing a NSERC Discovery grant application v0.5Advice for writing a NSERC Discovery grant application v0.5
Advice for writing a NSERC Discovery grant application v0.5
 
Ptidej Architecture, Design, and Implementation in Action v2.1
Ptidej Architecture, Design, and Implementation in Action v2.1Ptidej Architecture, Design, and Implementation in Action v2.1
Ptidej Architecture, Design, and Implementation in Action v2.1
 
Evolution and Examples of Java Features, from Java 1.7 to Java 22
Evolution and Examples of Java Features, from Java 1.7 to Java 22Evolution and Examples of Java Features, from Java 1.7 to Java 22
Evolution and Examples of Java Features, from Java 1.7 to Java 22
 
Consequences and Principles of Software Quality v0.3
Consequences and Principles of Software Quality v0.3Consequences and Principles of Software Quality v0.3
Consequences and Principles of Software Quality v0.3
 
Some Pitfalls with Python and Their Possible Solutions v0.9
Some Pitfalls with Python and Their Possible Solutions v0.9Some Pitfalls with Python and Their Possible Solutions v0.9
Some Pitfalls with Python and Their Possible Solutions v0.9
 
An Explanation of the Unicode, the Text Encoding Standard, Its Usages and Imp...
An Explanation of the Unicode, the Text Encoding Standard, Its Usages and Imp...An Explanation of the Unicode, the Text Encoding Standard, Its Usages and Imp...
An Explanation of the Unicode, the Text Encoding Standard, Its Usages and Imp...
 
An Explanation of the Halting Problem and Its Consequences
An Explanation of the Halting Problem and Its ConsequencesAn Explanation of the Halting Problem and Its Consequences
An Explanation of the Halting Problem and Its Consequences
 
Are CPUs VMs Like Any Others? v1.0
Are CPUs VMs Like Any Others? v1.0Are CPUs VMs Like Any Others? v1.0
Are CPUs VMs Like Any Others? v1.0
 
Informaticien(ne)s célèbres (v1.0.2, 19/02/20)
Informaticien(ne)s célèbres (v1.0.2, 19/02/20)Informaticien(ne)s célèbres (v1.0.2, 19/02/20)
Informaticien(ne)s célèbres (v1.0.2, 19/02/20)
 
Well-known Computer Scientists v1.0.2
Well-known Computer Scientists v1.0.2Well-known Computer Scientists v1.0.2
Well-known Computer Scientists v1.0.2
 
On Java Generics, History, Use, Caveats v1.1
On Java Generics, History, Use, Caveats v1.1On Java Generics, History, Use, Caveats v1.1
On Java Generics, History, Use, Caveats v1.1
 
On Reflection in OO Programming Languages v1.6
On Reflection in OO Programming Languages v1.6On Reflection in OO Programming Languages v1.6
On Reflection in OO Programming Languages v1.6
 
ICSOC'21
ICSOC'21ICSOC'21
ICSOC'21
 
Vissoft21.ppt
Vissoft21.pptVissoft21.ppt
Vissoft21.ppt
 
Service computation20.ppt
Service computation20.pptService computation20.ppt
Service computation20.ppt
 
Serp4 iot20.ppt
Serp4 iot20.pptSerp4 iot20.ppt
Serp4 iot20.ppt
 
Msr20.ppt
Msr20.pptMsr20.ppt
Msr20.ppt
 
Iwesep19.ppt
Iwesep19.pptIwesep19.ppt
Iwesep19.ppt
 
Icsoc20.ppt
Icsoc20.pptIcsoc20.ppt
Icsoc20.ppt
 
Icsoc18.ppt
Icsoc18.pptIcsoc18.ppt
Icsoc18.ppt
 

Recently uploaded

introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
VishalKumarJha10
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 

Recently uploaded (20)

Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfAzure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 

Ssbse11b.ppt

  • 1. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol SOCCER Lab. & Ptidej Team, ’Ecole Polytechnique de Montr´eal, Qu´ebec, Canada University of Sannio, Italy September 1, 2011
  • 2. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Outline Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References 2 / 19
  • 3. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Introduction Concept location is an important task during program comprehension. A typical scenario in which concept location takes part a failure has been observed in a software system under certain execution conditions such conditions are hard to reproduce the execution trace was saved during such failure. Result: An execution trace containing a failure is saved but we can not re-execute the same scenario. 3 / 19
  • 4. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Motivation and Problem Statement Concept location process is recently defined as trace segmentation problem. The textual content of methods in trace is used to split it into segments that participate in the implementation of concepts. Few works attempt to automatically identify concepts in a single execution trace. 4 / 19
  • 5. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Motivation and Problem Statement Asadi et al. [Asadi et al., 2010] presented the contribution to identify concepts in execution trace by finding cohesive and decoupled fragments of the trace. Limitations Not scalable (thousands of methods). Based on methaheuristic search: each run may produce a different concept assignment. Does not assign a label to the identified segments. Purpose: To propose a novel approach to overcome these limitations. 5 / 19
  • 6. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Background Steps Step 1 : System instrumentation and trace collection. Step 2 : Pruning and compressing traces. Step 3 : Textual analysis of Method source code. Step 4 : Search-based concept location. 6 / 19
  • 7. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Background Step 1 : System instrumentation and trace collection System instrumented using MODEC : tool to extract and model sequence diagrams. Trace Collection : Label manually a parts of the code during execution of the instrumented system. Trace Trace represented as an ordered list of invoked methods with labels. 7 / 19
  • 8. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Background Step 2 : Pruning and compressing traces Pruning: Remove too frequent methods having invocation greater than Q3 + 2 x IQR, with Q3 is 75%. Compression: Remove repetitions of method invocations using Run Length Encoding (RLE) algorithm. 8 / 19
  • 9. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Background Step 3 : Textual analysis of Method source code Textual analysis of source code: Extract terms from source code : identifiers and comments. Split terms by Camel-Case. Remove programming language keywords and english stop words. Perform stemming. Index using the tf-idf indexing mechanisms. Apply Latent Semantic Indexing (LSI) to reduce the term-document matrix. 9 / 19
  • 10. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Background Step 4 : Search-based concept location Approach built upon dynamic programming algorithm to: Compute the exact split of the execution trace into segments. Improve scalability. 10 / 19
  • 11. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Our approach Dynamic programming : based on overlapping sub-problems by dividing a problem into sub-problems recursively solved. The solution of the problem is obtained by combining the solutions of the sub-problems. We compute the quality of the segmentation of a trace split into K segments using the fitness function. COHl = end(l)−1 i=begin(l) end(l) j=i+1 σ(mi ,mj ) (end(l)−begin(l)+1)·(end(l)−begin(l))/2 (1) COUl = end(l) i=begin(l) l j=1,j<begin(l) or j>end(l)σ(mi ,mj ) (N−(end(l)−begin(l)+1))·(end(l)−begin(l)+1) (2) fitness = 1 K · K i=1 COHi COUi + 1 (3) 11 / 19
  • 12. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Our approach Figure: Trace splitting Possibilies new segment is added or the method is attached to the last solution segment. Coh(i)= coh(i-1)+ delta Cou(i)= cou(i-1)- delta delta = sim(m1,m3)+sim(m1,m4) 12 / 19
  • 13. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Empirical study RQ1 : How do the performances of the GA and DP approaches compare in terms of fitness values, convergence times, and numbers of segments? RQ2 : How do the GA and DP approaches perform in terms of overlaps be- tween the automatic segmentation and the manually-built oracle, i.e., recall? RQ3 : How do the precision values of the GA and DP approaches compare when splitting execution traces? 13 / 19
  • 14. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Empirical study Table: Data of the empirical study. Systems Scenarios OriginalSize CleanedSizes CompressedSizes ArgoUML v0.18.1 (1)Start, Create note, Stop 34,746 821 588 (2) Start, Create class, Create note, Stop 64,947 1,066 764 JHotDraw v1.5 (1) Start, Draw rectangle, Stop 6,668 447 240 (2) Start, Add text, Draw rectangle, Stop 13,841 753 361 (3) Start, Draw rectangle, Cut rectangle, Stop 11,215 1,206 414 (4) Start, Spawn window, Draw circle, Stop 16,366 670 433 14 / 19
  • 15. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Empirical study RQ1 Results Compare the GA approach proposed by Asadi et al. [Asadi et al., 2010] with our approach. Table: Number of segments, values of fitness function/segmentation score, and times required by the GA and DP approaches. System Scenario # of Segments Fitness Time (s) GA DP GA DP GA DP ArgoUML (1) 24 13 0.54 0.58 7,080 2.13 (2) 73 19 0.52 0.60 10,800 4.33 JHotDraw (1) 17 21 0.39 0.67 2,040 0.13 (2) 21 21 0.38 0.69 1,260 0.64 (3) 56 20 0.46 0.72 1,200 0.86 (4) 63 26 0.34 0.69 240 1.00 The DP approach performs significantly better than the GA approach. 15 / 19
  • 16. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Empirical study RQ1 and RQ2 Results Table: Jaccard overlaps and precision values between segments identified by the GA and DP approaches. System Scenario Feature Jaccard Precision GA DP GA DP ArgoUML (1) Create Note 0.33 0.87 1.00 0.99 (2) Create Class 0.26 0.53 1.00 1.00 (2) Create Note 0.34 0.56 1.00 1.00 JHotDraw (1) Draw Rectangle 0.90 0.75 0.90 1.00 (2) Add Text 0.31 0.33 0.36 0.39 (2) Draw Rectangle 0.62 0.52 0.62 1.00 (3) Draw Rectangle 0.74 0.24 0.79 0.24 (3) Cut Rectangle 0.22 0.31 1.00 1.00 (4) Draw Circle 0.82 0.82 0.82 1.00 (4) Spawn window 0.42 0.44 1.00 1.00 16 / 19
  • 17. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Conclusion We reformulate the trace segmentation problem as a dynamic programming (DP) problem we showed that we can benefit from the overlapping sub-problems we obtained a dramatic gain in performance reusing computed scores of intervals and segmentation scores The DP approach reuses pre-computed cohesion and coupling values among subsequent segments of an execution trace, which is not possible using GA. Results showed that the DP approach significantly out-performed the GA approach in terms of: the times required to produce the segmentations. the scalability. 17 / 19
  • 18. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Future works Validate the scalability of the DP trace segmentation approach using more larger traces. Use more traces obtained from different systems, to verify the generality of our findings. Complement the approach with segment labeling to make the produced segments more suitable for program-comprehension activities. 18 / 19
  • 19. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References THANKS ! 19 / 19
  • 20. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Asadi, F., Penta, M. D., Antoniol, G., and Gu´eh´eneuc, Y.-G. (2010). A heuristic-based approach to identify concepts in execution traces. In Proceedings of the European Conference on Software Maintenance and Reengineering, pages 31–40. IEEE Computer Society Press. Gethers, M. and Poshyvanyk, D. (2010). Using relational topic models to capture coupling among classes in object-oriented software systems. In Proceedings of the 2010 IEEE International Conference on Software Maintenance, pages 1–10. IEEE Computer Society. Liu, Y., Poshyvanyk, D., Ferenc, R., Gyim˜A¸sthy, T., and Chrisochoides, N. (2009). Modeling class cohesion as mixtures of latent topics. In International Conference on Software Maintenance, pages 233–242. 19 / 19
  • 21. A Fast Algorithm to Locate Concepts in Execution Traces Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Ga¨el Gu´eh´eneuc, Giuliano Antoniol Introduction Motivation and Problem Statement Background Our approach Empirical study Conclusion Future works References Lukins, S. K., Kraft, N. A., and Etzkorn, L. H. (2010). Bug localization using latent dirichlet allocation. Information and Software Technology, 52(9):972 – 990. 19 / 19