SSBSE11b.ppt

Transcript of "SSBSE11b.ppt"

  1. A Fast Algorithm to Locate Concepts in Execution Traces
     Soumaya Medini, Philippe Galinier, Massimiliano Di Penta, Yann-Gaël Guéhéneuc, Giuliano Antoniol
     SOCCER Lab. & Ptidej Team, École Polytechnique de Montréal, Québec, Canada
     University of Sannio, Italy
     September 1, 2011
  2. Outline
     - Introduction
     - Motivation and Problem Statement
     - Background
     - Our approach
     - Empirical study
     - Conclusion
     - Future works
     - References
  3. Introduction
     Concept location is an important task during program comprehension.
     A typical scenario in which concept location takes part:
     - a failure has been observed in a software system under certain execution conditions;
     - such conditions are hard to reproduce;
     - the execution trace was saved during the failure.
     Result: an execution trace containing the failure is saved, but we cannot re-execute the same scenario.
  4. Motivation and Problem Statement
     Concept location has recently been formulated as a trace segmentation problem.
     - The textual content of the methods in a trace is used to split the trace into segments that participate in the implementation of concepts.
     Few works attempt to automatically identify concepts in a single execution trace.
  5. Motivation and Problem Statement
     Asadi et al. [Asadi et al., 2010] proposed identifying concepts in an execution trace by finding cohesive and decoupled fragments of the trace.
     Limitations:
     - Not scalable (traces with thousands of methods).
     - Based on metaheuristic search: each run may produce a different concept assignment.
     - Does not assign a label to the identified segments.
     Purpose: to propose a novel approach to overcome these limitations.
  6. Background
     Steps:
     - Step 1: System instrumentation and trace collection.
     - Step 2: Pruning and compressing traces.
     - Step 3: Textual analysis of method source code.
     - Step 4: Search-based concept location.
  7. Background
     Step 1: System instrumentation and trace collection
     - System instrumented using MODEC, a tool to extract and model sequence diagrams.
     - Trace collection: manually label parts of the code during execution of the instrumented system.
     Trace: represented as an ordered list of invoked methods with labels.
  8. Background
     Step 2: Pruning and compressing traces
     - Pruning: remove overly frequent methods, i.e., methods whose number of invocations is greater than Q3 + 2 × IQR, where Q3 is the third quartile (75th percentile) and IQR is the interquartile range.
     - Compression: remove repetitions of method invocations using the Run Length Encoding (RLE) algorithm.
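The pruning and compression step can be illustrated with a minimal Python sketch. It assumes the trace is a plain list of method names, that quartiles are taken over the per-method invocation counts using Python's default quartile convention, and that compression keeps one invocation per run of consecutive identical invocations. The function names `prune_frequent`, `compress`, and `run_lengths` are hypothetical and not taken from the slides.

```python
from collections import Counter
from itertools import groupby
import statistics


def prune_frequent(trace, k=2.0):
    """Drop every method whose invocation count exceeds Q3 + k * IQR."""
    counts = Counter(trace)
    if len(counts) < 2:  # quartiles need at least two data points
        return list(trace)
    # Assumption: quartiles of the per-method invocation counts.
    q1, _median, q3 = statistics.quantiles(list(counts.values()), n=4)
    threshold = q3 + k * (q3 - q1)
    return [method for method in trace if counts[method] <= threshold]


def compress(trace):
    """Keep one invocation for each run of consecutive identical invocations."""
    return [method for method, _run in groupby(trace)]


def run_lengths(trace):
    """Run-length encode the trace as (method, repetitions) pairs."""
    return [(method, len(list(run))) for method, run in groupby(trace)]
```

Applying `prune_frequent` first and `compress` second follows the order in which the slide lists the two operations.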
  9. Background
     Step 3: Textual analysis of method source code
     Textual analysis of source code:
     - Extract terms from source code: identifiers and comments.
     - Split terms by Camel-Case.
     - Remove programming language keywords and English stop words.
     - Perform stemming.
     - Index using the tf-idf indexing mechanism.
     - Apply Latent Semantic Indexing (LSI) to reduce the term-document matrix.
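A minimal Python sketch of part of this pipeline follows: term extraction, Camel-Case splitting, keyword and stop-word removal, tf-idf weighting, and a cosine similarity between two method vectors. Stemming and LSI are deliberately left out because they need an external stemmer and a matrix decomposition. The tiny keyword and stop-word sets, the particular tf-idf variant, and all function names are illustrative assumptions, not the authors' tooling.

```python
import math
import re
from collections import Counter

# Assumption: tiny illustrative lists; a real pipeline would use complete ones.
KEYWORDS = {"public", "private", "void", "int", "return", "new", "class", "static"}
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "is", "in"}


def split_terms(text):
    """Extract lower-case terms from identifiers and comments (Camel-Case aware)."""
    terms = []
    for word in re.findall(r"[A-Za-z]+", text):
        # "drawRectangle" -> draw, Rectangle; "XMLShape" -> XML, Shape
        parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", word)
        terms.extend(part.lower() for part in parts)
    return [t for t in terms if t not in KEYWORDS and t not in STOP_WORDS]


def tf_idf(documents):
    """documents maps a method name to its term list; returns sparse tf-idf vectors."""
    n_docs = len(documents)
    doc_freq = Counter(t for terms in documents.values() for t in set(terms))
    vectors = {}
    for name, terms in documents.items():
        tf = Counter(terms)
        vectors[name] = {
            t: (count / len(terms)) * math.log(n_docs / doc_freq[t])
            for t, count in tf.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0
```

A pairwise similarity computed this way (or on the LSI-reduced vectors) can play the role of the method-to-method similarity used when scoring segments.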
  10. Background
      Step 4: Search-based concept location
      Approach built upon a dynamic programming algorithm to:
      - Compute the exact split of the execution trace into segments.
      - Improve scalability.
  11. Our approach
      Dynamic programming solves a problem by recursively dividing it into overlapping sub-problems. The solution of the problem is obtained by combining the solutions of the sub-problems.
      We compute the quality of the segmentation of a trace split into K segments using the fitness function:

      \[ COH_l = \frac{\sum_{i=\mathrm{begin}(l)}^{\mathrm{end}(l)-1} \sum_{j=i+1}^{\mathrm{end}(l)} \sigma(m_i, m_j)}{(\mathrm{end}(l) - \mathrm{begin}(l) + 1) \cdot (\mathrm{end}(l) - \mathrm{begin}(l)) / 2} \tag{1} \]

      \[ COU_l = \frac{\sum_{i=\mathrm{begin}(l)}^{\mathrm{end}(l)} \sum_{j=1,\; j<\mathrm{begin}(l) \text{ or } j>\mathrm{end}(l)}^{N} \sigma(m_i, m_j)}{(N - (\mathrm{end}(l) - \mathrm{begin}(l) + 1)) \cdot (\mathrm{end}(l) - \mathrm{begin}(l) + 1)} \tag{2} \]

      \[ fitness = \frac{1}{K} \cdot \sum_{i=1}^{K} \frac{COH_i}{COU_i + 1} \tag{3} \]

      Here begin(l) and end(l) are the positions of the first and last methods of segment l, σ(m_i, m_j) is the similarity between methods m_i and m_j, and N is the length of the trace.
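Equations (1) to (3) can be evaluated directly, as in the following Python sketch. It assumes a precomputed similarity matrix `sim` (a list of lists) and segments given as 0-based inclusive `(begin, end)` index pairs. Treating a single-method segment as having cohesion 0 and a segment that covers the whole trace as having coupling 0 are assumptions, since the formulas are undefined (0/0) in those cases.

```python
def cohesion(sim, begin, end):
    """Average similarity over all method pairs inside the segment (Eq. 1)."""
    size = end - begin + 1
    if size < 2:
        return 0.0  # assumption: no pairs, so cohesion is taken as 0
    total = sum(sim[i][j]
                for i in range(begin, end)
                for j in range(i + 1, end + 1))
    return total / (size * (size - 1) / 2)


def coupling(sim, begin, end):
    """Average similarity between segment methods and all other methods (Eq. 2)."""
    n = len(sim)
    size = end - begin + 1
    if size == n:
        return 0.0  # assumption: nothing lies outside the segment
    total = sum(sim[i][j]
                for i in range(begin, end + 1)
                for j in range(n)
                if j < begin or j > end)
    return total / ((n - size) * size)


def fitness(sim, segments):
    """Mean of COH / (COU + 1) over the K segments (Eq. 3)."""
    return sum(cohesion(sim, b, e) / (coupling(sim, b, e) + 1.0)
               for b, e in segments) / len(segments)
```

For a four-method trace in which only methods 0 and 1 and methods 2 and 3 are similar, the split `[(0, 1), (2, 3)]` scores 1.0, while `[(0, 2), (3, 3)]` scores 0.125.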
  12. Our approach
      [Figure: Trace splitting]
      Possibilities when the next method of the trace is processed:
      - a new segment is added, or
      - the method is attached to the last segment of the solution.
      Coh(i) = coh(i-1) + delta
      Cou(i) = cou(i-1) - delta
      delta = sim(m1,m3) + sim(m1,m4)
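The two possibilities (open a new segment, or attach the method to the last segment) suggest a dynamic program over trace prefixes. The Python sketch below is one possible formulation, not necessarily the authors' exact algorithm: it scores every candidate segment once, growing each segment one method at a time so that the cohesion sum is updated by a `delta`, and then finds, for each number of segments k, the split that maximizes the summed segment scores. It assumes a symmetric similarity matrix `sim`, 0-based inclusive segment bounds, and a score of 0 for single-method segments. The names `segment_scores` and `best_segmentation` are hypothetical.

```python
def segment_scores(sim):
    """score[b][e] = COH / (COU + 1) for the segment covering positions b..e."""
    n = len(sim)
    # row_prefix[i][j] = sim[i][0] + ... + sim[i][j - 1]
    row_prefix = [[0.0] * (n + 1) for _ in range(n)]
    for i in range(n):
        for j in range(n):
            row_prefix[i][j + 1] = row_prefix[i][j] + sim[i][j]
    # Similarity of each method to all other methods (diagonal excluded).
    row_total = [row_prefix[i][n] - sim[i][i] for i in range(n)]
    score = [[0.0] * n for _ in range(n)]
    for b in range(n):
        coh_sum = 0.0  # similarity summed over pairs inside the segment
        all_sum = 0.0  # row_total summed over the segment members
        for e in range(b, n):
            # delta: similarity of the newly attached method e to methods b..e-1
            delta = row_prefix[e][e] - row_prefix[e][b]
            coh_sum += delta
            all_sum += row_total[e]
            size = e - b + 1
            cou_sum = all_sum - 2.0 * coh_sum  # pairs with one end outside
            coh = coh_sum / (size * (size - 1) / 2) if size > 1 else 0.0
            cou = cou_sum / ((n - size) * size) if size < n else 0.0
            score[b][e] = coh / (cou + 1.0)
    return score


def best_segmentation(sim):
    """Return (fitness, segments) maximizing the mean segment score."""
    n = len(sim)
    score = segment_scores(sim)
    neg = float("-inf")
    # best[k][i]: best summed score when the first i methods form k segments
    best = [[neg] * (n + 1) for _ in range(n + 1)]
    cut = [[0] * (n + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for k in range(1, n + 1):
        for i in range(k, n + 1):
            for b in range(k - 1, i):  # the last segment covers b..i-1
                candidate = best[k - 1][b] + score[b][i - 1]
                if candidate > best[k][i]:
                    best[k][i], cut[k][i] = candidate, b
    best_k = max(range(1, n + 1), key=lambda k: best[k][n] / k)
    segments, i = [], n
    for k in range(best_k, 0, -1):  # walk the stored cuts backwards
        b = cut[k][i]
        segments.append((b, i - 1))
        i = b
    segments.reverse()
    return best[best_k][n] / best_k, segments
```

Because each segment score is computed once and reused by every prefix that ends with that segment, the search is deterministic, unlike a metaheuristic search, whose runs may produce different splits.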
  13. Empirical study
      - RQ1: How do the performances of the GA and DP approaches compare in terms of fitness values, convergence times, and numbers of segments?
      - RQ2: How do the GA and DP approaches perform in terms of overlaps between the automatic segmentation and the manually-built oracle, i.e., recall?
      - RQ3: How do the precision values of the GA and DP approaches compare when splitting execution traces?
  14. Empirical study
      Table: Data of the empirical study.

      System           Scenario                                       Original size  Cleaned size  Compressed size
      ArgoUML v0.18.1  (1) Start, Create note, Stop                          34,746           821              588
                       (2) Start, Create class, Create note, Stop            64,947         1,066              764
      JHotDraw v1.5    (1) Start, Draw rectangle, Stop                        6,668           447              240
                       (2) Start, Add text, Draw rectangle, Stop             13,841           753              361
                       (3) Start, Draw rectangle, Cut rectangle, Stop        11,215         1,206              414
                       (4) Start, Spawn window, Draw circle, Stop            16,366           670              433
  15. Empirical study
      RQ1 Results
      Compare the GA approach proposed by Asadi et al. [Asadi et al., 2010] with our approach.
      Table: Number of segments, values of fitness function/segmentation score, and times required by the GA and DP approaches.

                          # of Segments     Fitness        Time (s)
      System    Scenario    GA     DP      GA     DP       GA      DP
      ArgoUML   (1)         24     13     0.54   0.58    7,080    2.13
                (2)         73     19     0.52   0.60   10,800    4.33
      JHotDraw  (1)         17     21     0.39   0.67    2,040    0.13
                (2)         21     21     0.38   0.69    1,260    0.64
                (3)         56     20     0.46   0.72    1,200    0.86
                (4)         63     26     0.34   0.69      240    1.00

      The DP approach performs significantly better than the GA approach.
  16. Empirical study
      RQ2 and RQ3 Results
      Table: Jaccard overlaps and precision values between segments identified by the GA and DP approaches.

                                                Jaccard       Precision
      System    Scenario  Feature              GA     DP      GA     DP
      ArgoUML   (1)       Create Note         0.33   0.87    1.00   0.99
                (2)       Create Class        0.26   0.53    1.00   1.00
                (2)       Create Note         0.34   0.56    1.00   1.00
      JHotDraw  (1)       Draw Rectangle      0.90   0.75    0.90   1.00
                (2)       Add Text            0.31   0.33    0.36   0.39
                (2)       Draw Rectangle      0.62   0.52    0.62   1.00
                (3)       Draw Rectangle      0.74   0.24    0.79   0.24
                (3)       Cut Rectangle       0.22   0.31    1.00   1.00
                (4)       Draw Circle         0.82   0.82    0.82   1.00
                (4)       Spawn window        0.42   0.44    1.00   1.00
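As a rough illustration of how such overlap figures can be computed, the Python sketch below compares one automatically identified segment with one oracle segment. It assumes both are represented as sets of trace positions, that the Jaccard overlap is the size of the intersection over the size of the union, and that precision is the share of the automatic segment that falls inside the oracle segment. These definitions and the function names are assumptions; the slides do not spell them out.

```python
def jaccard(segment, oracle):
    """Size of the intersection divided by the size of the union."""
    segment, oracle = set(segment), set(oracle)
    union = segment | oracle
    return len(segment & oracle) / len(union) if union else 0.0


def precision(segment, oracle):
    """Fraction of the automatic segment that lies inside the oracle segment."""
    segment, oracle = set(segment), set(oracle)
    return len(segment & oracle) / len(segment) if segment else 0.0
```

Under these definitions, a segment that lies entirely inside a much larger oracle segment has a precision of 1.0 but a low Jaccard overlap.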
  17. Conclusion
      We reformulated the trace segmentation problem as a dynamic programming (DP) problem:
      - we showed that we can benefit from the overlapping sub-problems;
      - we obtained a dramatic gain in performance by reusing computed scores of intervals and segmentation scores.
      The DP approach reuses pre-computed cohesion and coupling values among subsequent segments of an execution trace, which is not possible using GA.
      Results showed that the DP approach significantly outperformed the GA approach in terms of:
      - the times required to produce the segmentations;
      - the scalability.
  18. Future works
      - Validate the scalability of the DP trace segmentation approach using larger traces.
      - Use more traces obtained from different systems, to verify the generality of our findings.
      - Complement the approach with segment labeling to make the produced segments more suitable for program-comprehension activities.
  19. Thanks!
  20. References
      - Asadi, F., Penta, M. D., Antoniol, G., and Guéhéneuc, Y.-G. (2010). A heuristic-based approach to identify concepts in execution traces. In Proceedings of the European Conference on Software Maintenance and Reengineering, pages 31–40. IEEE Computer Society Press.
      - Gethers, M. and Poshyvanyk, D. (2010). Using relational topic models to capture coupling among classes in object-oriented software systems. In Proceedings of the 2010 IEEE International Conference on Software Maintenance, pages 1–10. IEEE Computer Society.
      - Liu, Y., Poshyvanyk, D., Ferenc, R., Gyimóthy, T., and Chrisochoides, N. (2009). Modeling class cohesion as mixtures of latent topics. In International Conference on Software Maintenance, pages 233–242.
  21. References
      - Lukins, S. K., Kraft, N. A., and Etzkorn, L. H. (2010). Bug localization using latent Dirichlet allocation. Information and Software Technology, 52(9):972–990.