A Survey Of Aspect Mining Approaches
    Presentation Transcript

    • Aspect Mining: an emerging research domain. Kim Mens, http://www.info.ucl.ac.be/~km. Special thanks to Andy Kellens, Paolo Tonella & Kris Gybels. 1
    • Overview ‣ What is aspect mining and why do we need it ? ‣ Different kinds of mining approaches : 1. Early aspects 2. Advanced Browsers 3. Source-code mining ‣ Overview of automated approaches ‣ Comparison of automated code mining techniques ‣ Conclusion: limitations of aspect mining 2
    • Aspects in existing systems? ‣ Aspects offer - Better modularisation and “separation of concerns” - Improved “ilities” ‣ Also useful for already existing systems - But how to migrate a legacy system to AOSD? 3
    • Migrating a System Aspect Mining Aspect Refactoring 4
    • Why do we need aspect mining? ‣ Legacy systems - large, complex systems - not always clearly documented - need to find the crosscutting concerns (what?) - need to find the extent of the crosscutting concerns (where?) - program understanding/documentation 5
    • Overview ‣ What is aspect mining and why do we need it ? ‣ Different kinds of mining approaches : 1. Early aspects 2. Advanced Browsers 3. Source-code mining ‣ Overview of automated approaches ‣ Comparison of automated code mining techniques ‣ Conclusion: limitations of aspect mining 6
    • Different Kinds of Mining Approaches 1.“Early aspect” discovery techniques 2.Advanced special-purpose browsers 3.(Semi-)automated code-level mining techniques 7
    • Early Aspects ‣ Try to find the aspects in artifacts of the early phases of the software life-cycle : - requirements - design ‣ Less useful when trying to identify aspects in legacy code 8
    • Different Kinds of Mining Approaches 1.“Early aspect” discovery techniques 2.Advanced special-purpose browsers 3.(Semi-)automated code-level mining techniques 9
    • Special-Purpose Code Browsers ‣ Idea: help a developer in exploring a concern ‣ By browsing / navigating the source code - User provides interesting “seed” in the code - Tool suggests interesting related points to consider ‣ Iteratively, build up a model of the crosscutting concerns ‣ Examples: - FEAT, Aspect Browser, (Extended) Aspect Mining Tool, Prism, JQuery, Soul 10
    • FEAT ‣ Creates a “Concern Graph”, a representation of the concern that maps back to the source code ‣ Consists of a set of program entities ‣ Start out with a number of program entities (e.g. “all callers of the method print()”) ‣ Iteratively explore the relations and refine the concern graph - fan-in (callers, ...) - fan-out (callees, ...) 11
    • FEAT 12
    • Special-Purpose Code Browsers ‣ Disadvantages : - Require quite some manual effort (does it scale?) ➡ Try to automate this process - Require some preliminary knowledge about the system (need for seeds) 13
    • Different Kinds of Mining Approaches 1.“Early aspect” discovery techniques 2.Advanced special-purpose browsers 3.(Semi-)automated code-level mining techniques 14
    • Automated Approaches ‣ Idea: Semi-automatically find possible aspect candidates in source code of a system ‣ No magic: - Manual filtering of the results - False positives/negatives - May require preliminary knowledge ‣ Possibility to use results as seed for browsers 15
    • Automated Approaches ‣ Hidden assumption of all techniques : - look for symptoms of scattering and tangling - either static (code) or dynamic (run-time) ‣ Discover these symptoms using a variety of data analysis techniques - inspired by data mining, code analysis, ... - but specifically (re)designed for aspect mining 16
    • Overview ‣ What is aspect mining and why do we need it ? ‣ Different kinds of mining approaches : 1. Early aspects 2. Advanced Browsers 3. Source-code mining ‣ Overview of automated approaches ‣ Comparison of automated code mining techniques ‣ Conclusion: limitations of aspect mining 17
    • Some Automated Approaches ‣ Pattern matching • Analysing recurring patterns of execution traces • Clone detection (token/PDG/AST-based) ‣ Formal concept analysis ‣ Execution traces / identifiers ‣ Natural language processing on code ‣ AOP idioms ‣ Fan-in analysis / Detecting unique methods ‣ Cluster analysis ‣ method names / method invocations ‣ ... and many more ... 18
    • Recurring Patterns [Breu&Krinke] ‣ Assumption: if certain method calls always happen before/after the same call, they might be part of an aspect ‣ “Reverse engineering” of manually implemented before/after advice ‣ Technique: find pairs of methods which happen right before/after each other and check whether this pattern recurs in the code 19
    • Recurring Patterns [Breu&Krinke]
    Breu and Krinke propose an aspect mining technique and tool (Dynamic Aspect Mining Tool) [13], which analyses program traces reflecting the run-time behaviour of a system in search of recurring execution patterns. To do so, they introduce the notion of execution relations between method invocations. Consider the following example trace, where the capitals represent method names:

      B() { C() { G() {} H() {} } } A() {}

    They distinguish between four different execution relations: outside-before (B is called before A), outside-after (A is called after B), inside-first (G is the first call in C) and inside-last (H is the last call in C). Using these execution relations, the algorithm discovers aspect candidates based on recurring patterns of method invocations: if an execution relation occurs more than once and uniformly (for instance, every invocation of method B is accompanied by an invocation of method A), it is considered an aspect candidate, with the extra requirement that the recurring relations should be crosscutting, to ensure that the aspect candidates are sufficiently crosscutting.
    ‣ Inherently dynamic (analysis of execution traces), but:
      - static variant: use control-flow graphs to calculate the calling relations
      - hybrid variant: augment dynamic information with static type information to improve results
    ‣ The CCC needs to be rigorously implemented (calls always in the same order) 20
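    The uniform before/after pairing described above can be sketched in a few lines (a simplified, hypothetical illustration with an invented trace, not Breu & Krinke's actual tool): scan a flattened trace for methods whose every recurring occurrence is immediately preceded by the same call.

    ```python
    # Sketch of an "outside-before" recurring-pattern check on an execution
    # trace. Method names and the trace itself are made up for illustration.
    from collections import defaultdict

    def outside_before_candidates(trace):
        """trace: list of method names in invocation order (flattened)."""
        preceded_by = defaultdict(set)
        for prev, curr in zip(trace, trace[1:]):
            preceded_by[curr].add(prev)
        counts = defaultdict(int)
        for m in trace:
            counts[m] += 1
        # (p, m) is a candidate if p is the *only* method ever observed
        # immediately before m, and m recurs (occurs more than once).
        return {(next(iter(ps)), m)
                for m, ps in preceded_by.items()
                if len(ps) == 1 and counts[m] > 1}

    trace = ["lock", "write", "unlock", "lock", "write", "unlock"]
    print(outside_before_candidates(trace))
    ```

    A real implementation would additionally check that the recurring relations are crosscutting, i.e. occur in several unrelated calling contexts.
    
    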
    • Some Automated Approaches ‣ Pattern matching • Analysing recurring patterns of execution traces • Clone detection (token/PDG/AST-based) ‣ Formal concept analysis ‣ Execution traces / identifiers ‣ Natural language processing on code ‣ AOP idioms ‣ Fan-in analysis / Detecting unique methods ‣ Cluster analysis ‣ method names / method invocations ‣ ... and many more ... 21
    • Clone Detection [Bruntink, Shepherd] ‣ Assumption: crosscutting concerns result in code duplication ‣ Certain CCCs are implemented using a recurring pattern, idiom ‣ Technique: find classes of related code - exact same - same structure; different literals 22
    • Clone Detection [Bruntink, Shepherd]
    ‣ source code analysis
    ‣ applicable if the CCCs are implemented as a pattern
    ‣ Example (Figure 3: CCFinder clone covering memory error handling); only the lines marked with ‘C’ are included in the clone:

      M C  if (r != OK)
      M C  {
      M C    ERXA_LOG(r, 0, ("PLXAmem_malloc failure."));
      M C    ERXA_LOG(VSXA_MEMORY_ERR, r,
      M C             ("%s: failed to allocated %d bytes.",
      M              func_name, toread));
      M      r = VSXA_MEMORY_ERR;
      M    }

    ‣ CCFinder allows clones to start and end with little regard to syntactic units; in contrast, Bauhaus’ ccdiml does not allow this, due to its AST-based clone detection algorithm. Furthermore, this clone class does not cover memory error handling code exclusively: the precision obtained for the clone class is roughly 82%. 23
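    The token-based variant mentioned above can be illustrated with a toy detector (a hypothetical sketch in the spirit of token-based tools such as CCFinder, not their actual algorithm): normalise identifiers and literals to placeholders, then look for repeated token windows.

    ```python
    # Toy token-based clone detector: clones match up to renaming of
    # identifiers and literals. The code fragment below is invented.
    import re
    from collections import defaultdict

    def tokens(line):
        # Identifiers -> "ID", numbers -> "NUM", punctuation kept as-is.
        raw = re.findall(r"[A-Za-z_]\w*|\d+|\S", line)
        return tuple("ID" if t[0].isalpha() or t[0] == "_" else
                     "NUM" if t[0].isdigit() else t for t in raw)

    def clone_pairs(lines, window=2):
        seen = defaultdict(list)
        for i in range(len(lines) - window + 1):
            key = tuple(t for l in lines[i:i + window] for t in tokens(l))
            seen[key].append(i)
        # Keep windows whose normalised token sequence occurs more than once.
        return [locs for locs in seen.values() if len(locs) > 1]

    code = ["if (r != OK)", "  log(r, 0);", "if (s != OK)", "  log(s, 1);"]
    print(clone_pairs(code))  # lines 0-1 and 2-3 are clones up to renaming
    ```
    
    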
    • Some Automated Approaches ‣ Pattern matching • Analysing recurring patterns of execution traces • Clone detection (token/PDG/AST-based) ‣ Formal concept analysis ‣ Execution traces / identifiers ‣ Natural language processing on code ‣ AOP idioms ‣ Fan-in analysis / Detecting unique methods ‣ Cluster analysis ‣ method names / method invocations ‣ ... and many more ... 24
    • Formal Concept Analysis • Starts from • a set of elements • a set of properties of those elements • Determines concepts • Maximal groups of elements and properties • Group: • Every element of the concept has those properties • Every property of the concept holds for those elements • Maximal • No other element (outside the concept) has those same properties • No other property (outside the concept) is shared by all elements 25
    • Formal Concept Analysis — example context:

                 | object-oriented | functional | logic | static typing | dynamic typing
      C++        |        X        |     -      |   -   |       X       |       -
      Java       |        X        |     -      |   -   |       X       |       -
      Smalltalk  |        X        |     -      |   -   |       -       |       X
      Scheme     |        -        |     X      |   -   |       -       |       X
      Prolog     |        -        |     -      |   X   |       -       |       X
      26
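    The definition above can be made concrete with a brute-force sketch over the language/property context from the slide (fine for tiny contexts; real FCA tools use far more efficient lattice-building algorithms):

    ```python
    # Brute-force formal concept analysis: a concept is a maximal pair
    # (extent, intent) of objects and the properties they all share.
    from itertools import combinations

    table = {
        "C++":       {"object-oriented", "static typing"},
        "Java":      {"object-oriented", "static typing"},
        "Smalltalk": {"object-oriented", "dynamic typing"},
        "Scheme":    {"functional", "dynamic typing"},
        "Prolog":    {"logic", "dynamic typing"},
    }

    def concepts(table):
        found = set()
        objs = list(table)
        for r in range(1, len(objs) + 1):
            for group in combinations(objs, r):
                # Intent: properties common to the group; extent: closure,
                # i.e. all objects that share that intent (maximality).
                intent = set.intersection(*(table[o] for o in group))
                extent = frozenset(o for o in objs if intent <= table[o])
                found.add((extent, frozenset(intent)))
        return found

    for extent, intent in sorted(concepts(table), key=lambda c: len(c[0])):
        print(sorted(extent), sorted(intent))
    ```

    For instance, {Smalltalk, Scheme, Prolog} with {dynamic typing} comes out as one concept: no other language is dynamically typed, and those three share no further property.
    
    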
    • Formal Concept Analysis 27
    • Identifier Analysis (Mens & Tourwé) ‣ Assumption: methods belonging to the same concern are implemented using similar names ‣ E.g. methods implementing the Undo concern will have “undo” in their signature ‣ Technique: use Formal Concept Analysis with the methods as objects and the different substrings as properties 28
    • Identifier Analysis — example context (methods × identifier substrings):

      Method                                      figure drawing request remove update change event …
      drawingRequestUpdate(DrawingChangeEvent e)    -      X       X       -      X      -     -   …
      figureRequestRemove(FigureChangeEvent e)      X      -       X       X      -      -     -   …
      figureRequestUpdate(FigureChangeEvent e)      X      -       X       -      X      -     -   …
      figureRequestRemove(FigureChangeEvent e)      X      -       X       X      -      -     -   …
      figureRequestUpdate(FigureChangeEvent e)      X      -       X       -      X      -     -   …
      …                                             …      …       …       …      …      …     …   …

    ‣ Lots of concepts: filter irrelevant ones
    ‣ Group “similar” concepts 29
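    The substring step above can be sketched as follows (a crude stand-in for the FCA grouping, with invented method names): split each camelCase name into parts and collect the methods that share a part.

    ```python
    # Identifier-analysis sketch: methods sharing a name part are grouped
    # as potential seeds for a concern. Method names are invented.
    import re
    from collections import defaultdict

    def parts(name):
        # Split camelCase identifiers into lower-cased word parts.
        return {p.lower() for p in re.findall(r"[A-Z]?[a-z]+", name)}

    methods = ["drawingRequestUpdate", "figureRequestRemove",
               "figureRequestUpdate", "undoActivity", "createUndoActivity"]

    groups = defaultdict(set)
    for m in methods:
        for p in parts(m):
            groups[p].add(m)

    # Keep only parts shared by several methods: candidate concern seeds.
    for part, ms in sorted(groups.items()):
        if len(ms) > 1:
            print(part, sorted(ms))
    ```
    
    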
    • Execution Traces (Ceccato & Tonella) ‣ Assumption: use case scenarios are a good indicator for crosscutting concerns ‣ Technique: analyse the execution trace using FCA with use cases as objects and methods called from within a use case as attributes ‣ A concept specific to one use case is a possible aspect if: - Methods belong to more than one class (scattering) - Methods of the same class occur in multiple use cases (tangling) 30
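    The scattering test in this slide can be sketched as a small check over per-use-case traces (hypothetical data and class/method names; the real approach runs FCA over the traces first):

    ```python
    # Dynamic-analysis sketch: methods exclusive to one use case that are
    # spread over several classes are flagged as aspect candidates.
    traces = {  # use case -> methods (Class.method) seen in its trace
        "save":  {"Editor.save", "Logger.log", "File.write"},
        "print": {"Editor.print", "Logger.log"},
    }

    def candidate(methods, use_cases_hit):
        classes = {m.split(".")[0] for m in methods}
        scattered = len(classes) > 1      # implemented in multiple classes
        specific = len(use_cases_hit) == 1  # specific to one use case
        return specific and scattered

    # Methods exclusive to the "save" scenario:
    only_save = traces["save"] - traces["print"]
    print(only_save, candidate(only_save, {"save"}))
    ```
    
    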
    • Some Automated Approaches ‣ Pattern matching • Analysing recurring patterns of execution traces • Clone detection (token/PDG/AST-based) ‣ Formal concept analysis ‣ Execution traces / identifiers ‣ Natural language processing on code ‣ AOP idioms ‣ Fan-in analysis / Detecting unique methods ‣ Cluster analysis ‣ method names / method invocations ‣ ... and many more ... 31
    • Fan-in Analysis (Marin) ‣ Assumption: The fan-in metric is a good indicator of scattering ‣ “Methods which get called often from different contexts are possible crosscutting concerns” ‣ Technique: calculate the fan-in of each method, filter auxiliary methods/accessors and sort the resulting methods 32
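    The three steps of the slide (compute fan-in, filter, sort) can be sketched directly; the call graph and the accessor list here are invented:

    ```python
    # Fan-in sketch: fan-in of a method = number of distinct callers.
    from collections import defaultdict

    calls = [  # (caller, callee) edges of a hypothetical call graph
        ("A.run", "Logger.log"), ("B.init", "Logger.log"),
        ("C.stop", "Logger.log"), ("A.run", "B.getX"),
    ]
    accessors = {"B.getX"}  # auxiliary methods/accessors are filtered out

    fan_in = defaultdict(set)
    for caller, callee in calls:
        fan_in[callee].add(caller)

    # Rank the remaining methods on descending fan-in.
    ranked = sorted(((len(cs), m) for m, cs in fan_in.items()
                     if m not in accessors), reverse=True)
    print(ranked)  # Logger.log, called from three contexts, tops the list
    ```
    
    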
    • Unique Methods (Gybels & Kellens) ‣ Assumption: some CCCs are implemented by calling a single entity from all over the code ‣ E.g. Logging ‣ Technique: devise a metric (unique method) which finds such crosscutting concerns 33
    • Unique Methods (Gybels & Kellens)
    (Diagram: ClassA, ClassB and ClassC all call log() on a Logging class.)
    ‣ Unique Method: a method without a return value which implements a message implemented by no other method.
    ‣ Q: Why no return type? A: a returned value would be used in the base computation. 34
    • Unique Methods (Gybels & Kellens)

      Class                      Selector                           Calls
      Parcel                     #markAsDirty                       23
      ParagraphEditor            #resetTypeIn                       19
      UIPainter                  #broadCastPendingSelectionChange   18
      ControllerCodeRegenerator  #pushPC                            15
      AbstractChangeList         #updateSelection:                  15
      PundleModel                #updateAfterDo:                    10
      ???????????????????????    #beGarbageCollectable              ??
      35
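    The metric from the definition slide can be sketched as a filter plus ranking (all data invented; in the original Smalltalk setting "implemented by exactly one class" is checked over the whole image):

    ```python
    # Unique-method sketch: keep selectors implemented by exactly one class
    # and without a return value, then rank them by number of call sites.
    implementations = {  # selector -> classes implementing it
        "log:": ["Logger"],
        "printOn:": ["Object", "String", "Array"],
    }
    returns_value = {"log:": False, "printOn:": False}
    call_sites = {"log:": 23, "printOn:": 40}

    unique = [s for s, cls in implementations.items()
              if len(cls) == 1 and not returns_value[s]]
    unique.sort(key=lambda s: call_sites[s], reverse=True)
    print(unique)  # only log: qualifies as a unique method
    ```
    
    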
    • Some Automated Approaches ‣ Pattern matching • Analysing recurring patterns of execution traces • Clone detection (token/PDG/AST-based) ‣ Formal concept analysis ‣ Execution traces / identifiers ‣ Natural language processing on code ‣ AOP idioms ‣ Fan-in analysis / Detecting unique methods ‣ Cluster analysis ‣ method names / method invocations ‣ ... and many more ... 36
    • Cluster Analysis ‣ Input: - a set of objects - a distance measure between those objects ‣ Output: - groups of objects which are close (according to the distance function) 37
    • Aspect Mining Using Cluster Analysis (He, Shepherd) ‣ Distance function between two methods ‣ Call Clustering: - distance as a function of the number of times the methods occur together in the same method (cf. recurring execution traces) ‣ Method Clustering: - distance as a function of commonalities in the names of the methods (cf. identifier analysis) 38
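    The method-clustering variant can be sketched with a name-based distance and a naive single-link merge pass (hypothetical method names and threshold; real implementations use proper hierarchical clustering):

    ```python
    # Method-clustering sketch: distance = 1 - Jaccard similarity of the
    # methods' name parts; merge clusters whose members are close enough.
    import re

    def parts(name):
        return {p.lower() for p in re.findall(r"[A-Z]?[a-z]+", name)}

    def dist(a, b):
        pa, pb = parts(a), parts(b)
        return 1 - len(pa & pb) / len(pa | pb)

    methods = ["undoCommand", "redoCommand", "drawFigure"]
    clusters = [[m] for m in methods]
    merged = True
    while merged:  # naive single-link agglomeration with threshold 0.7
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if any(dist(a, b) < 0.7
                       for a in clusters[i] for b in clusters[j]):
                    clusters[i] += clusters.pop(j)
                    merged = True
                    break
            if merged:
                break
    print(clusters)  # the two *Command methods end up in one cluster
    ```

    Call clustering works the same way, with the distance computed from co-occurrence counts in calling contexts instead of name overlap.
    
    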
    • Some Automated Approaches ‣ Pattern matching • Analysing recurring patterns of execution traces • Clone detection (token/PDG/AST-based) ‣ Formal concept analysis ‣ Execution traces / identifiers ‣ Natural language processing on code ‣ AOP idioms ‣ Fan-in analysis / Detecting unique methods ‣ Cluster analysis ‣ method names / method invocations ‣ ... and many more ... 39
    • Overview ‣ What is aspect mining and why do we need it ? ‣ Different kinds of mining approaches : 1. Early aspects 2. Advanced Browsers 3. Source-code mining ‣ Overview of automated approaches ‣ Comparison of automated code mining techniques ‣ Conclusion: limitations of aspect mining 40
    • Comparison
    For each of the studied techniques, Table 2 shows the kind of input data (static or dynamic) analysed by that technique, as well as the kind of analysis performed (token-based or structural/behavioural).

      Technique                | static | dynamic | token-based | structural/behavioural
      Execution patterns       |   X    |    X    |      -      |           X
      Dynamic analysis         |   -    |    X    |      -      |           X
      Identifier analysis      |   X    |    -    |      X      |           -
      Language clues           |   X    |    -    |      X      |           -
      Unique methods           |   X    |    -    |      -      |           X
      Method clustering        |   X    |    -    |      X      |           -
      Call clustering          |   X    |    -    |      -      |           X
      Fan-in analysis          |   X    |    -    |      -      |           X
      Clone detection (PDG)    |   X    |    -    |      -      |           X
      Clone detection (token)  |   X    |    -    |      X      |           -
      Clone detection (AST)    |   X    |    -    |      -      |           X

      Table 2. Kind of input data and kind of analysis of each technique

    Most techniques work on statically available data; few are dynamic. ‘Dynamic analysis’ reasons about execution traces and thus requires executability of the code under analysis. Only ‘Execution patterns’ works with both kinds of input. Some techniques only look at tokens; structural/behavioural analysis takes more information into account. 41
    • Comparison
    Table 3 summarises the finest level of granularity (methods or code fragments) at which each technique works, and the symptoms it looks for.

      Technique            | method | code fragment | scattering | tangling
      Execution patterns   |   X    |       -       |     X      |    -
      Dynamic analysis     |   X    |       -       |     X      |    X
      Identifier analysis  |   X    |       -       |     X      |    -
      Language clues       |   X    |       -       |     X      |    -
      Unique methods       |   X    |       -       |     X      |    -
      Method clustering    |   X    |       -       |     X      |    -
      Call clustering      |   X    |       -       |     X      |    -
      Fan-in analysis      |   X    |       -       |     X      |    -
      Clone detection      |   -    |       X       |     X      |    -

      Table 3. Granularity of and symptoms looked for by each technique

    Few techniques work below the method level; few techniques look for tangling. 42
    • Comparison
    Only ‘Dynamic analysis’ takes both scattering and tangling into account, by requiring that the methods which occur in a single use-case scenario are implemented in multiple classes (scattering), but also that these methods occur in multiple use-cases and thus are tangled with other concerns of the system. To provide more insight into the validation of the techniques, Table 4 mentions the largest case on which each technique has been validated.

      Technique                   | Largest case    | Size of case               | Empirically validated
      Execution patterns          | Graffiti        | 3100 methods/82KLOC        | -
      Dynamic analysis            | JHotDraw        | 2800 methods/18KLOC        | -
      Identifier analysis         | JHotDraw        | 2800 methods/18KLOC        | -
      Language clues              | PetStore        | 10KLOC                     | -
      Unique methods              | Smalltalk image | 3400 classes/66000 methods | -
      Method clustering           | JHotDraw        | 2800 methods/18KLOC        | -
      Call clustering             | Banking example | 12 methods                 | -
      Fan-in analysis             | JHotDraw        | 2800 methods/18KLOC        | -
                                  | TomCat 5.5 API  | 172KLOC                    | -
      Clone detection (PDG)       | TomCat          | 38KLOC                     | -
      Clone detection (AST/token) | ASML C-Code     | 20KLOC                     | X

      Table 4. An assessment of the validation of each technique

    Observations: there is no common benchmark; most cases are no toy examples; there is little empirical validation. 43
    • Comparison
    Each technique makes assumptions about how the crosscutting concerns are implemented. Table 5 summarises these assumptions in terms of preconditions that a system has to satisfy in order to find suitable aspect candidates with a given technique.

      Technique           | Preconditions on crosscutting concerns in the analysed program
      Execution patterns  | Order of calls in the context of the crosscutting concern is always the same.
      Dynamic analysis    | At least one use case exists that exposes the crosscutting concern and another one that does not.
      Identifier analysis | Names of methods implementing the concern are alike.
      Language clues      | Context of the concern contains keywords which are synonyms for the crosscutting concern.
      Unique methods      | Concern is implemented by exactly one method.
      Method clustering   | Names of methods implementing the concern are alike.
      Call clustering     | Concern is implemented by calls to the same methods from different modules.
      Fan-in analysis     | Concern is implemented in a separate method which is called a high number of times, or many methods implementing the concern call the same method.
      Clone detection     | Concern is implemented by reusing a certain code fragment.

      Table 5. What conditions does the implementation of the concerns have to satisfy in order for a technique to find viable aspect candidates?

    In short: the different techniques make different assumptions about the source code. 44
    • Comparison
    ‘Identifier analysis’, ‘Method clustering’, ‘Language clues’ and token-based clone detection assume that the developers rigorously made use of naming conventions when implementing the crosscutting concerns. ‘Execution patterns’ and ‘Call clustering’ assume that methods which often get called together from within different contexts are candidate aspects. The fan-in technique assumes that crosscutting concerns are implemented by methods which are called many times (large footprint), or by methods calling such methods. Table 6 summarises the kind of involvement that is required from the user.

      Technique           | User involvement
      Execution patterns  | Inspection of the resulting “recurring patterns”.
      Dynamic analysis    | Selection of use cases and manual interpretation of results.
      Identifier analysis | Browsing of mined aspects using IDE integration.
      Language clues      | Manual interpretation of resulting lexical chains.
      Unique methods      | Inspection of the unique methods; eased by sorting on importance.
      Method clustering   | Browsing of mined aspects using IDE integration.
      Call clustering     | Manual inspection of resulting clusters.
      Fan-in analysis     | Selection of candidates from a list of methods, sorted on highest fan-in.
      Clone detection     | Browsing and manual interpretation of the discovered clones.

      Table 6. Which kind of user involvement do the different techniques require?

    None of the existing techniques works fully automatically: all require that their users browse through the resulting aspect candidates in order to find suitable aspects. Manual effort is still required. 45
    • Taxonomy
    (Diagram.) The techniques are organised along three dimensions: kind of input data (static vs. dynamic), kind of analysis (token-based vs. structural/behavioural) and granularity (method level vs. method fragments). Method fragments: AST-based, PDG-based and token-based clone detection. Method level, token-based: method clustering, identifier analysis, language clues. Method level, structural/behavioural: unique methods, fan-in analysis, call clustering, concept analysis (static); execution patterns, dynamic analysis (dynamic). 46
    • Overview ‣ What is aspect mining and why do we need it ? ‣ Different kinds of mining approaches : 1. Early aspects 2. Advanced Browsers 3. Source-code mining ‣ Overview of automated approaches ‣ Comparison of automated code mining techniques ‣ Conclusion: limitations of aspect mining 47
    • Summary (1) ‣ Aspect Mining - Identification of crosscutting concerns in legacy systems - Aid developers when migrating an existing application to aspects - Browsing approaches and automated techniques 48
    • Summary (2) ‣ Different techniques are complementary - Different input (dynamic vs. static); granularity - Rely on different assumptions/symptoms - Make use of different analysis techniques • cluster analysis, FCA, heuristics, ... ‣ When mining a system, apply (a combination of) different techniques 49
    • Possible research directions ‣ Quantitative comparison of techniques: - Which aspects can be detected by which technique? - Common benchmark/analysis framework ‣ Cross-fertilization with other research domains - slicing, metrics, ... 50
    • Limitations ‣ There is no “silver bullet” - Still a lot of manual intervention required - False positives / false negatives still remain - Are we really mining for aspects? • or is it “joinpoint mining”? 51
    • Limitations ‣ Joinpoint mining? - how to obtain the pointcut definition? - how to obtain the advice? ‣ Scalability - user involvement ‣ Validation ‣ Granularity 52
    • More reading... ‣ A. KELLENS, K. MENS & P. TONELLA. A Survey of Automated Code- Level Aspect Mining Techniques. Submitted to Transactions on Aspect- Oriented Software Development, Special Issue on Software Evolution. 2006. ‣ M. CECCATO, M. MARIN, K. MENS, L. MOONEN, P. TONELLA & T. TOURWE. Applying and Combining Three Different Aspect Mining Techniques. Software Quality Journal, 14(3) : 209–231. Springer, September 2006. 53