Pitfalls In Aspect Mining


Presentation of paper on "pitfalls in aspect mining" at the Working Conference on Reverse Engineering (WCRE), Antwerp, Belgium, 2008.

The research domain of aspect mining studies the problem of (semi-)automatically identifying potential aspects and crosscutting concerns in a software system, to improve the system’s comprehensibility or enable its migration to an aspect-oriented solution. Unfortunately, most proposed aspect mining techniques have not lived up to their expectations yet. In this paper we provide a list of problems that most aspect mining techniques suffer from and identify some of the root causes underlying these problems. Based upon this analysis, we conclude that many of the problems seem to be caused directly or indirectly by the use of inappropriate techniques, a lack of rigour and semantics on what is being mined for and how, and in how the results of the mining process are presented to the user.

Published in: Science, Technology, Business

Pitfalls In Aspect Mining

  1. Pitfalls in Aspect Mining
     • Pr. Kim Mens, Université catholique de Louvain, B-1348 Louvain-la-Neuve, Belgium, kim.mens@uclouvain.be
     • Dr. Andy Kellens, Vrije Universiteit Brussel, Belgium, akellens@vub.ac.be
     • Dr. Jens Krinke, King’s College London, United Kingdom, krinke@acm.org
     • WCRE 2008, 15th Working Conference on Reverse Engineering, October 15th – 18th, 2008, Antwerp, Belgium
  2. What’s this paper doing here?
     • Reverse engineering is about “recovering information from existing software and systems”
     • WCRE studies innovative methods for extracting such information and ways of using that information for system renovation and program understanding
     • Aspect mining tries to identify potential aspects and crosscutting concerns from existing software systems in order to improve the system's comprehensibility or to enable its migration to an aspect-oriented solution
  3. Why did we write this paper?
     • Partly out of frustration
     • Prior research on aspect mining
       - Co-authored ~8 papers since 2004, including some survey papers
       - Variety of techniques based on FCA, clustering, clone detection, ...
     • No satisfactory results: why?
  4. Our goal
     • Most proposed aspect mining techniques have not lived up to their expectations yet
     • Draw up a list of problems that most aspect mining techniques suffer from
     • Identify root causes underlying these problems
     • Provide suggestions for improvements
     • Moment of reflection on the state of research in aspect mining
       - no big “surprises”
       - provide a broader basis for discussion
  5. Aspects in a nutshell: implementing a notify/listener
     • The OO way

         public abstract class Customer {
             private CustomerID id;
             private Collection listeners;
             public Address getAddress() { return this.address; }
             public void setCustomerID(String id) { this.id = id; notifyListeners(); }
             ...
         }

         public class PrivateCustomer {
             ...
             private String lastName;
             private String firstName;
             ...
             public void setLastName(String name) { this.lastName = name; notifyListeners(); }
             public void setFirstName(String name) { this.firstName = name; notifyListeners(); }
             ...
         }

         public class CorporateCustomer {
             ...
             private String companyName;
             private CompanyName taxNumber;
             ...
             public void setCompanyName(String name) { this.companyName = name; notifyListeners(); }
             public void setTaxNumber(String nr) { this.taxNumber = nr; notifyListeners(); }
             ...
         }

         public class CustomerListener {
             public void notify(Customer modifiedCustomer) {
                 System.out.println("Customer " + modifiedCustomer.getID() + " was modified");
             }
         }

       - tangling: code in one region addresses multiple concerns
       - scattering: code addressing one concern is spread around the system
     • The AO way

         public abstract class Customer {
             private CustomerID id;
             public Address getAddress() { return this.address; }
             public void setCustomerID(String id) { this.id = id; }
             ...
         }

         public class PrivateCustomer {
             ...
             private String lastName;
             private String firstName;
             ...
             public void setLastName(String name) { this.lastName = name; }
             public void setFirstName(String name) { this.firstName = name; }
         }

         public class CorporateCustomer {
             ...
             private String companyName;
             private CompanyName taxNumber;
             ...
             public void setCompanyName(String name) { this.companyName = name; }
             public void setTaxNumber(String nr) { this.taxNumber = nr; }
         }

         public aspect ChangeNotification {
             // pointcut
             pointcut stateUpdate(Customer c) :
                 execution(* Customer.set*(..)) && this(c);

             // advice
             after(Customer c): stateUpdate(c) {
                 for (Iterator iterator = c.listeners.iterator(); iterator.hasNext();) {
                     CustomerListener listener = (CustomerListener) iterator.next();
                     listener.notify(c);
                 }
             }
             ... some interclass definitions here ...
         }

       - clean separation of concerns
  6. Aspect Mining
     • Note: if you want to migrate towards aspects, aspect mining is only the first step.
     • You still need to “extract” the actual aspects from the discovered aspect
     • (Figure labels: aspect 1, aspect 2, aspect 3)
  7. Why Aspect Mining?
     • Legacy systems
       - large, complex systems
       - not always clearly documented
     • Program understanding
       - useful to find crosscutting concerns (what?)
       - useful to find the extent of the crosscutting concerns (where?)
     • First step in migration to an aspect-oriented solution
       - or just to document the crosscutting concerns
  8. How does it work? (mostly)
     • Variety of techniques from data mining, code analysis, reverse engineering
       - specifically redesigned to identify potential aspect candidates in software source code
       - by looking for symptoms of crosscutting concerns (scattering, tangling, code duplication, ...)
     • Semi-automated: manual intervention required to
       - set thresholds, fine-tune filters to apply, ...
       - verify, select and complete reported results
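The idea of looking for a symptom such as scattering, with a user-set threshold and filter, can be sketched as follows. This is a minimal illustration and not one of the techniques the authors evaluated: the class `ScatteringSketch`, the `callsByClass` input (class name mapped to the set of method names called there), the threshold and the blacklist are all hypothetical. It reports every called method that occurs in at least `threshold` classes.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class ScatteringSketch {

    /**
     * Returns the called methods that occur in at least `threshold` distinct
     * classes, ignoring blacklisted names. Hypothetical heuristic, for
     * illustration only.
     */
    public static Set<String> scatteredCalls(Map<String, Set<String>> callsByClass,
                                             int threshold,
                                             Set<String> blacklist) {
        // Count, per called method, the number of classes that call it.
        Map<String, Integer> classCount = new TreeMap<>();
        for (Set<String> calls : callsByClass.values()) {
            for (String callee : calls) {
                if (!blacklist.contains(callee)) {
                    classCount.merge(callee, 1, Integer::sum);
                }
            }
        }
        // Keep only the callees whose scattering reaches the user-chosen threshold.
        Set<String> candidates = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : classCount.entrySet()) {
            if (entry.getValue() >= threshold) {
                candidates.add(entry.getKey());
            }
        }
        return candidates;
    }

    public static void main(String[] args) {
        // Assumed input: call sites already extracted from the Customer example.
        Map<String, Set<String>> calls = Map.of(
            "Customer", Set.of("notifyListeners", "toString"),
            "PrivateCustomer", Set.of("notifyListeners", "toString"),
            "CorporateCustomer", Set.of("notifyListeners"));
        System.out.println(scatteredCalls(calls, 2, Set.of("toString")));
    }
}
```

With an empty blacklist the same call would also report `toString`, a false positive that the user has to filter out by hand. That is the kind of manual tuning the slide refers to.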
  9. Problems with aspect mining
     • Poor precision (at different levels of granularity)
     • Poor recall (at different levels of granularity)
     • Subjectivity
     • Scalability
     • Empirical validation
     • Comparability
     • Composability
  10. Levels of granularity
      • Make sure that you know what you are mining for
        - joinpoints = places in the code that address a particular aspect
        - aspects = what aspects are implemented in the source code
        - crosscutting sorts = all aspects or concerns of a given kind
      • Different techniques may work at different levels of granularity
      • Consequence:
        - difficult to compare
        - difficult to combine
        - technique may not return what you look for
      • Example of joinpoints: all mutators that notify a listener (“change notification aspects”)
      • Example of aspects: change notification, synchronisation, logging
      • Example of a crosscutting sort: contract enforcement = the sort of all aspects that check a common condition for a set of methods. Example of such an aspect: before updating a view, check whether it is necessary to update.
  11. Poor precision and poor recall
      • Precision = relevant candidates ÷ reported candidates
        - Poor precision => false positives => more user involvement
      • Recall = discovered aspects ÷ all aspects
        - Poor recall => false negatives => incomplete results
        - Hard to calculate
      • Recall is inversely correlated with precision
      • Poor precision or recall occurs at different levels of granularity
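The two ratios above can be written out as a small sketch. It assumes that candidates and aspects are identified by plain strings and that the full set of actual aspects is known; the class name `PrecisionRecall` and the sample data are hypothetical. Treating an empty set as a score of 0.0 is a convention chosen here, not something the slides specify.

```java
import java.util.HashSet;
import java.util.Set;

public class PrecisionRecall {

    /** Precision = relevant candidates / reported candidates. */
    public static double precision(Set<String> reported, Set<String> actual) {
        if (reported.isEmpty()) {
            return 0.0; // convention for an empty report
        }
        return (double) truePositives(reported, actual) / reported.size();
    }

    /** Recall = discovered aspects / all aspects. */
    public static double recall(Set<String> reported, Set<String> actual) {
        if (actual.isEmpty()) {
            return 0.0; // convention when there is nothing to find
        }
        return (double) truePositives(reported, actual) / actual.size();
    }

    /** Number of reported candidates that are actual aspects. */
    private static int truePositives(Set<String> reported, Set<String> actual) {
        Set<String> hits = new HashSet<>(reported);
        hits.retainAll(actual);
        return hits.size();
    }

    public static void main(String[] args) {
        Set<String> actual = Set.of("logging", "tracing", "change notification", "synchronisation");
        Set<String> reported = Set.of("logging", "tracing", "getters", "toString");
        // Two of four reported candidates are relevant; two of four aspects are found.
        System.out.println("precision = " + precision(reported, actual)
            + ", recall = " + recall(reported, actual)); // precision = 0.5, recall = 0.5
    }
}
```

Note that `recall` needs the complete `actual` set, which for a real legacy system is usually unknown. This is one reason recall is hard to calculate in practice.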
  12. Example
      • 3 clone detection techniques
      • 5 known aspects
      • 16 KLOC C code
      • Aspects manually annotated by programmer
      • Precision and recall compared to manual annotations
      • Precision over the first n selected clone classes: precision(n) = concernLines(n) ÷ totalLines(n), where concernLines(n) is the number of concern code lines covered by the first n selected clone classes, and likewise totalLines(n) for all covered lines
      • Figure 3: CCFinder clone covering memory error handling. Lines marked with ‘M’ belong to the memory handling concern; only the lines marked with ‘C’ are included in the clone.
      • Table 1: Average precision of each technique for each of the five concerns

        Concern                  AST    Token   PDG
        Memory handling          .65    .63     .81
        Null pointer checking    .99    .97     .80
        Range checking           .71    .59     .42
        Exception handling       .38    .36     .35
        Tracing                  .62    .57     .68
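The line-level precision used in this example, concernLines(n) ÷ totalLines(n) over the first n selected clone classes, can be sketched as below. The sketch assumes each clone class is given as the set of source line numbers it covers and that the manually annotated concern lines are available as a set; the class name `CloneClassPrecision` and the sample line numbers are hypothetical and are not data from the study.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CloneClassPrecision {

    /**
     * precision(n): fraction of the lines covered by the first n clone classes
     * that belong to the concern. Returns 0.0 when nothing is covered.
     */
    public static double precision(List<Set<Integer>> cloneClasses,
                                   Set<Integer> concernLines,
                                   int n) {
        // totalLines(n): all lines covered by the first n clone classes.
        Set<Integer> covered = new HashSet<>();
        for (int i = 0; i < n && i < cloneClasses.size(); i++) {
            covered.addAll(cloneClasses.get(i));
        }
        if (covered.isEmpty()) {
            return 0.0;
        }
        // concernLines(n): the covered lines that are annotated as concern code.
        Set<Integer> concernCovered = new HashSet<>(covered);
        concernCovered.retainAll(concernLines);
        return (double) concernCovered.size() / covered.size();
    }

    public static void main(String[] args) {
        Set<Integer> concern = Set.of(10, 11, 12, 13, 14, 15);
        List<Set<Integer>> classes = List.of(
            Set.of(10, 11, 12, 13, 20),   // 4 of 5 lines are concern code
            Set.of(14, 30, 31));          // adds 1 concern line and 2 others
        System.out.println(precision(classes, concern, 1)); // 0.8
        System.out.println(precision(classes, concern, 2)); // 0.625
    }
}
```

In this made-up input, precision drops as more clone classes are selected, because the second class covers mostly unrelated lines.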
  13. Subjectivity and scalability
      • Subjectivity in interpretation of results
        - Filters, threshold values and blacklists configured by users
        - Ambiguity in interpretation of what is a valid aspect candidate
        - “if it is part of the core functionality, it is not an aspect”, e.g. “Moving Figures” in JHotDraw
      • Scalability can be problematic due to user involvement
        - often many results to be validated / refined by user
        - looking for false positives / completing the aspect seeds
  14. Evaluate, compare and combine
      • Empirical validation
        - no common benchmark
        - subjectivity in interpretation
        - results at different levels of detail and granularity
      • Comparability
        - how to compare the quality of mining techniques?
      • Composability
        - how to combine the results of different mining techniques?
  15. Causes of the problems
      • Inappropriate techniques
        - Too general-purpose
        - Too strong assumptions
        - Too optimistic approaches
        - Scattering versus tangling
        - Lack of use of semantic information
      • Imprecise definition of what is an aspect
      • Inadequate representation of results
  16. Aspect mining problems and causes
      • Table of problems (rows) against causes (columns)
        - Causes: inappropriate techniques (too general purpose, too strong assumptions, too optimistic approaches, no attention to tangling, lack of use of semantic information), imprecise definition, inadequate representation of results
        - Problems: poor precision, poor recall, subjectivity, scalability, empirical validation, comparability, composability
      • What can we learn from this table?
        - Most causes negatively affect either precision, recall or both
        - Poor precision negatively affects scalability: more user involvement
        - Three of the causes account for most problems
        - Only one cause seems specific to aspects
  17. How to improve? (1)
      • Provide a more rigorous definition of aspect
      • Dedicated mining techniques may be more successful than general-purpose ‘one size fits all’ aspect mining techniques
      • Rely on semantics rather than on code structure
        - need for stable semantic foundation
      • Desired quality depends on purpose of mining
        - what is it that you want to do with the mined information?
        - initial understanding vs. migration towards aspects
  18. How to improve? (2)
      • Leave room for variability
      • Look for counter-evidence
      • Look for symptoms of tangling
      • Choose adequate and uniform way of presenting the results
        - enough detail but not too much
      • Combine results of different techniques
      • Provide common framework to compare and evaluate mining techniques
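Combining the results of different techniques can be sketched in its simplest form as set operations. The sketch assumes both techniques report candidates at the same level of granularity (here, method names), which the slides point out is often not the case; the class `CombineResults` and the sample candidate sets are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

public class CombineResults {

    /** Every candidate reported by at least one technique. */
    public static Set<String> union(Set<String> a, Set<String> b) {
        Set<String> result = new TreeSet<>(a);
        result.addAll(b);
        return result;
    }

    /** Only the candidates that both techniques agree on. */
    public static Set<String> intersection(Set<String> a, Set<String> b) {
        Set<String> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }

    public static void main(String[] args) {
        // Assumed outputs of two different mining techniques.
        Set<String> techniqueA = Set.of("notifyListeners", "log", "toString");
        Set<String> techniqueB = Set.of("notifyListeners", "log", "checkAccess");
        System.out.println(union(techniqueA, techniqueB));        // [checkAccess, log, notifyListeners, toString]
        System.out.println(intersection(techniqueA, techniqueB)); // [log, notifyListeners]
    }
}
```

The union never loses a candidate that either technique found, at the cost of carrying over the false positives of both. The intersection leaves fewer candidates for the user to validate but can drop real aspects that only one technique detects.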
  19. Conclusion
      • Most encountered pitfalls not specific to “aspect mining”
        - relevant to any discovery / reverse engineering process
        - especially present in aspect mining due to relative immaturity of domain
        - potential for cross-fertilisation?
      • A word of warning
        - If you want to use aspect mining, don’t apply tools blindly
        - If you want to research aspect mining, still many research opportunities but also a high risk of failure