Natural Language Analysis - Mining Java Class Naming Conventions

Paper: Mining Java Class Naming Conventions

Authors: Simon Butler, Michel Wermelinger, Yijun Yu and Helen Sharp

Session: Research Track 4 - Natural Language Analysis

Published in: Technology, Education

  1. Mining Java Class Naming Conventions
     Simon Butler, Michel Wermelinger, Yijun Yu & Helen Sharp
     Centre for Research in Computing, The Open University
     27 September 2011
     m.a.wermelinger@open.ac.uk
     Butler et al. (The Open University) Mining Java Class Naming Conventions 27 September 2011 1/7
  2. Class Identifier Names
       • Despite the importance of class identifier names, knowledge of their structure is limited
       • The adjective* noun+ approximation has been found to be useful, but not universal
       • What other part-of-speech patterns are commonly used?
       • How are component words repeated? How often?
       • Are there project-specific naming conventions?
     [Figure: class hierarchy containing AbstractCollection, Set, AbstractSet, EnumSet, HashSet and TreeSet]
  3. Distribution of Java Classes in Inheritance Categories
     [Figure: proportion of inheritance categories per project (vertical axis, 0.0 to 0.7) for the categories E0I0, E0I1, E0In, E1I0, E1I1 and E1In]
  4. Part-of-Speech Patterns
     Relative frequency of most common PoS patterns

     Category  noun+  adjective+ noun+  verb+ noun+  noun+ adjective+ noun+
     E0I0      0.85   0.08              0.01         0.01
     E0I1      0.73   0.15              0.02         0.02
     E0In      0.75   0.15              0.03         0.01
     E1I0      0.68   0.12              0.04         0.03
     E1I1      0.70   0.15              0.04         0.02
     E1In      0.75   0.14              0.04         0.02

       • 4 basic patterns account for 90% of class identifier names
       • 85% of E0I0 class identifier names are composed of nouns
       • The adjective* noun+ approximation includes 85% of class identifier names
  5. Component Word Inheritance
     Relative frequency distribution of name inheritance

               Super Class Name    Interface Name
     Category  All    Fragment     All    Fragment    Both
     E0I1      -      -            0.39   0.37        -
     E0In      -      -            0.38   0.40        -
     E1I0      0.23   0.58         -      -           -
     E1I1      0.14   0.53         0.24   0.21        0.27
     E1In      0.11   0.50         0.15   0.25        0.18

       • Fragments of the super class name are most commonly repeated
       • Most common patterns:
           E0I1 & E0In: noun+ interface name, noun+ interface fragment
           E1I0: noun+ super class fragment, noun+ super class name
           E1I1 & E1In: noun+ super class fragment, interface name super class fragment, noun+ super class name
  6. Case Study - FreeMind
       • 652 class identifier names
       • 53 (8%) with uncommon PoS patterns
       • Each class inspected with questions:
           1. Is the class identifier name a clear description of the class?
           2. Can the class identifier name be refactored to a more common PoS pattern?
           3. Can the class be refactored into classes that could be more conventionally named?
     We found:
       • Class identifier names describing GUI actions initiated by the user, e.g. SelectAllAction (verb determiner noun)
       • Class identifier names that conform to local naming conventions
       • 7 class identifier names were candidates for name refactoring
       • 1 class was a candidate for refactoring
  7. Conclusions
     Contributions
       • Identification of common PoS structures found in praxis
       • Identification of common patterns of component word repetition
       • Unconventional class names:
           may conform to local naming conventions
           may be candidates for refactoring
           may indicate smells
     Practical Applications
       • Recovery of class naming conventions
       • Identification of unconventionally named classes
       • Class identifier name recommendation systems
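
The inheritance categories used on slides 3 to 5 (E0I0 through E1In) are not defined in the transcript. The sketch below assumes that E records whether a class directly extends a class other than java.lang.Object, and that I records whether it directly implements zero, one, or several interfaces. The class name `InheritanceCategory` and the method `of` are hypothetical and are not taken from the authors' tooling.

```java
import java.util.AbstractCollection;
import java.util.AbstractSet;
import java.util.HashSet;

/** Hypothetical helper that labels a class with an E/I inheritance category. */
final class InheritanceCategory {

    private InheritanceCategory() {}

    /**
     * Returns a label such as "E0I1" or "E1In".
     * Assumption: E is 1 only when the direct super class is not java.lang.Object,
     * and I counts directly declared interfaces (0, 1, or "n" for two or more).
     */
    static String of(Class<?> type) {
        Class<?> parent = type.getSuperclass();
        boolean extendsClass = parent != null && parent != Object.class;
        int interfaceCount = type.getInterfaces().length;
        String i = interfaceCount == 0 ? "0" : interfaceCount == 1 ? "1" : "n";
        return "E" + (extendsClass ? "1" : "0") + "I" + i;
    }

    public static void main(String[] args) {
        Class<?>[] samples = {
            Object.class, AbstractCollection.class, AbstractSet.class, HashSet.class
        };
        for (Class<?> c : samples) {
            System.out.println(c.getSimpleName() + " -> " + of(c));
        }
    }
}
```

Under these assumptions, AbstractCollection falls into E0I1, AbstractSet into E1I1 and HashSet into E1In, because only directly declared interfaces are counted.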

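
The adjective* noun+ approximation from slides 2 and 4 can be illustrated with a small check: split a class identifier name into component words, tag each word, and match the tag sequence against a regular expression. This is only a sketch. The tiny hand-written lexicon stands in for a real part-of-speech tagger and is an assumption, as are the single-letter tags and the name `PosPatternSketch`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch: tests a class name against the adjective* noun+ pattern. */
final class PosPatternSketch {

    // Toy lexicon (assumption): A = adjective, N = noun, V = verb, D = determiner.
    private static final Map<String, String> LEXICON = Map.of(
        "abstract", "A", "hash", "N", "tree", "N", "enum", "N", "set", "N",
        "collection", "N", "select", "V", "all", "D", "action", "N");

    private PosPatternSketch() {}

    /** Splits a camel-case identifier into lower-case component words. */
    static List<String> splitWords(String name) {
        List<String> words = new ArrayList<>();
        String boundary = "(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])";
        for (String word : name.split(boundary)) {
            words.add(word.toLowerCase(Locale.ROOT));
        }
        return words;
    }

    /** Concatenates one tag per word; words missing from the lexicon become "?". */
    static String tagString(String name) {
        StringBuilder tags = new StringBuilder();
        for (String word : splitWords(name)) {
            tags.append(LEXICON.getOrDefault(word, "?"));
        }
        return tags.toString();
    }

    /** True when the tag sequence is zero or more adjectives followed by nouns. */
    static boolean matchesAdjectiveStarNounPlus(String name) {
        return tagString(name).matches("A*N+");
    }

    public static void main(String[] args) {
        for (String name : List.of("AbstractSet", "HashSet", "SelectAllAction")) {
            System.out.println(name + " " + tagString(name) + " "
                + matchesAdjectiveStarNounPlus(name));
        }
    }
}
```

With this lexicon, AbstractSet and HashSet match the approximation, while SelectAllAction (verb determiner noun) does not, which mirrors the uncommon pattern highlighted in the case study.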
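
The name inheritance table on slide 5 distinguishes class names that repeat the whole super class or interface name ("All") from those that repeat only part of it ("Fragment"). The transcript does not give the exact matching rules, so the sketch below makes an assumption: "All" means the supertype's component words appear as one contiguous run in the class name, and "Fragment" means at least one of them appears. `NameRepetitionSketch` and its members are hypothetical names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch: how much of a supertype name a class name repeats. */
final class NameRepetitionSketch {

    enum Repetition { ALL, FRAGMENT, NONE }

    private NameRepetitionSketch() {}

    /** Splits a camel-case identifier into its component words. */
    static List<String> splitWords(String name) {
        String boundary = "(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])";
        return Arrays.asList(name.split(boundary));
    }

    /**
     * Compares a class name with the name of its super class or interface.
     * Assumption: ALL requires every supertype word, in order and contiguous;
     * FRAGMENT requires at least one shared word.
     */
    static Repetition compare(String className, String superTypeName) {
        List<String> own = splitWords(className);
        List<String> inherited = splitWords(superTypeName);
        if (Collections.indexOfSubList(own, inherited) >= 0) {
            return Repetition.ALL;
        }
        for (String word : inherited) {
            if (own.contains(word)) {
                return Repetition.FRAGMENT;
            }
        }
        return Repetition.NONE;
    }

    public static void main(String[] args) {
        System.out.println(compare("AbstractSet", "Set"));
        System.out.println(compare("HashSet", "AbstractSet"));
        System.out.println(compare("HashSet", "Cloneable"));
    }
}
```

Under this definition, AbstractSet repeats the whole interface name Set, whereas HashSet repeats only the Set fragment of its super class name AbstractSet.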