Recommending Refactoring Operations in Large Software Systems
Graduate Program in Computer Science (Pós-Graduação em Ciência da Computação)
Student: Carlos Eduardo Dantas
Advisor: Marcelo Maia

Introduction
• During its lifecycle, the internal structure of a software system undergoes continuous modification;
• Source code quality often decreases as a result;
• Low quality is generally associated with lower productivity, more rework and more effort for developers.

Refactoring
• The process of changing a software system to improve its internal structure;
• Helps remove code bad smells as well as antipatterns;
• Improves the architecture and makes the software easier to extend.

Challenges in refactoring code
• Design flaws are not always obvious;
• It is not easy to choose the correct refactoring operation to solve a design problem;
• The refactoring solution must be applied without changing the external behaviour of the system;
• Manual refactoring is an error-prone task.

Challenges in refactoring code (continued)
• Refactoring recommendation systems support:
  - Identifying refactoring opportunities;
  - Designing and applying a refactoring solution.

Refactoring operations catalog
• Some refactoring operations, classified by the benefit they bring to the source code:
  - Improving code decomposition
      Extract Class
      Extract Package
      Extract Method
  - Improving names and location of code
      Rename Method (Field)
      Move Method
      Move Class
  - Improving conformance with OOP principles
      Push Down Field/Method
      Pull Up Field/Method
      Extract/Collapse Hierarchy

Algorithms used to identify refactoring recommendations
Refactoring operation | Approach
Extract Class | Clustering-based or graph-based
Move Method | Heuristic-based or search-based
Extract Method | Slicing-based
Extract Package | Graph-based
Move Class | Search-based or heuristic-based
Combination of multiple operations | Search-based
• This presentation discusses only Extract Class refactoring approaches.

Extract Class Refactoring
• In OOP design, a class is called a “God Class”, “System Class” or “Blob Class” when it becomes very large, loses cohesion and exhibits high coupling;
• A class should implement only one concept and have only one reason to change;
• Extract Class splits the responsibilities implemented in a Blob class into separate classes with higher cohesion.
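
As a minimal sketch of what an Extract Class refactoring produces, the example below splits a hypothetical Person class into two classes. The attribute and method names are borrowed from the example used later in this presentation; the method bodies, the sample values and the choice of which members to extract are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class TelephoneNumber:
    """Extracted class: holds only the telephone-related responsibility."""
    office_area_code: str
    office_number: str

    def get_telephone_number(self) -> str:
        return f"({self.office_area_code}) {self.office_number}"


@dataclass
class Person:
    """Original class after the extraction: keeps name and job data."""
    name: str
    job: str
    telephone: TelephoneNumber

    def change_job(self, new_job: str) -> None:
        self.job = new_job

    def modify_name(self, new_name: str) -> None:
        self.name = new_name

    def get_telephone_number(self) -> str:
        # Delegate to the extracted class so that existing callers
        # observe the same external behaviour as before.
        return self.telephone.get_telephone_number()


# Hypothetical sample data.
person = Person("Ana", "engineer", TelephoneNumber("034", "5550100"))
person.change_job("architect")
print(person.job, person.get_telephone_number())
```

Keeping a delegating get_telephone_number on Person is one way to preserve the external behaviour of the system while the responsibility itself moves to the new class.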

Extract Class Refactoring with a clustering-based algorithm
• Clustering methods can identify conceptually meaningful groups of similar entities;
• Clusters may represent cohesive groups of class members;
• An entity set is computed for each attribute (the methods that access it) and for each method (the attributes it accesses).

Clustering-based algorithm: extract entities

Clustering-based algorithm: entity set
Entity | Name | Entity set
a1 | name | {changeJob, modifyName, getTelephoneNumber}
a2 | job | {changeJob, modifyName, getTelephoneNumber}
a3 | officeAreaCode | {getTelephoneNumber}
a4 | officeNumber | {getTelephoneNumber}
m1 | changeJob | {job, name}
m2 | modifyName | {job, name}
m3 | getTelephoneNumber | {job, name, officeAreaCode, officeNumber}
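
The entity sets above can be derived mechanically from a map of which attributes each method accesses. The following is a rough sketch under that assumption. The input dictionary is transcribed from the example, taking the fourth attribute to be officeNumber, and the function name build_entity_sets is hypothetical.

```python
from collections import defaultdict

# Assumed input: the attributes accessed by each method of the class.
method_accesses = {
    "changeJob": {"job", "name"},
    "modifyName": {"job", "name"},
    "getTelephoneNumber": {"job", "name", "officeAreaCode", "officeNumber"},
}


def build_entity_sets(accesses):
    """Return one entity set per method and per attribute.

    A method's entity set holds the attributes it accesses; an attribute's
    entity set holds the methods that access it. Method and attribute
    names are assumed not to collide.
    """
    entity_sets = {method: set(attrs) for method, attrs in accesses.items()}
    by_attribute = defaultdict(set)
    for method, attrs in accesses.items():
        for attr in attrs:
            by_attribute[attr].add(method)
    entity_sets.update(by_attribute)
    return entity_sets


entity_sets = build_entity_sets(method_accesses)
for entity in sorted(entity_sets):
    print(entity, sorted(entity_sets[entity]))
```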

Clustering-based algorithm: Jaccard method
• The Jaccard method calculates the distance between entities from their entity sets, producing a distance matrix.
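
A small sketch of the Jaccard distance, 1 - |A ∩ B| / |A ∪ B|, applied to the attribute entity sets of the example. The helper name and the dictionary layout of the resulting matrix are illustrative choices rather than part of the presented approach.

```python
def jaccard_distance(a: set, b: set) -> float:
    """Jaccard distance; two empty sets are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Entity sets of the four attributes (methods that access each attribute).
entity_sets = {
    "name": {"changeJob", "modifyName", "getTelephoneNumber"},
    "job": {"changeJob", "modifyName", "getTelephoneNumber"},
    "officeAreaCode": {"getTelephoneNumber"},
    "officeNumber": {"getTelephoneNumber"},
}

names = sorted(entity_sets)
# Distance matrix stored as a dictionary keyed by (entity, entity) pairs.
matrix = {
    (x, y): jaccard_distance(entity_sets[x], entity_sets[y])
    for x in names
    for y in names
}

for x in names:
    print(x, [round(matrix[(x, y)], 2) for y in names])
```

Entities with identical entity sets, such as name and job here, end up at distance 0, so they are the first candidates to be grouped together.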

Clustering-based algorithm: cluster dendrogram
• The cluster hierarchy is usually represented by a dendrogram.
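
The merge sequence that a dendrogram depicts can be sketched with a tiny agglomerative clustering loop over the Jaccard distances. This is only an illustration. Single linkage is an assumption made here, the function names are hypothetical, and only the four attributes of the example are clustered.

```python
from itertools import combinations


def jaccard_distance(a, b):
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def agglomerate(entity_sets):
    """Return the merge history as (cluster_a, cluster_b, distance) tuples."""
    clusters = [frozenset([name]) for name in sorted(entity_sets)]
    merges = []

    def linkage(c1, c2):
        # Single linkage (an assumption): distance of the closest pair.
        return min(
            jaccard_distance(entity_sets[x], entity_sets[y])
            for x in c1
            for y in c2
        )

    while len(clusters) > 1:
        # Merge the two closest clusters; ties go to the first pair found.
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        merges.append((sorted(a), sorted(b), linkage(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges


entity_sets = {
    "name": {"changeJob", "modifyName", "getTelephoneNumber"},
    "job": {"changeJob", "modifyName", "getTelephoneNumber"},
    "officeAreaCode": {"getTelephoneNumber"},
    "officeNumber": {"getTelephoneNumber"},
}

merges = agglomerate(entity_sets)
for left, right, distance in merges:
    print(left, "+", right, "at distance", round(distance, 2))
```

Reading the merges from first to last corresponds to reading the dendrogram from the leaves upward. Cutting it before the last merge leaves {job, name} and {officeAreaCode, officeNumber} as two candidate groups.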

Clustering-based algorithm: evaluation
Software | Total suggestions | Assigned names | Applied
eRisk | 37 | 28 | 16
SelfPlanner | 14 | 12 | 9

Graph-based algorithm: class extraction process
• The candidate class is parsed to build a method-by-method matrix;
• The matrix is n x n, where n is the number of methods of the class to be refactored;
• Each entry represents the likelihood that methods mi and mj should be in the same class;
• Chains of strongly related (coupled) methods are then identified.

Graph-based algorithm: class extraction process

Graph-based algorithm: method-by-method matrix
• Structural similarity between methods;
• Call-based dependence between methods;
• Conceptual similarity between methods.
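
One plausible way to turn the three measures above into a single method-by-method matrix is a weighted sum. The sketch below assumes exactly that: the weights (0.5, 0.25, 0.25), the 2 x 2 input matrices and the function name combine are all hypothetical values chosen for illustration, not figures from the evaluation.

```python
def combine(structural, call_based, conceptual, weights):
    """Weighted sum of three equally sized square matrices.

    The weighted-sum form and the requirement that the weights add up
    to 1 are assumptions of this sketch.
    """
    w_structural, w_call, w_conceptual = weights
    if abs(w_structural + w_call + w_conceptual - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    n = len(structural)
    return [
        [
            w_structural * structural[i][j]
            + w_call * call_based[i][j]
            + w_conceptual * conceptual[i][j]
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical measures for two methods.
structural = [[1.0, 0.5], [0.5, 1.0]]
call_based = [[1.0, 0.0], [0.0, 1.0]]
conceptual = [[1.0, 0.1], [0.1, 1.0]]

combined = combine(structural, call_based, conceptual, (0.5, 0.25, 0.25))
for row in combined:
    print([round(value, 3) for value in row])
```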

Graph-based algorithm: method-by-method matrix

Graph-based algorithm: identifying candidate chains
• Constant threshold
• Variable threshold
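
The two thresholding styles can be sketched as follows: entries below the threshold are dropped, and the methods that remain connected form a chain. This is a simplified illustration. Treating chains as connected components, using the median of the off-diagonal entries as the variable threshold, and all function names are assumptions. The sample matrix uses the abbreviations CJ, MN and GT for changeJob, modifyName and getTelephoneNumber.

```python
from statistics import median


def filter_matrix(matrix, threshold):
    """Zero the diagonal and every entry below the threshold."""
    n = len(matrix)
    return [
        [matrix[i][j] if i != j and matrix[i][j] >= threshold else 0.0
         for j in range(n)]
        for i in range(n)
    ]


def find_chains(matrix, names):
    """Group methods that stay connected after filtering."""
    n = len(names)
    seen, result = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            i = stack.pop()
            component.append(names[i])
            for j in range(n):
                if j not in seen and (matrix[i][j] > 0 or matrix[j][i] > 0):
                    seen.add(j)
                    stack.append(j)
        result.append(sorted(component))
    return result


def median_threshold(matrix):
    """Variable threshold (an assumption): median of off-diagonal entries."""
    n = len(matrix)
    return median(matrix[i][j] for i in range(n) for j in range(n) if i != j)


names = ["CJ", "MN", "GT"]
matrix = [
    [1.0, 0.7, 0.2],
    [0.7, 1.0, 0.2],
    [0.2, 0.2, 1.0],
]

constant_chains = find_chains(filter_matrix(matrix, 0.3), names)
variable_chains = find_chains(filter_matrix(matrix, median_threshold(matrix)), names)
print(constant_chains)
print(variable_chains)
```

With the constant threshold of 0.3 the weak links to GT are cut and two chains remain. The median of this particular matrix is 0.2, so the variable threshold keeps every link and yields a single chain.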

Graph-based algorithm: evaluation
• An artificial scenario was created: highly cohesive classes from five open-source systems were merged to build classes with low cohesion and many responsibilities;
• Constant threshold (0.1, 0.2, 0.3, 0.4);
• Variable threshold (Q1, Q2, Q3);
• Interaction between weight and variable threshold on JHotDraw, merging 2 classes;
• Interaction between weight and variable threshold on JHotDraw, merging 3 classes;
• Interaction between weight and system, merging 2 classes;
• Interaction between weight and system, merging 3 classes.

Appendix: another graph-based algorithm example

Method | Instance variables
changeJob (CJ) | {job, name}
modifyName (MN) | {job, name}
getTelephoneNumber (GT) | {job, name, officeAreaCode, officeNumber}

   | CJ  | MN  | GT
CJ | 1   | 1   | 0.5
MN | 1   | 1   | 0.5
GT | 0.5 | 0.5 | 1

Method | Called methods
changeJob (CJ) | {}
modifyName (MN) | {}
getTelephoneNumber (GT) | {}

   | CJ | MN | GT
CJ | 1  | 1  | 0
MN | 0  | 1  | 0
GT | 0  | 0  | 1

Method | Conceptual similarity
changeJob (CJ) | {}
modifyName (MN) | {}
getTelephoneNumber (GT) | {}

   | CJ  | MN  | GT
CJ | 1   | 0.7 | 0.1
MN | 0.7 | 1   | 0.1
GT | 0.1 | 0.1 | 1
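
The values in the first matrix of this appendix are consistent with measuring structural similarity as the share of instance variables two methods have in common (shared variables divided by all variables used by either method). The sketch below checks that reading. The formula is an assumption inferred from the numbers, and the function name is hypothetical.

```python
# Instance variables used by each method, as listed in the appendix.
instance_variables = {
    "CJ": {"job", "name"},
    "MN": {"job", "name"},
    "GT": {"job", "name", "officeAreaCode", "officeNumber"},
}


def structural_similarity(a: set, b: set) -> float:
    """Shared instance variables over all instance variables (assumed formula)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


methods = ["CJ", "MN", "GT"]
similarity = [
    [structural_similarity(instance_variables[x], instance_variables[y])
     for y in methods]
    for x in methods
]

for name, row in zip(methods, similarity):
    print(name, row)
```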

Appendix: another graph-based algorithm example (continued)

   | CJ  | MN  | GT
CJ | 1   | 0.7 | 0.2
MN | 0.7 | 1   | 0.2
GT | 0.2 | 0.2 | 1
