Recommending Refactorings based on Team Co-Maintenance Patterns
Gabriele Bavota, Sebastiano Panichella, Nikolaos Tsantalis, Massimiliano Di Penta, Rocco Oliveto, Gerardo Canfora
9/18/2014
Outline 
Context and Motivations 
- Software Development 
- Team based Refactoring (TBR) for restructuring source code 
Case Study 
- User Study: releases of the Android APIs 
Results 
- Evaluation of TBR and comparison with the state of the art 
Refactoring is… 
“…a disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior.” [Fowler 1999]
Refactoring Sources of Information 
- Refactoring operations need to capture relations between code components.
- Various sources of information have been explored:
  - Structural (static and dynamic);
  - Semantic;
  - Historical.
Structural Information 
- Calls between methods, shared attributes, call relationships occurring during program execution.
Pros:
- precise information;
- easy to capture;
- always available.
Cons:
- may be imprecise;
- may miss some kinds of dependencies.
Semantic Information
- Textual similarity between code components.
Pros:
- easy to capture;
- always available.
Cons:
- assumption that terms are consistently used in the code.
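As a rough illustration of textual similarity between code components, the sketch below compares the terms found in two method bodies with cosine similarity over raw term counts. The slides do not say which similarity measure is used, so the measure, the tokenizer, and the method snippets are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

def terms(source):
    """Split source text into lowercase terms, breaking camelCase identifiers."""
    words = re.findall(r"[A-Za-z][a-z]*", source)
    return Counter(w.lower() for w in words)

def cosine(a, b):
    """Cosine similarity between two term-frequency vectors (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

# Hypothetical method snippets, not taken from the studied projects.
send = terms("sendMessage(message)")
receive = terms("receiveMessage(message)")
draw = terms("drawIcon(icon)")

sim_related = cosine(send, receive)    # shares the term "message"
sim_unrelated = cosine(send, draw)     # no shared terms
print(sim_related, sim_unrelated)
```

The example also shows the stated weakness: the score is only meaningful if developers use terms consistently across the code.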
Historical Information
Pros:
- precise information.
Cons:
- hard to capture.
Fabio Palomba et al., “Detecting bad smells in source code using change history information,” ASE 2013.
Are we Missing other Kinds of Dependencies? 
Software development is a very human-intensive activity…
Social Dimension 
[Figure: time interval considered]
Team-based Refactoring Opportunity 
Extract Class
Our Contribution 
Team Based Refactoring (TBR): information derived from teams to identify refactoring opportunities.
1) Teams identification
2) Detection of refactoring opportunities (e.g., Extract Class refactoring)
Teams Identification 
[Figure: timeline with the time window considered; Class 1, Class 2, Class 3, Class 4]
Ward's hierarchical clustering
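A minimal sketch of how Ward's hierarchical clustering could group developers into teams is shown below. The slides only name the clustering method; the feature representation (per-developer change counts for each class in the time window), the developer names, and the choice of two teams are assumptions made for illustration.

```python
from itertools import combinations

def centroid(vectors):
    """Component-wise mean of a list of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b are merged."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist

def ward_teams(activity, n_teams):
    """Agglomerative clustering of developers using Ward's merge criterion.

    activity: dict developer -> list of change counts, one entry per class.
    """
    clusters = [[dev] for dev in sorted(activity)]
    while len(clusters) > n_teams:
        # Pick the pair of clusters whose merge is cheapest under Ward's criterion.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: ward_cost(
                [activity[d] for d in clusters[p[0]]],
                [activity[d] for d in clusters[p[1]]],
            ),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return [sorted(c) for c in clusters]

# Hypothetical change counts for Class 1..Class 4 within the time window.
activity = {
    "alice": [5, 4, 0, 0],
    "bob": [4, 5, 0, 0],
    "carol": [0, 0, 6, 3],
    "dave": [0, 1, 5, 4],
}
teams = ward_teams(activity, n_teams=2)
print(teams)
```

In practice the number of teams would not be fixed in advance; a cut-off on the dendrogram is one common alternative, which this sketch leaves out.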
Detection of Refactoring Opportunities 
1) Existence of a set of methods owned by a Team
2) Splitting classes with many responsibilities
[Figure: time window considered, with Class 1, Class 2, Class 3, Class 4; Class 2 is split into Class 2 a) and Class 2 b)]
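The two steps above can be sketched as follows: assign each method of a class to the team that changed it most, then propose a split when at least two teams each own a group of methods. The ownership rule, the `min_methods` threshold, and the method and team names are hypothetical; the slides do not specify how ownership is computed.

```python
from collections import defaultdict

def owner_team(changes_by_team):
    """Team with the most changes to a method; ties broken by team name (assumed rule)."""
    return min(changes_by_team, key=lambda t: (-changes_by_team[t], t))

def extract_class_candidates(method_changes, min_methods=2):
    """Group a class's methods by owning team.

    method_changes: dict method -> dict team -> number of changes in the time window.
    Returns team -> sorted methods when at least two teams own min_methods or more
    methods each, otherwise an empty dict (no Extract Class opportunity).
    """
    groups = defaultdict(list)
    for method, by_team in method_changes.items():
        groups[owner_team(by_team)].append(method)
    owned = {t: sorted(ms) for t, ms in groups.items() if len(ms) >= min_methods}
    return owned if len(owned) >= 2 else {}

# Hypothetical change counts for the methods of "Class 2".
class2 = {
    "parse": {"T1": 5, "T2": 1},
    "validate": {"T1": 3},
    "render": {"T2": 4},
    "export": {"T1": 1, "T2": 2},
}
split = extract_class_candidates(class2)
print(split)
```

Here the two groups would correspond to the Class 2 a) and Class 2 b) halves of the split.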
Case Study 
Goal: evaluate the quality of the refactoring solutions identified by TBR
and compare them with those identified by approaches based on more
traditional sources.
Research questions: 
• RQ1: Is the information derived from teams useful to 
identify refactoring opportunities? 
• RQ2: Is the information derived from teams 
complementary to the sources of information typically 
exploited to identify refactoring opportunities? 
Context 
• Objects:
  Project (Android APIs)     Period               KLOC
  framework-opt-telephony    Aug 2011-Jan 2013    73-78
  framework-base             Oct 2008-Jan 2013    534-1,043
  framework-support          Feb 2011-Nov 2012    58-61
  sdk                        Oct 2008-Jan 2013    14-82
  tool-base                  Nov 2012-Jan 2013    80-134
• Subjects: 2 PhD students, 1 industrial developer
RQ1: Is the information derived from teams useful to identify refactoring opportunities?
[Charts: "Useful refactoring solutions" and "Medium/Low perceived effort", each split into YES and NO; value labels 74%, 78%, 26%, 22%]
RQ2: TBR complementarity with other sources typically used for identifying refactoring opportunities
Evaluation: MoJoFM
[Figure: two partitions, C1 and C2, of the elements 1-6, compared by counting the move and join operations needed to turn one into the other]
$\mathrm{MoJoFM}(C_1, C_2) = \left(1 - \frac{mno(C_1, C_2)}{\max(mno(\forall C_i, C_2))}\right) \times 100\%$, where $mno(C_i, C_j)$ is the minimum number of move and join operations needed to transform $C_i$ into $C_j$.
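As a small worked example of the MoJoFM normalization, the sketch below turns a move-and-join count into a percentage. It assumes the standard definition, MoJoFM = (1 - mno / max mno) × 100%, and takes both counts as inputs; computing mno itself requires the MoJo algorithm and is not shown. The numbers are hypothetical and are not the values from the slide's example.

```python
def mojofm(mno, max_mno):
    """MoJoFM score in percent, given the minimum number of move/join
    operations (mno) and the maximum possible mno for the target partition."""
    if max_mno <= 0:
        raise ValueError("max_mno must be positive")
    if not 0 <= mno <= max_mno:
        raise ValueError("mno must lie between 0 and max_mno")
    return (1 - mno / max_mno) * 100.0

# Hypothetical counts: 1 move + 1 join = 2 operations, out of at most 8.
score = mojofm(mno=2, max_mno=8)
print(score)
```

A score of 100% means the two partitions are identical, and lower scores mean more operations are needed to turn one into the other.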
RQ2: TBR complementarity with other sources typically used for identifying refactoring opportunities
[Bar chart: y-axis from 0% to 100%; bars for Structural ∪ Semantic ∪ Historical, Structural, Semantic, and Historical; value labels 70%, 30%, 43%, 35%]
Conclusion 
