Late Propagation in Software Clones

Published in: Technology, Business

Transcript

  • 1. Late Propagation in Software Clones
      Liliane Barbour, Foutse Khomh, and Ying Zou
  • 2. Late Propagation (LP)
      • Definition: An inconsistent change that diverges a clone pair, later followed by a consistent, re-synchronizing change.
      • It can be risky because failure to propagate changes between clones in a clone pair can lead to faults.
      • In our work, we found that 8-21% of genealogies contain a late propagation.
  • 3. LP With Propagation: Example from ArgoUML
        // Clone A, Revision 595
        addField(new UMLComboBox(typeModel), 1, 0, 0);
        // Clone B, Revision 595
        addField(new UMLComboBox(classifierModel), 2, 0, 0);
        // Diverging Change: Clone A, Revision 602
        addField(new UMLComboBoxNavigator(this, "NavClass", new UMLComboBox(typeModel)), 1, 0, 0);
        // Re-synchronizing Change: Clone B, Revision 604
        addField(new UMLComboBoxNavigator(this, "NavClass", new UMLComboBox(classifierModel)), 2, 0, 0);
      [Diagram: timeline of Clone A and Clone B. Revision 595: clone pair; Revision 602: diverging change; Revision 604: re-synchronizing change.]
  • 4. LP Without Propagation: Example from Ant
        // Clone A, Revision 270250
        if (destFile == null) {
            destFile = new File(destDir, file.getName());
        }
        // Clone B, Revision 270250
        if (destFile == null) {
            destFile = new File(destDir, file.getName());
        }
        // Diverging Change: Clone A, Revision 270264
        if (m_destFile == null) {
            m_destFile = new File(m_destDir, m_file.getName());
        }
        // Re-synchronizing Change: Clone A, Revision 271109
        if (destFile == null) {
            destFile = new File(destDir, file.getName());
        }
      [Diagram: timeline of Clone A and Clone B. Revision 270250: clone pair; Revision 270264: diverging change; Revision 271109: re-synchronizing change.]
  • 5. Types of Late Propagation
      Propagation Category              LP Type  Modified During   Modified During the   Modified During
                                                 Diverging Change  Period of Divergence  Re-synchronizing Change
      Propagation Always Occurs         LP1      A                 A                     B
                                        LP2      A                 A and B               B
                                        LP3      A                 A                     A and B
      Propagation May or May Not Occur  LP4      A                 A and B               A
                                        LP5      A                 A and B               A and B
                                        LP6      A and B           A and B               A or B
                                        LP7      A and B           A and B               A and B
      Propagation Never Occurs          LP8      A                 A                     A
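The type table on slide 5 can be read as a lookup from three observations to an LP type. The sketch below is one possible encoding, not the authors' tooling: the class `LpTypeClassifier`, the enum `Touched`, and the method `classify` are hypothetical names, and "A" is assumed to denote the clone modified by the diverging change. Combinations the table does not list yield an empty result.

```java
import java.util.Optional;

public class LpTypeClassifier {
    /** Which clone(s) of the pair were modified (hypothetical encoding of the table cells). */
    public enum Touched { A, B, BOTH }

    /**
     * Maps the three table columns (diverging change, period of divergence,
     * re-synchronizing change) to an LP type, or empty if the table has no such row.
     */
    public static Optional<String> classify(Touched diverging,
                                            Touched duringDivergence,
                                            Touched resync) {
        if (diverging == Touched.A && duringDivergence == Touched.A) {
            switch (resync) {
                case B:    return Optional.of("LP1");
                case BOTH: return Optional.of("LP3");
                default:   return Optional.of("LP8"); // only A changes, so nothing propagates
            }
        }
        if (diverging == Touched.A && duringDivergence == Touched.BOTH) {
            switch (resync) {
                case B:    return Optional.of("LP2");
                case A:    return Optional.of("LP4");
                default:   return Optional.of("LP5");
            }
        }
        if (diverging == Touched.BOTH && duringDivergence == Touched.BOTH) {
            // The table lists "A or B" for LP6 and "A and B" for LP7.
            return Optional.of(resync == Touched.BOTH ? "LP7" : "LP6");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Reading of the ArgoUML example: A diverges, B re-synchronizes.
        System.out.println(classify(Touched.A, Touched.A, Touched.B).orElse("unlisted")); // LP1
        // Reading of the Ant example: A diverges, then A is changed back.
        System.out.println(classify(Touched.A, Touched.A, Touched.A).orElse("unlisted")); // LP8
    }
}
```

Under this reading, the ArgoUML example falls in the "Propagation Always Occurs" category and the Ant example in "Propagation Never Occurs", matching the titles of slides 3 and 4.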
  • 6. Research Questions
      RQ1: Are there different types of LP?
      RQ2: Are some types of LP more fault-prone than others?
      RQ3: Which type of LP experiences the highest proportion of faults?
  • 7. Subject Systems
      System   # LOC  # Revisions  # Gen CCFinder  # LP CCFinder  # Gen Simian  # LP Simian
      ArgoUML  3.1M   18k          14k             1.1k           111           23
      Ant      2.3M   1.0M         30k             4.7k           461           80
  • 8. Our Approach
  • 9. Mining the SVN
      • Use J-Rex to mine the SVN
      • Heuristics used to identify reason for commit (Mockus et al., 2000)
      • Snapshots of all revisions to each Java file are stored in an XML file
      • Test files are removed
  • 10. Clone Detection
      • Contents of each method revision extracted into individual files
      • Perform clone detection once on all snapshots
      • Two existing clone detection tools are used: Simian (text-based) and CCFinder (token-based)
  • 11. Building Clone Genealogies
      • Build clone genealogies using the existing clone list
      • Query the SVN using diff to track changes to each clone in a clone pair over time
      • If a change modifies one of the clones in a clone pair, query the clone list for a matching clone
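Once a genealogy records whether the two clones still match after each change, the LP definition from slide 2 reduces to a pattern check: a diverged state that is later followed by a matching state. The sketch below illustrates that check only. The class `LatePropagationDetector`, the method `containsLatePropagation`, and the boolean-per-change representation are assumptions made for illustration; the slides do not describe the data structures the authors actually used.

```java
import java.util.Arrays;
import java.util.List;

public class LatePropagationDetector {
    /**
     * Each entry says whether the clone pair was still reported as a clone
     * after one change in the genealogy (true = consistent, false = diverged).
     * Returns true if a diverged state is later followed by a consistent one.
     */
    public static boolean containsLatePropagation(List<Boolean> consistentAfterChange) {
        boolean diverged = false;
        for (boolean consistent : consistentAfterChange) {
            if (!consistent) {
                diverged = true;          // inconsistent (diverging) change seen
            } else if (diverged) {
                return true;              // re-synchronizing change after divergence
            }
        }
        return false;                     // never diverged, or never re-synchronized
    }

    public static void main(String[] args) {
        // Hypothetical genealogy: in sync, diverged, diverged, back in sync.
        List<Boolean> genealogy = Arrays.asList(true, false, false, true);
        System.out.println(containsLatePropagation(genealogy)); // true
    }
}
```

A genealogy that diverges and never re-synchronizes returns false here, since the definition requires the later consistent change.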
  • 12. RQ1: Are there different types of LP?
  • 13. RQ1: Are there different types of LP?
      [Bar chart: Breakdown of LP Type by System. Y-axis: Percentage of All LP Occurrences (0% to 80%). X-axis: LP Types (LP1 to LP8). Series: ArgoUML - Simian, ArgoUML - CCFinder, Ant - Simian, Ant - CCFinder.]
      There is representation from multiple types of LP and across all categories of LP.
  • 14. RQ2: Are some types of LP more fault-prone than others?
      Part 1: Is Late Propagation fault-prone?
      Part 2: Are specific types of late propagation more fault-prone?
  • 15. Part 1: Is Late Propagation Fault-prone?
      [Bar chart: LP vs. Non-LP Odds Ratios. Y-axis: Odds Ratio (0 to 4). Bars: Ant - Simian, ArgoUML - CCFinder, Ant - CCFinder.]
      Note: ArgoUML – Simian is omitted because it is not statistically significant.
      In all significant cases, the odds ratio is greater than 1. Therefore, LP genealogies are more fault-prone than non-LP genealogies.
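Slide 15 compares LP and non-LP genealogies through odds ratios. As a reminder of what that number means, here is a minimal sketch assuming the standard 2x2 definition: the odds of a fault among LP genealogies divided by the odds of a fault among non-LP genealogies. The class `OddsRatio` and its parameter names are hypothetical, the counts in `main` are invented for illustration and are not taken from the study, and the sketch does not cover the significance test mentioned on the slide.

```java
import java.util.Locale;

public class OddsRatio {
    /**
     * Odds ratio for a 2x2 table:
     *   (lpFaulty / lpClean) / (nonLpFaulty / nonLpClean)
     * A value above 1 means faults are more likely in the LP group.
     */
    public static double oddsRatio(long lpFaulty, long lpClean,
                                   long nonLpFaulty, long nonLpClean) {
        return ((double) lpFaulty * nonLpClean) / ((double) lpClean * nonLpFaulty);
    }

    public static void main(String[] args) {
        // Made-up counts: 30 of 100 LP genealogies and 20 of 200 non-LP genealogies have faults.
        double or = oddsRatio(30, 70, 20, 180);
        System.out.println(String.format(Locale.ROOT, "%.2f", or)); // 3.86
    }
}
```

With these invented counts the ratio is (30 × 180) / (70 × 20) ≈ 3.86, which would be read as the LP group being more fault-prone than the non-LP group.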
  • 16. Part 2: Are specific types of late propagation more fault-prone?
      [Bar chart: Odds Ratios Between Each LP Type and Non-LP Genealogies. Y-axis: Odds Ratio (0 to 16). X-axis: LP Type (LP1 to LP8). Series: Ant - Simian, ArgoUML - CCFinder, Ant - CCFinder.]
      Note: ArgoUML – Simian is omitted because it is not statistically significant.
  • 17. RQ2 Observations
      • In general, some LP types are not more fault-prone than non-LP genealogies (i.e. odds ratio < 1)
      • Some types that make up a small proportion of LP instances have a very high odds ratio
      • LP7 and LP8 occur frequently but have low odds ratios
      Each type of LP has a different level of fault-proneness.
  • 18. RQ3: Which type of LP experiences the highest proportion of faults?
  • 19. RQ3: Which type of LP experiences the highest proportion of faults?
      [Bar chart: Percentage of Fault Occurrences Broken Down by LP Type. Y-axis: Percentage of Fault Occurrences (0% to 80%). X-axis: LP Type (LP1 to LP8). Series: Ant - Simian, ArgoUML - CCFinder, Ant - CCFinder.]
      Note: ArgoUML – Simian is omitted because it is not statistically significant.
  • 20. RQ3 Observations
      • LP7 and LP8 contribute a large proportion of the faults but have lower odds ratios (RQ2)
          – When faults occur, they occur in large numbers
      • Overall, LP7 and LP8 are the most dangerous, with the other types being system-dependent in their fault-proneness
      The proportion of faults is different for each LP type.
  • 21. Conclusion
      • In general, LP genealogies are more fault-prone than non-LP genealogies
      • LP7 and LP8 are the riskiest, in terms of their fault-proneness and magnitude of faults
          – LP8 contains no propagation of changes
          – LP7 may or may not contain any propagation of changes
      • Fault-proneness and fault occurrence depend on the LP type and are system-dependent
