
Thesis Talk



  1. Using Method Similarity over Versions to Improve Predictions based on Change History
     Bhavya Rawal
  2. The Eclipse Project
     Open source toolkit for designing toolkits
     25 releases so far
     419 packages
     24 million LOC
     Over 380 committers
     8 years of development
     Distributed team
  3. Bug Fixing in Eclipse
     Bug #230 – Interface field correction doesn't offer to create the field
  4. Learning from History
     Change history based (CHB) recommender approaches exploit rich project history [Ying et al., Zimmermann et al., etc.]
     What is a recommendation?
     Programmers who changed also changed…
     Example change pattern in Eclipse:
     {,}
     Both classes are part of the solution for Bug #230
  5. Ying et al.’s CHB approach
     CVS
     Change Sets
       {,, build.xml, …}
       {, …, build.xml…}
       {,, …}
       {,,}
       …
     Frequent Pattern Mining
     Frequent Pattern:
       {,, build.xml}
     Changing
     Programmers who changed also changed and build.xml
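The mining step on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Ying et al.'s tool: it counts co-changing pairs instead of running a full frequent pattern miner, and the file names, the `min_support` value, and both function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_cochange_pairs(change_sets, min_support=2):
    """Count how often each pair of entities changes together.

    Simplification (assumption): only pairs are mined, whereas a
    frequent pattern miner would also report larger itemsets.
    """
    pair_counts = Counter()
    for change_set in change_sets:
        # sorted() gives every unordered pair one canonical key
        for pair in combinations(sorted(set(change_set)), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


def recommend(entity, frequent_pairs):
    """Return the entities that frequently changed together with `entity`."""
    recs = set()
    for a, b in frequent_pairs:
        if a == entity:
            recs.add(b)
        elif b == entity:
            recs.add(a)
    return recs


# Hypothetical change sets; the file names are made up for illustration.
change_sets = [
    {"A.java", "B.java", "build.xml"},
    {"A.java", "B.java", "build.xml", "C.java"},
    {"A.java", "C.java"},
    {"D.java", "B.java"},
]
frequent_pairs = mine_cochange_pairs(change_sets, min_support=2)
print(sorted(recommend("A.java", frequent_pairs)))
# An entity with no history under its current name gets nothing back.
print(recommend("A_renamed.java", frequent_pairs))
```

Because the lookup is keyed on the entity's name, a renamed or moved entity returns an empty set, which is the limitation the following slides discuss.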
  6. Shortcomings of Ying et al.’s Approach
     CVS
     Change Sets
       {,, build.xml, …}
       {, …, build.xml…}
       {,, …}
       {,,}
       …
     Frequent Pattern Mining
     Frequent Pattern:
       {,, build.xml}
     Changing
     Major Refactoring
     Sorry! No recommendations for
  7. Shortcomings of Ying et al.’s Approach
     • Relies on name/location to identify an entity
     • Change in name/location results in loss of history
     • Transformed entities will not result in valid recommendations
     A transformation is a set of operations performed on or using p software entities in a given version, resulting in q software entities in the successive version.
  8. Proposed Solution
     Extending CHB
       • Detect transformations
       • Use the original entity’s history for recommendations
     Compare CHB against Extended CHB (ECHB)
       • We use Ying et al.’s approach as Baseline CHB
     Test CHB and ECHB for different granularities
       • File-level
       • Method-level
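A minimal sketch of the "use the original entity's history" idea, assuming the detection step has already produced a mapping from each transformed entity to its parent. The `parent_of` mapping, the method names, and the choice to fall back only when the direct lookup is empty are all assumptions made for illustration, not details taken from the thesis.

```python
def recommend_chb(entity, cochanges):
    """Baseline lookup: entities that co-changed with `entity`."""
    return set(cochanges.get(entity, ()))


def recommend_echb(entity, cochanges, parent_of):
    """Extended lookup: fall back to the parent entity's history.

    `parent_of` maps a transformed entity to the entity it was derived
    from (assumed to come from a separate detection step).
    """
    recs = recommend_chb(entity, cochanges)
    if not recs and entity in parent_of:
        recs = recommend_chb(parent_of[entity], cochanges)
    return recs - {entity}


# Hypothetical data: the method names and the history are made up.
cochanges = {"Foo.getUpdate": {"Bar.render", "build.xml"}}
parent_of = {"Foo.getBriefUpdate": "Foo.getUpdate"}

# No history exists under the new name, so the baseline returns nothing.
print(recommend_chb("Foo.getBriefUpdate", cochanges))
# The extended lookup reuses the parent's history.
print(sorted(recommend_echb("Foo.getBriefUpdate", cochanges, parent_of)))
```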
  9. Recommender Systems Compared
     Ying et al.’s CHB approach
       • method-level
       • file-level
     Similarity based (SB) approach
       • method-level
       • class-level
     Extended CHB (ECHB) approach
       • method-level
       • file-level
  10. Evaluation of the 6 Approaches
      Generate Frequent Patterns
      Select 20 modification tasks to test quality of approaches
  11. Evaluation of the Approaches
      CHB, SB, ECHB evaluated on real tasks
      Technique
        • Pick modification task and identify solution set
        • For each entity in solution set
          • Use entity as input to obtain recommendations
          • Compare recommendations against solution set
        • Repeat for other modification tasks
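The evaluation technique on this slide could be sketched as the loop below. The recommender is passed in as a plain function, and excluding the input entity from the expected set is an assumption; the slide does not say how the input itself is treated. The lookup table and entity names are hypothetical.

```python
def evaluate_task(solution_set, recommender):
    """Run one modification task, querying with each solution entity in turn.

    Returns one (input, recommendations, expected) triple per input.
    Assumption: the input itself is excluded from the expected set.
    """
    results = []
    for entity in sorted(solution_set):
        recs = set(recommender(entity))
        expected = set(solution_set) - {entity}
        results.append((entity, recs, expected))
    return results


# Hypothetical recommender backed by a fixed lookup table.
table = {"A": {"B", "X"}, "B": {"A"}}


def table_recommender(entity):
    return table.get(entity, set())


for entity, recs, expected in evaluate_task({"A", "B", "C"}, table_recommender):
    print(entity, "hits:", sorted(recs & expected), "expected:", sorted(expected))
```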
  12. Measuring the quality of recommendations
      Evaluated the 3 approaches on Precision, Recall and Throughput
      All entities in a system
      R: Returned recommendations
      S: Correct recommendations
      R ∩ S: Correctly returned recommendations
      Precision = |R ∩ S| / |R|
      Recall = |R ∩ S| / |S|
      Throughput = |Inputs resulting in recomm.| / |Total Inputs|
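The three measures translate directly into set operations. The sketch below follows the formulas on this slide; returning `None` for an empty denominator is an assumption, since the slide does not say how undefined cases are handled, and the entity names are made up.

```python
def precision(returned, correct):
    """|R ∩ S| / |R|; None when nothing was returned."""
    if not returned:
        return None
    return len(returned & correct) / len(returned)


def recall(returned, correct):
    """|R ∩ S| / |S|; None when the solution set is empty."""
    if not correct:
        return None
    return len(returned & correct) / len(correct)


def throughput(recommendations_per_input):
    """Fraction of inputs that produced at least one recommendation."""
    if not recommendations_per_input:
        return None
    non_empty = sum(1 for recs in recommendations_per_input if recs)
    return non_empty / len(recommendations_per_input)


R = {"a", "b", "c", "d"}  # returned recommendations (made-up entities)
S = {"b", "d", "e"}       # correct recommendations
print(precision(R, S))    # 2 of the 4 returned entities are correct
print(recall(R, S))       # 2 of the 3 correct entities were returned
print(throughput([R, set(), {"x"}, set()]))  # 2 of 4 inputs gave results
```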
  13. Replicating Ying et al.’s work (Difficulties)
      • Ying et al.’s tool not available publicly
      • Difficulties in recreating their work
        • Problems in finding solution set
        • counting the number of recommendations
        • technique for finding average precision and recall
  14. Results: CHB File-level vs. Method-level
      Higher precision for method-level CHB
      Higher recall for file-level CHB
      Higher throughput for file-level CHB
  15. Extending CHB
      How do we detect transformations?
      Transformed entities share facts with their parent entity.
      Method Facts: Name, Return Type, Parameters, Callers, Callees
      Version n-1: public ProgressStatus getProgressUpdate(boolean complete, IProgressMonitor monitor)
      Version n: public ProgressStatus getBriefProgressUpdate(IProgressMonitor monitor)
  16. Extending CHB
      Transformed entities share facts with their parent entity.
      Method Facts: Name, Return Type, Parameters, Callers, Callees
      [Diagram labels: Caller A, Callee B, Transformed; Version n-1, Version n]
  17. Detecting Transformations over 2 Versions
      Two pass approach to detect transformations
        • Eliminate unchanged methods
        • Compare remaining methods
      Method-pair similarity based on individual fact similarity
        • Name similarity, Caller similarity, etc.
      Individual fact similarity = |Facts(m1) ∩ Facts(m2)| / |Facts(m1) ∪ Facts(m2)|
      Param(m1): {int, Str}  Param(m2): {int, Str, bool}
      Parameter Similarity = 2/3 = 0.67
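The fact similarity on this slide is an intersection-over-union ratio, and the two passes can be sketched around it as follows. Several choices here are assumptions for illustration only: the method-pair score is an unweighted mean over three set-valued facts, "unchanged" means identical name and facts in both versions, and the dictionary layout and function names are hypothetical.

```python
def fact_similarity(facts1, facts2):
    """|F1 ∩ F2| / |F1 ∪ F2|; defined as 1.0 when both sets are empty."""
    facts1, facts2 = set(facts1), set(facts2)
    union = facts1 | facts2
    if not union:
        return 1.0
    return len(facts1 & facts2) / len(union)


def method_similarity(m1, m2, fact_names=("params", "callers", "callees")):
    """Unweighted mean of per-fact similarities (the weighting is assumed)."""
    scores = [fact_similarity(m1[f], m2[f]) for f in fact_names]
    return sum(scores) / len(scores)


def detect_candidates(old_methods, new_methods):
    """Pass 1 drops unchanged methods; pass 2 scores the remaining pairs."""
    unchanged = {k for k in old_methods if new_methods.get(k) == old_methods[k]}
    old_rest = {k: v for k, v in old_methods.items() if k not in unchanged}
    new_rest = {k: v for k, v in new_methods.items() if k not in unchanged}
    return {
        (o, n): method_similarity(old_rest[o], new_rest[n])
        for o in old_rest
        for n in new_rest
    }


# The parameter example from the slide: 2 shared facts out of 3 in total.
print(round(fact_similarity({"int", "Str"}, {"int", "Str", "bool"}), 2))

# Hypothetical fact tables for two versions; caller/callee names are made up.
shared = {"params": set(), "callers": {"A"}, "callees": set()}
old = {
    "getProgressUpdate": {"params": {"boolean", "IProgressMonitor"},
                          "callers": {"A"}, "callees": {"B"}},
    "run": shared,
}
new = {
    "getBriefProgressUpdate": {"params": {"IProgressMonitor"},
                               "callers": {"A"}, "callees": {"B"}},
    "run": shared,
}
print(detect_candidates(old, new))
```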
  18. SB Approach
      • Transformation detection algorithm as a recommender system
      • Detected similar methods form recommendations
      • File-level SB approach
      • Detect similar classes using method-pair similarity
      • Detected similar classes form recommendations
      • Experiments use versions 2.0 and 2.1 of Eclipse
  19. ECHB Approach
  20. ECHB Approach Results
      • Two variations: Best Match (BM) and Threshold (TH)
      • BM variation provides better results
      • ECHB approach works better for method-level
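The slides name the two variations without defining them. One plausible reading, stated here purely as an assumption, is that Best Match keeps only the single most similar candidate parent while Threshold keeps every candidate whose similarity reaches a cutoff. The function names, the scores, and the 0.5 cutoff below are hypothetical.

```python
def best_match(scores):
    """Return the highest-scoring candidate, or None if there are none."""
    if not scores:
        return None
    return max(scores, key=scores.get)


def above_threshold(scores, threshold=0.5):
    """Return every candidate whose score meets the threshold."""
    return {cand for cand, score in scores.items() if score >= threshold}


# Hypothetical similarity scores of candidate parents for one new method.
scores = {"getProgressUpdate": 0.83, "getStatus": 0.55, "run": 0.10}
print(best_match(scores))
print(sorted(above_threshold(scores, threshold=0.5)))
```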
  21. ECHB Selected Results
      Method-level BM Variation
      Selected inputs based on transformation condition: input exists in Version 2.1 but not in Version 2.0
      ECHB versus CHB Results
        Throughput: 49% versus 14%
        Precision: 9% versus 47%
        Recall: 18% versus 13%
  22. Summary
      CHB approaches do not take transformations into account
      Including transformations can provide better recommendations
      Recreated Ying et al.’s CHB approach for file-level and method-level granularity
      ECHB extends CHB by incorporating transformations
  23. Takeaway
      In the event of a potential transformation, the method-level ECHB approach provides valid recommendations in 35 percentage points more cases than CHB (throughput of 49% versus 14%).
      However, for a given input, CHB provides significantly higher precision and slightly higher recall than ECHB.