An Empirical Study Of Function Clones In Open Source Software
This is a presentation on a research paper whose authors built a tool called NICAD.

Published in: Technology


Transcript

  • 1. An Empirical Study of Function Clones in Open Source Software. Chanchal K. Roy and James R. Cordy, Queen’s University. Presenter: MF Khan
  • 2. Outline
    • Introduction
    • NICAD Overview
    • Experimental Setup
    • Experimental Results
    • Conclusions
    • Discussion
  • 3. Introduction
    • Code Clone/Clone
      • Reusing a code fragment by copying and pasting it, with or without minor modifications
    • Benefits
      • Software Maintenance (Bug detection)
    • History
      • Several detection techniques have been proposed
      • Lack of in-depth comparative studies of cloning across a variety of systems
  • 4. Introduction (Cont)
    • NICAD
      • In-depth study of function cloning in 15+ C and Java systems, including Apache and the Linux kernel
      • Accurate detection of near-miss function clones
      • Focuses on detecting copy/pasted near-miss clones using pretty-printing, code normalization and filtering
      • Lightweight, using simple text-line comparison
      • Capable of detecting clones in very large systems in different languages
  • 5. NICAD Overview
    • Three phases of clone detection
      • Extraction
        • All potential clones are identified and extracted.
        • All functions and methods in C and Java are extracted with their original source coordinates.
      • Comparison (determination of clones)
        • Potential clones are clustered and compared.
        • Pretty-printed potential clones are compared line by line as text, using the longest common subsequence (LCS).
  • 6. NICAD Overview
      • Unique Percentage of Items (UPI)
        • If the UPI for both line sequences is zero or below a certain threshold, the potential clones are considered to be clones.
      • Reporting
        • Results from NICAD are reported as an XML database and as interactive HTML.
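
The comparison step on slides 5 and 6 can be sketched roughly as follows. This is a minimal illustration, not NICAD's own code: the function names are hypothetical, UPI is assumed to be the fraction of lines in one sequence that are not part of the LCS with the other, and "zero or below a certain threshold" is read as "at or below the threshold".

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two lists of lines."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def upi(lines, other):
    """Assumed UPI: fraction of `lines` not covered by the LCS with `other`."""
    if not lines:
        return 0.0
    return (len(lines) - lcs_length(lines, other)) / len(lines)


def is_clone(a, b, threshold):
    # Assumption: both line sequences must have a UPI at or below the threshold.
    return upi(a, b) <= threshold and upi(b, a) <= threshold


# Two pretty-printed functions that differ in one of five lines.
f1 = ["int sum(int a, int b)", "{", "int r = a + b;", "return r;", "}"]
f2 = ["int sum(int a, int b)", "{", "int r = b + a;", "return r;", "}"]

print(upi(f1, f2))            # one unique line out of five
print(is_clone(f1, f2, 0.0))  # not an exact clone
print(is_clone(f1, f2, 0.3))  # a near-miss clone at threshold 0.3
```

Comparing whole lines keeps the check lightweight, which matches the "simple text-line" approach described on slide 4.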
  • 7. Experimental Setup
      • The paper applies NICAD to find function clones in a number of open source systems
      • It then introduces a set of metrics to analyze the results
  • 8. Experimental Setup
      • Subject systems: 10 C and 7 Java systems
  • 9. Clone Definition
    • Non-empty functions of at least 3 LOC
    • In pretty-printed format
    • Different Unique Percentage of Items (UPI) thresholds are used to find exact and near-miss clones
    • E.g.
      • UPI threshold 0.0: only exact clones are found
      • UPI threshold 0.10: two functions are treated as clones when up to 10% of the lines in each are unique
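
As a rough sketch of this clone definition, the snippet below filters candidate functions by size and labels a pair from its two UPI values. The helper names are hypothetical, and stripping blank lines and indentation only stands in for real pretty-printing.

```python
def candidate_functions(functions, min_loc=3):
    """Keep non-empty functions with at least `min_loc` lines of code.

    `functions` maps a function name to its source text. Blank lines are
    dropped and indentation is stripped as a crude stand-in for
    pretty-printing (an assumption made for this sketch).
    """
    kept = {}
    for name, text in functions.items():
        lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
        if len(lines) >= min_loc:
            kept[name] = lines
    return kept


def classify(upi_a, upi_b, threshold):
    """Label a function pair from its two UPI values and a UPI threshold."""
    worst = max(upi_a, upi_b)
    if worst == 0.0:
        return "exact clone"
    if worst <= threshold:
        return "near-miss clone"
    return "not a clone"


functions = {
    "empty": "",
    "tiny": "int f()\n{ return 1; }",
    "sum": "int sum(int a, int b)\n{\n    return a + b;\n}",
}

print(sorted(candidate_functions(functions)))  # only "sum" has 3 or more lines
print(classify(0.0, 0.0, 0.10))
print(classify(0.05, 0.10, 0.10))
print(classify(0.05, 0.20, 0.10))
```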
  • 10. Validation of Clones
    • Validating the detected clones is a two-step process
    • 1: NICAD’s interactive HTML output
      • Gives an overall view of the original source of the clone classes
    • 2: XML output
      • Used to compare, pairwise, the original source of the functions in each clone class
      • Linux diff is used to determine the textual similarity of the original source
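
The pairwise check in step 2 might look something like the sketch below. It uses Python's standard `difflib` module as a stand-in for Linux diff, and the clone class contents and labels are made up for illustration.

```python
import difflib
from itertools import combinations


def compare_clone_class(clone_class):
    """Pairwise-compare the original sources of one clone class.

    `clone_class` maps a label (hypothetical "file:function" strings here)
    to the original source text. Returns a list of
    (label_a, label_b, similarity, diff_text) tuples.
    """
    results = []
    for (name_a, src_a), (name_b, src_b) in combinations(clone_class.items(), 2):
        lines_a = src_a.splitlines()
        lines_b = src_b.splitlines()
        # Line-level similarity in [0, 1]; 1.0 means identical text.
        ratio = difflib.SequenceMatcher(None, lines_a, lines_b).ratio()
        diff = "\n".join(
            difflib.unified_diff(
                lines_a, lines_b, fromfile=name_a, tofile=name_b, lineterm=""
            )
        )
        results.append((name_a, name_b, ratio, diff))
    return results


clone_class = {
    "a.c:max": "int max(int a, int b)\n{\n    return a > b ? a : b;\n}",
    "b.c:max": "int max(int a, int b)\n{\n    return a > b ? a : b;\n}",
    "c.c:max": "int max(int x, int y)\n{\n    return x > y ? x : y;\n}",
}

for name_a, name_b, ratio, diff in compare_clone_class(clone_class):
    print(name_a, name_b, round(ratio, 2))
    if diff:
        print(diff)
```

An empty diff for a pair indicates textually identical original sources.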
  • 11. Metrics and Visualizations
    • Total Cloned Methods (TCM)
      • Gives the overall cloning statistics
    • Files Associated with Clones (FAWC)
      • Measures the overall localization of clones
      • From a software maintenance point of view, a lower value of FAWCP is desirable. Why?
      • Because the clones are then localized to a few specific files and may be easier to maintain
      • Still, one cannot say which files contain the majority of the clones in the system
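
A small sketch of how TCM and FAWC could be computed. The slides do not give the formulas, so the percentage forms (cloned methods over all methods, clone-associated files over all files) are assumptions, as are the data layout and function names.

```python
def total_cloned_methods(methods, cloned):
    """Return (TCM, TCM as a percentage of all methods).

    `methods` is a list of (file, method_name) pairs for the whole system;
    `cloned` is the subset that belongs to some clone class.
    """
    tcm = len(cloned)
    return tcm, 100.0 * tcm / len(methods)


def files_associated_with_clones(methods, cloned):
    """Return (FAWC, FAWC as a percentage of all files with methods)."""
    all_files = {path for path, _ in methods}
    clone_files = {path for path, _ in cloned}
    return len(clone_files), 100.0 * len(clone_files) / len(all_files)


# Hypothetical system: five methods in four files, two of them cloned.
methods = [("a.c", "f"), ("a.c", "g"), ("b.c", "h"), ("c.c", "i"), ("d.c", "j")]
cloned = {("a.c", "f"), ("a.c", "g")}

print(total_cloned_methods(methods, cloned))          # 2 of 5 methods are cloned
print(files_associated_with_clones(methods, cloned))  # 1 of 4 files has clones
```

In this example 40% of the methods are cloned but only 25% of the files are involved, which is the localized situation the slide describes as easier to maintain.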
  • 12. Metrics and Visualizations
    • Cloned Ratio of File for Methods (CRFM)
      • With CRFM we attempt to discover highly cloned files
      • Measured for a particular file (f)
    • Profile of Cloning Locality w.r.t. Methods (PCLM)
      • Kapser and Godfrey distinguish three location-based categories of function clones
      • 1: In the same file 2: In the same directory 3: In different directories
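
The two metrics above can be illustrated with the following sketch. It assumes CRFM is the fraction of a file's methods that are cloned and that PCLM counts clone pairs per location category; paths, names and data are hypothetical.

```python
import posixpath
from collections import Counter


def crfm(methods, cloned, path):
    """Assumed CRFM: fraction of the methods in file `path` that are cloned."""
    in_file = [m for m in methods if m[0] == path]
    if not in_file:
        return 0.0
    return sum(1 for m in in_file if m in cloned) / len(in_file)


def pclm(clone_pairs):
    """Count clone pairs per location category (file, directory, elsewhere)."""
    counts = Counter()
    for (file_a, _), (file_b, _) in clone_pairs:
        if file_a == file_b:
            counts["same file"] += 1
        elif posixpath.dirname(file_a) == posixpath.dirname(file_b):
            counts["same directory"] += 1
        else:
            counts["different directories"] += 1
    return counts


methods = [
    ("src/a.c", "f"), ("src/a.c", "g"), ("src/a.c", "k"), ("src/a.c", "m"),
    ("src/b.c", "h"), ("lib/c.c", "i"),
]
clone_pairs = [
    (("src/a.c", "f"), ("src/a.c", "g")),
    (("src/a.c", "f"), ("src/b.c", "h")),
    (("src/b.c", "h"), ("lib/c.c", "i")),
]
# Every method that appears in some clone pair counts as cloned.
cloned = {m for pair in clone_pairs for m in pair}

print(crfm(methods, cloned, "src/a.c"))  # 2 of the 4 methods in src/a.c
print(dict(pclm(clone_pairs)))
```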
  • 13. Experimental Results
    • 1. More function cloning in open source Java than in C. On average about 15% (7.2% w.r.t. LOC).
    • 2. The effect of increasing UPI is almost identical.
  • 14. Detailed Overview
    • 1. Several of the C systems have <10% cloned functions. The Java systems are consistent in cloning.
  • 15. Clone Associated Files
  • 16. Clone Associated Files
    • FAWC addresses the question of what portion of the files in a system is associated with clones.
    • From a software maintenance point of view, a system with more clones that are associated with only a few files is in some sense better than a system with fewer clones scattered over many files.
  • 17. Profiles of Cloning Density
    • It tells us which files are highly cloned, or which files contain the majority of the clones
    • That means scattered files and more near-miss clones
  • 18. Profile of Cloning Density
    • Assumption: cloned methods in files with a high cloning density have been intentionally copy/pasted.
  • 19. Profile of Cloning Localization
    • The location of a clone pair is a factor in software maintenance
    • Except for Linux, there are no exact clones in (UPI threshold 0.0) in C
    • When the UPI threshold is 0.3, on average 45.9% (49.0% LOC) of clone pairs in C occur.
  • 20. Conclusion
    • NICAD is capable of accurately finding
      • Exact function clones
      • Near-miss function clones
  • 21. Discussion
    • What is the definition of a clone?
    • What is the definition of near-miss clones?
    • Why is Weltab higher in slide 14?
    • What if we use C++ or C#?
    • What will happen if we use a smaller clone granularity, such as begin-end blocks?
  • 22. Thank you.
