Using Compilation/Decompilation to Enhance Clone Detection

  1. IWSC ‘17. Using Compilation/Decompilation to Enhance Clone Detection. Chaiyong Ragkhitwetsagul, Jens Krinke. CREST, University College London, UK.
  2. [Figure: F1 score (axis from 0.5 to 1) of each tool on original (Orig.) and decompiled (Dec.) code. Source: Ragkhitwetsagul et al., 2016.]
      Clone det.: ccfx, deckard, iclones, nicad, simian
      Plag. det.: jplag-java, jplag-text, plaggie, sherlock, simjava, simtext
      Comp.: 7zncd-BZip2, 7zncd-LZMA, 7zncd-LZMA2, 7zncd-Deflate, 7zncd-Deflate64, 7zncd-PPMd, bzip2ncd, gzipncd, icd, ncd-bzlib, ncd-zlib, xz-ncd
      Others: bsdiff, diff, py-difflib, py-fuzzywuzzy, py-jellyfish, py-ngram, py-sklearn
  4. Compiling and Decompiling Clones: What Happens? (Follows "Compiling Clones: What Happens?", O. Kononenko, C. Zhang, and M. W. Godfrey, ICSME ‘14.) Slide labels: Source Code; Existing tools; Missing Source.
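  9. Experimental Framework (C. Ragkhitwetsagul, J. Krinke, CREST, UCL, UK)
      [Diagram, as two paths through the same clone detector:]
      Original path: software -> clone detector -> original clones.
      Decompiled path: software -> compiler -> decompiler -> decompiled software -> clone detector -> decomp. clones -> clone mapper -> decomp. & mapped clones.
      Comparison: original clones and decomp. & mapped clones are split into common clones and disjoint clones; the disjoint clones go to manual investigation.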
  10. Software Systems

      System           Ver.    Original             Decompiled
                               Files    SLOC        Files    SLOC
      JUnit            4.1.3     203      9,777       311     11,233
      JFreeChart       1.5.0     644     96,711       669     85,251
      Apache Tomcat®   9.0     1,688    241,924     2,603    256,974
  11. Tools: javac (compiler), Procyon (decompiler), NiCad (clone detector).

      Tool    Config.   Parameters
      NiCad   Type-1    UPI=0.0, renaming=none
      NiCad   Type-2    UPI=0.0, renaming=consistent
      NiCad   Type-3    UPI=0.3, renaming=consistent
  12. Clone Mapper
      Inputs: the set of methods (M) of the software, m1, m2, m3, m4, …, mn, mo; and the decompiled clone report, whose decompiled clone pairs are DCP1(dm1, dm2), DCP2(dm1, dm3), DCP3(dm2, dm4), …, DCPn(dmm, dmo).
      Output: the decompiled-and-mapped clone report, whose decompiled-and-mapped clone pairs are DCP*1((dm1,dm2),(m1,m2)), DCP*2((dm1,dm3),(m1,m3)), DCP*3((dm2,dm4),(m2,m4)), …, DCP*n((dmm,dmo),(mm,mo)).
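The mapping step on this slide could be sketched roughly as follows. This is only an illustration under stated assumptions: the slides do not say how a decompiled method dm is matched to its original method m, so the sketch assumes a lookup table from decompiled method identifiers to original ones is already available. The names `CloneMapperSketch`, `map`, and the string identifiers are hypothetical and are not taken from the authors' tool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustrative clone mapper: attaches original methods to decompiled clone pairs. */
final class CloneMapperSketch {

    /**
     * Turns each decompiled clone pair (dm_i, dm_j) into a decompiled-and-mapped
     * pair ((dm_i, dm_j), (m_i, m_j)).
     *
     * @param decompiledPairs      two-element arrays of decompiled method ids
     * @param decompiledToOriginal assumed lookup table dm -> m (hypothetical input)
     * @return one entry per resolvable pair: [0] = decompiled pair, [1] = original pair
     */
    static List<String[][]> map(List<String[]> decompiledPairs,
                                Map<String, String> decompiledToOriginal) {
        List<String[][]> mapped = new ArrayList<>();
        for (String[] pair : decompiledPairs) {
            String m1 = decompiledToOriginal.get(pair[0]);
            String m2 = decompiledToOriginal.get(pair[1]);
            if (m1 == null || m2 == null) {
                // No original counterpart known for one side; this sketch skips the pair.
                continue;
            }
            mapped.add(new String[][] { pair, { m1, m2 } });
        }
        return mapped;
    }
}
```

Pairs that cannot be resolved are silently dropped here for brevity; a real mapper might report them instead.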
  13. Common & Disjoint Clone Pairs. [Venn diagram of the Original and Decompiled clone-pair sets: Ccommon is the overlap, Corig-only lies only in Original, Cdecomp-only lies only in Decompiled.]
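The three regions of this diagram correspond to plain set operations, sketched below. The sketch assumes that both reports have already been expressed in terms of original method identifiers (that is, the decompiled report has been mapped) and that a clone pair is unordered; the class `ClonePairSets` and the string key format are hypothetical choices made for the example.

```java
import java.util.HashSet;
import java.util.Set;

/** Illustrative computation of common and disjoint clone-pair sets. */
final class ClonePairSets {

    /** Canonical key for an unordered pair, so that (a, b) and (b, a) compare equal. */
    static String key(String methodA, String methodB) {
        return methodA.compareTo(methodB) <= 0
                ? methodA + " <-> " + methodB
                : methodB + " <-> " + methodA;
    }

    /** Ccommon: pairs reported on both the original and the decompiled code. */
    static Set<String> common(Set<String> original, Set<String> decompiled) {
        Set<String> result = new HashSet<>(original);
        result.retainAll(decompiled);
        return result;
    }

    /** Corig-only: pairs reported only on the original code. */
    static Set<String> origOnly(Set<String> original, Set<String> decompiled) {
        Set<String> result = new HashSet<>(original);
        result.removeAll(decompiled);
        return result;
    }

    /** Cdecomp-only: pairs reported only on the decompiled code. */
    static Set<String> decompOnly(Set<String> original, Set<String> decompiled) {
        Set<String> result = new HashSet<>(decompiled);
        result.removeAll(original);
        return result;
    }
}
```

Each method copies its input before modifying it, so the two reports stay intact and can be reused for all three regions.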
  14. Results
  15. JUnit. [Figure: Venn diagrams of Original and Decompiled clone pairs for Type-1, Type-2, and Type-3. Values shown: 6, 3.]
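  16. JFreeChart. [Figure: Venn diagrams of Original and Decompiled clone pairs for Type-1, Type-2, and Type-3.]
      Original-only pairs: Type-1 48, Type-2 15, Type-3 1.
      Decompiled-only pairs: Type-1 27, Type-2 17, Type-3 3.
      Other values shown (presumably the overlaps): 159, 155, 33.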
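  17. Tomcat. [Figure: Venn diagrams of Original and Decompiled clone pairs for Type-1, Type-2, and Type-3.]
      Original-only pairs: Type-1 141, Type-2 25, Type-3 22.
      Decompiled-only pairs: Type-1 23, Type-2 3, Type-3 1.
      Other values shown (presumably the overlaps): 217, 608, 20.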
  18. Manual Investigation
  19. JFreeChart. [Bar charts: No. of clone pairs, Candidates vs. TP, per clone type.]
      Cforig-only: Type-1 48 candidates, 47 TP; Type-2 15 candidates, 15 TP; Type-3 1 candidate, 1 TP.
      Cfdecomp-only: Type-1 27 candidates, 27 TP; Type-2 17 candidates, 17 TP; Type-3 3 candidates, 3 TP.
  20. Tomcat. [Bar charts: No. of clone pairs, Candidates vs. TP, per clone type.]
      Cforig-only: Type-1 141 candidates, 141 TP; Type-2 25 candidates, 25 TP; Type-3 22 candidates, 22 TP.
      Cfdecomp-only: Type-1 23 candidates, 23 TP; Type-2 3 candidates, 3 TP; Type-3 1 candidate, 1 TP.
  21. Characteristics of Disjoint Clones

      Clone set: Cforig-only. Reasons:
        - Too small after decomp.
        - Too diff. after decomp.
        - Smaller after decomp., higher dissimilarity
        - Unknown
      Clone set: Cfdecomp-only. Reasons:
        - Having deleted/added stmt., type cast, package name
        - Different if-else statements
        - Different loop statements
        - Inner class methods
        - Unknown
  22. JFreeChart. [Bar charts: number of disjoint clone pairs per reason from slide 21, split by T1, T2, T3. Bar labels in the first chart (axis 0 to 50): 5, 11, 32, 6, 9. Bar labels in the second chart (axis 0 to 16): 12, 4, 3, 8, 12, 5, 3.]
  23. Tomcat. [Bar charts: number of disjoint clone pairs per reason from slide 21, split by T1, T2, T3. Bar labels in the first chart (axis 0 to 140): 16, 5, 120, 19, 6. Bar labels in the second chart (axis 0 to 30): 20, 2, 1, 3, 2.]
  24. ORIGINAL

      @Override
      public Range findRangeBounds(XYDataset dataset) {
          if (dataset != null) {
              Range r = DatasetUtilities.findRangeBounds(dataset, false);
              if (r == null) {
                  return null;
              } else {
                  return new Range(r.getLowerBound() + this.yOffset,
                          r.getUpperBound() + this.blockHeight + this.yOffset);
              }
          } else {
              return null;
          }
      }

      @Override
      public Range findDomainBounds(XYDataset dataset) {
          if (dataset == null) {
              return null;
          }
          Range r = DatasetUtilities.findDomainBounds(dataset, false);
          if (r == null) {
              return null;
          }
          return new Range(r.getLowerBound() + this.xOffset,
                  r.getUpperBound() + this.blockWidth + this.xOffset);
      }
  26. DECOMPILED

      @Override
      public Range findDomainBounds(final XYDataset dataset) {
          if (dataset == null) {
              return null;
          }
          final Range r = DatasetUtilities.findDomainBounds(dataset, false);
          if (r == null) {
              return null;
          }
          return new Range(r.getLowerBound() + this.xOffset,
                  r.getUpperBound() + this.blockWidth + this.xOffset);
      }

      @Override
      public Range findRangeBounds(final XYDataset dataset) {
          if (dataset == null) {
              return null;
          }
          final Range r = DatasetUtilities.findRangeBounds(dataset, false);
          if (r == null) {
              return null;
          }
          return new Range(r.getLowerBound() + this.yOffset,
                  r.getUpperBound() + this.blockHeight + this.yOffset);
      }
  27. ORIGINAL

      public void clearRangeMarkers() {
          if (this.backgroundRangeMarkers != null) {
              Set<Integer> keys = this.backgroundRangeMarkers.keySet();
              for (Integer key : keys) {
                  clearRangeMarkers(key);
              }
              this.backgroundRangeMarkers.clear();
          }
          if (this.foregroundRangeMarkers != null) {
              Set<Integer> keys = this.foregroundRangeMarkers.keySet();
              for (Integer key : keys) {
                  clearRangeMarkers(key);
              }
              this.foregroundRangeMarkers.clear();
          }
          fireChangeEvent();
      }

      public void clearRangeMarkers() {
          if (this.backgroundRangeMarkers != null) {
              Set keys = this.backgroundRangeMarkers.keySet();
              Iterator iterator = keys.iterator();
              while (iterator.hasNext()) {
                  Integer key = (Integer) iterator.next();
                  clearRangeMarkers(key.intValue());
              }
              this.backgroundRangeMarkers.clear();
          }
          if (this.foregroundRangeMarkers != null) {
              Set keys = this.foregroundRangeMarkers.keySet();
              Iterator iterator = keys.iterator();
              while (iterator.hasNext()) {
                  Integer key = (Integer) iterator.next();
                  clearRangeMarkers(key.intValue());
              }
              this.foregroundRangeMarkers.clear();
          }
          fireChangeEvent();
      }
  29. DECOMPILED

      public void clearRangeMarkers() {
          if (this.backgroundDomainMarkers != null) {
              final Set<Integer> keys = this.backgroundDomainMarkers.keySet();
              for (final Integer key : keys) {
                  this.clearDomainMarkers(key);
              }
              this.backgroundDomainMarkers.clear();
          }
          if (this.foregroundDomainMarkers != null) {
              final Set<Integer> keys = this.foregroundDomainMarkers.keySet();
              for (final Integer key : keys) {
                  this.clearDomainMarkers(key);
              }
              this.foregroundDomainMarkers.clear();
          }
          this.fireChangeEvent();
      }

      public void clearRangeMarkers() {
          if (this.backgroundRangeMarkers != null) {
              final Set keys = this.backgroundRangeMarkers.keySet();
              for (final Integer key : keys) {
                  this.clearRangeMarkers(key);
              }
              this.backgroundRangeMarkers.clear();
          }
          if (this.foregroundRangeMarkers != null) {
              final Set keys = this.foregroundRangeMarkers.keySet();
              for (final Integer key : keys) {
                  this.clearRangeMarkers(key);
              }
              this.foregroundRangeMarkers.clear();
          }
          this.fireChangeEvent();
      }
  30. Using Compilation/Decompilation to Enhance Clone Detection
      Study on 3 real-world systems: JUnit, JFreeChart, Tomcat.
      Findings:
        1. Clone pairs before and after decompilation are mostly similar for all three clone types.
        2. One can complement the original clone results by incorporating clones after decompilation.
        3. Characteristics of disjoint clones.
      C. Ragkhitwetsagul, J. Krinke. cragkhit.github.io/crjk-iwsc17
