  • 1. Techniques for Detecting and Preventing Copy-and-Paste Errors during Software Development. A Dissertation Proposal by Patricia Jablonski, Engineering Science, Clarkson University, September 5, 2007
  • 2. Outline
    • Copying and pasting code
    • Modifying copy-and-pasted code
    • Our proposed solution (CnP)
    • Our proof of concept (CReN)
    • Demo of CReN
    • Related Eclipse features
    • Evaluation plan
    • Proposed plan
  • 3. Copying and Pasting Code
    • A common form of software reuse
      • Reuse copied code as a template
    • Why copy and paste code?
      • Duplicate code exactly
      • Defer creating an abstraction
      • Experiment and test
    • Results in code clones
      • Multiple similar code fragments
    • What happens when code needs modification?
  • 4. Modifying Copy-and-Pasted Code (1 of 2)
    • Expensive software maintenance
      • Original copied code could be erroneous
      • Changes need to be made to each instance
    • Solutions: clone detection and removal, clone tracking tools
      • Linked editing and simultaneous editing
        • Clones are selected and linked together so that modifications in one clone can be made to all of the clones that it is linked to simultaneously
  • 5. Modifying Copy-and-Pasted Code (2 of 2)
    • Manual modifications can result in undetected errors and unintended inconsistencies
    • Solution: error detection tools
      • CP-Miner tool
        • Uses identifier mapping, “forget-to-change” vs. “change”, and unchanged ratio
      • DECKARD-based tool
        • Uses a count of unique identifiers
    • What about proactive error prevention?
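The unchanged-ratio idea named on this slide can be sketched as follows. This is a minimal illustration, not CP-Miner's code: it assumes the ratio is the fraction of occurrences of one identifier that keep their original name after pasting, and the class name, method names, and the 0.4 threshold in the demo are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of an "unchanged ratio" check; not CP-Miner's actual implementation.
final class UnchangedRatioSketch {

    // Fraction of positions where the pasted identifier still equals the original one.
    static double unchangedRatio(List<String> original, List<String> pasted) {
        if (original.isEmpty() || original.size() != pasted.size()) {
            throw new IllegalArgumentException("lists must be non-empty and the same length");
        }
        int unchanged = 0;
        for (int i = 0; i < original.size(); i++) {
            if (original.get(i).equals(pasted.get(i))) {
                unchanged++;
            }
        }
        return (double) unchanged / original.size();
    }

    // Ratio 0: renamed everywhere. Ratio 1: kept everywhere.
    // A small ratio in between suggests an occurrence the programmer forgot to change.
    static boolean looksForgotten(double ratio, double threshold) {
        return ratio > 0.0 && ratio <= threshold;
    }

    public static void main(String[] args) {
        // One identifier's occurrences in the copied fragment and in the edited paste.
        List<String> original = Arrays.asList("src", "src", "src", "src");
        List<String> pasted = Arrays.asList("dst", "dst", "src", "dst");
        double ratio = unchangedRatio(original, pasted);
        System.out.println(ratio + " suspicious=" + looksForgotten(ratio, 0.4));
    }
}
```

A check like this runs after the edits are finished, which is the "after-the-fact" detection that the question on this slide contrasts with proactive prevention.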
  • 6. Our Proposed Solution (CnP)
    • Provide automated tool support in the IDE
      • Eclipse, Java
    • Improve software quality during development
    • What are the main features of the CnP tool?
      • Tracks & highlights copy-pasted statements
      • Detects inconsistencies based on inferences of the programmer’s intention
        • Inconsistencies are based on inferred rules
    • What is the current status of CnP?
  • 7. Our Proof of Concept (CReN) Design and Implementation (1 of 5)
    • Consistent renaming usage pattern
      • Identifier (for example, variable name) renaming within a copy-and-paste clone
    • Manual renaming can result in inconsistencies
    • What are the main features of the CReN tool?
      • Tracks & highlights copy-pasted statements
      • Automatically renames all instances of an identifier in a group when any one instance in the group is modified
      • The inferred rules can be refined by the programmer
  • 9. Our Proof of Concept (CReN) Design and Implementation (2 of 5)
    • Tracking copy-and-paste clones
      • No clone detection tool or manual selection
      • Clone region: Java file name + clone’s range
    • Obtaining ASTs from clone locations
      • Abstract syntax tree (AST) API in Eclipse
      • AST captures the source code characters & their absolute position in the source code
      • Each ASTNode has starting/ending positions denoting the range of source characters that the node covers
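A clone region as described above (file name plus character range) might be modeled like this. It is a plain-Java sketch with no Eclipse dependency; the class and method names are hypothetical. In the Eclipse JDT API, a node's range would come from `ASTNode.getStartPosition()` and `ASTNode.getLength()`.

```java
// Hypothetical sketch of a tracked clone region; not CReN's actual data structure.
final class CloneRegion {
    final String fileName;
    final int offset;  // absolute character offset of the clone's first character
    final int length;  // number of characters in the clone

    CloneRegion(String fileName, int offset, int length) {
        this.fileName = fileName;
        this.offset = offset;
        this.length = length;
    }

    int end() {
        return offset + length;
    }

    // True if a node covering [nodeStart, nodeStart + nodeLength) lies entirely inside the clone.
    boolean contains(int nodeStart, int nodeLength) {
        return nodeStart >= offset && nodeStart + nodeLength <= end();
    }

    // Region after delta characters are inserted (delta > 0) or removed (delta < 0) at editOffset.
    // Simplification: removals that span a region boundary are not handled.
    CloneRegion afterEdit(int editOffset, int delta) {
        if (editOffset < offset) {
            return new CloneRegion(fileName, offset + delta, length); // edit before: shift
        }
        if (editOffset <= end()) {
            return new CloneRegion(fileName, offset, length + delta); // edit inside: resize
        }
        return this; // edit after the clone: unaffected
    }

    public static void main(String[] args) {
        CloneRegion pasted = new CloneRegion("Example.java", 100, 50);
        CloneRegion moved = pasted.afterEdit(20, 3);
        System.out.println(moved.offset + ".." + moved.end());
    }
}
```

Because the region is recorded at paste time and updated on each edit, no clone detection tool or manual selection is needed to know where the clones are.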
  • 11. Our Proof of Concept (CReN) Design and Implementation (3 of 5)
    • Matching identifiers between clones
      • Determine relationships of identifiers between copy-and-pasted code fragments
      • Identifiers in the copied code are matched with their corresponding identifiers in the pasted code
      • When the code has just been pasted, its contents are identical to the copied fragment, only at a different location
      • Rules are inferred across all clones
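Because freshly pasted code is character-for-character identical to the copied fragment, corresponding identifiers sit at the same relative offset in both clones. The sketch below illustrates that matching step. It is an assumption-laden stand-in: a regular expression plays the role of the identifier nodes an AST would supply (so identifiers inside comments or string literals are not excluded), and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: a regex stands in for the identifier nodes an AST would supply.
final class IdentifierMatcherSketch {

    // An identifier not preceded by another identifier character (so "0x1F" yields nothing).
    private static final Pattern IDENT =
            Pattern.compile("(?<![A-Za-z0-9_$])[A-Za-z_$][A-Za-z0-9_$]*");

    // Partial keyword list, enough for the demo; a real tool would rely on the parser.
    private static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
            "for", "while", "if", "else", "return", "int", "long", "double",
            "boolean", "char", "void", "new", "null", "true", "false"));

    // One identifier and its absolute position in the copied and the pasted clone.
    static final class Match {
        final String name;
        final int copiedPos;
        final int pastedPos;

        Match(String name, int copiedPos, int pastedPos) {
            this.name = name;
            this.copiedPos = copiedPos;
            this.pastedPos = pastedPos;
        }
    }

    // Pairs each identifier in the copied fragment with its twin in the pasted fragment.
    static List<Match> match(String fragment, int copiedOffset, int pastedOffset) {
        List<Match> matches = new ArrayList<>();
        Matcher m = IDENT.matcher(fragment);
        while (m.find()) {
            if (!KEYWORDS.contains(m.group())) {
                matches.add(new Match(m.group(), copiedOffset + m.start(), pastedOffset + m.start()));
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        for (Match x : match("for (int i = 0; i < n; i++) sum += a[i];", 100, 300)) {
            System.out.println(x.name + " " + x.copiedPos + " <-> " + x.pastedPos);
        }
    }
}
```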
  • 13. Our Proof of Concept (CReN) Design and Implementation (4 of 5)
    • Partitioning identifiers into groups
      • Determine relationships of identifiers within copy-and-pasted code fragments
      • Identifiers in the copied and pasted code are partitioned into groups and mapped to each other
      • Defines the group of identifiers that are to be renamed together
      • Want group of identifiers that resolve to the same variable – use binding, if available
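The partitioning step can be sketched as grouping identifier occurrences by a key. In this illustration the key is just a string: with Eclipse bindings available it could be the binding key (`IBinding.getKey()`), and without them the identifier's name is a plausible fallback. That fallback, and every name in the code, is an assumption rather than CReN's documented behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of partitioning identifier occurrences into rename groups.
final class IdentifierGroupsSketch {

    // keys.get(i) identifies the entity that the occurrence at positions.get(i) resolves to.
    static Map<String, List<Integer>> partition(List<String> keys, List<Integer> positions) {
        if (keys.size() != positions.size()) {
            throw new IllegalArgumentException("keys and positions must be the same length");
        }
        // LinkedHashMap keeps groups in order of first appearance.
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        for (int i = 0; i < keys.size(); i++) {
            List<Integer> group = groups.get(keys.get(i));
            if (group == null) {
                group = new ArrayList<>();
                groups.put(keys.get(i), group);
            }
            group.add(positions.get(i));
        }
        return groups;
    }

    public static void main(String[] args) {
        // Identifiers of "for (int i = 0; i < n; i++) sum += a[i];" with their offsets.
        List<String> keys = Arrays.asList("i", "i", "n", "i", "sum", "a", "i");
        List<Integer> positions = Arrays.asList(9, 16, 20, 23, 28, 35, 37);
        System.out.println(partition(keys, positions));
    }
}
```

Each resulting group is the set of occurrences that would be renamed together.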
  • 15. Our Proof of Concept (CReN) Design and Implementation (5 of 5)
    • Refining the inferred rules
      • When the code is initially pasted, the inferred rule assumes that all identifiers that would resolve to the same program entity should be renamed consistently
      • Programmer can choose to exclude the currently renamed identifier from the group (this instance is deleted from the vector)
      • The updated rule is inferred across all clones
    • Let’s see if CReN can detect/prevent errors...
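Consistent renaming with an optional exclusion can be sketched as below. The sketch assumes a group is a list of absolute offsets of one identifier inside the clone and that excluding an instance means removing its offset from that list; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of consistent renaming within one group; not CReN's actual code.
final class GroupRenameSketch {

    // Removes one occurrence from the group so that later renames skip it.
    static void exclude(List<Integer> group, int position) {
        group.remove(Integer.valueOf(position));
    }

    // Replaces oldName with newName at every offset in the group.
    static String renameGroup(String source, List<Integer> group, String oldName, String newName) {
        List<Integer> sorted = new ArrayList<>(group);
        // Work from the last offset to the first so earlier offsets stay valid.
        Collections.sort(sorted, Collections.reverseOrder());
        StringBuilder text = new StringBuilder(source);
        for (int pos : sorted) {
            if (!source.startsWith(oldName, pos)) {
                throw new IllegalStateException("no '" + oldName + "' at offset " + pos);
            }
            text.replace(pos, pos + oldName.length(), newName);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String source = "sum += a[i]; i++;";
        List<Integer> group = new ArrayList<>(Arrays.asList(9, 13)); // both uses of i
        System.out.println(renameGroup(source, group, "i", "j"));
        exclude(group, 13); // the programmer opts this instance out
        System.out.println(renameGroup(source, group, "i", "j"));
    }
}
```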
  • 16. Our Proof of Concept (CReN) Usage and Demonstration
    • Three examples from literature show an inconsistent renaming of identifiers within a copy-and-pasted clone in production code
    • Z. Li, S. Lu, S. Myagmar, and Y. Zhou, “CP-Miner: A Tool for Finding Copy-paste and Related Bugs in Operating System Code”, USENIX-ACM SIGOPS Symposium on Operating Systems Design and Implementation (OSDI), 2004.
    • B. Liblit, A. Aiken, A.X. Zheng, and M.I. Jordan, “Bug Isolation via Remote Program Sampling”, ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2003.
    • L. Jiang, Z. Su, and E. Chiu, “Context-Based Detection of Clone-Related Bugs”, European Software Engineering Conference (ESEC) and ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE), 2007.
  • 19. Demo of CReN
    • Demonstrate how CReN would catch each identifier renaming error in the examples as if they were currently being written
    • (Some) CReN future work
      • Consistent renaming of any kind of identifier
      • Allow “undo” of taking identifier out of group
      • Consistent renaming in a user-defined scope
      • Apply renaming across all related clones
    • How are other Eclipse features related to CReN?
  • 20. Related Eclipse Features
    • Find & Replace
      • Text-based search, manually started
      • Not limited to within a code fragment
    • Rename Refactoring
      • Automatically applies to the whole project
      • Binding is important for it to work
    • Linked Renaming
      • Like Rename Refactoring, but applies only within the current file
    • What are our next steps in our research?
  • 21. Evaluation Plan
    • We tested CReN with the three examples
    • We plan to perform controlled experiments
      • Give a homework assignment to students
      • Require them to use Eclipse & CnP plug-in
      • Have them write a suitable application
    • We plan to evaluate in terms of:
      • Usefulness, usability (user error), user experience, accuracy (false negatives & false positives), performance
    • What is our plan after CReN is fully evaluated?
  • 22. Proposed Plan
    • Determine usage patterns by using clone detection tools
    • What other kinds of errors could CnP handle?
      • Lexical/naming pattern inconsistencies
        • Substring is the same on both sides of =
        • Naming pairs like left/right, top/bottom
      • Type inconsistencies
        • Inferences can be made about types at the same positions across clones
    • Improve the management and visualization of clones
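One way the naming-pair check might look is sketched below. It is a guess at the kind of rule the slide describes, not a design from the proposal: an assignment is flagged when one side uses only one word of a pair such as left/right and the other side uses only the opposite word. The names are hypothetical, and the substring test is deliberately naive.

```java
import java.util.Locale;

// Hypothetical sketch of a lexical naming-pair check for a single assignment.
final class NamingPairSketch {

    // first and second are the lowercase words of a pair, for example "left" and "right".
    // Simplification: splits at the first '=' and does not recognize "==" or "<=".
    static boolean mixesPair(String assignment, String first, String second) {
        int eq = assignment.indexOf('=');
        if (eq < 0) {
            return false;
        }
        String lhs = assignment.substring(0, eq).toLowerCase(Locale.ROOT);
        String rhs = assignment.substring(eq + 1).toLowerCase(Locale.ROOT);
        return (onlyUses(lhs, first, second) && onlyUses(rhs, second, first))
                || (onlyUses(lhs, second, first) && onlyUses(rhs, first, second));
    }

    // True if the side mentions word but not its counterpart.
    private static boolean onlyUses(String side, String word, String other) {
        return side.contains(word) && !side.contains(other);
    }

    public static void main(String[] args) {
        System.out.println(mixesPair("rightWidth = other.leftWidth;", "left", "right"));
        System.out.println(mixesPair("rightWidth = other.rightWidth;", "left", "right"));
    }
}
```

A result of true is only a hint for the programmer to review, since mixing the two words of a pair is sometimes intended.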
  • 23. Conclusion
    • Copy-and-paste will remain a common programming practice, which can result in undetected errors
    • Error detection and prevention should happen during software development, not only “after-the-fact”
    • So far, we have implemented CReN, one of three parts of the proposed CnP tool
      • Automatic tracking of copy-and-paste clones
      • Consistent renaming of identifiers within copy-and-paste clones
  • 24. Questions / Comments
  • 25. Extra Slides (CReN Demo Screen Shots)