4. Clone detection tools detect
C&P after it is performed
Source Code → Clone Detection → Code Clones
5. There exists no large scale C&P
study on developers
Controlled Experiment
Small number
Experienced only
6. Larger scale study exists on
regular users
• Regular computer users
• Non-Software development tasks
A large scale C&P study is needed for software
development tasks
7. Eclipse Usage Data Collector
(UDC) enables a large scale
C&P study
20 Months
>1 Million Users
8. How to detect C&P in Eclipse
UDC
User ID   What       Kind      …   Description
104526    Executed   Command   …   org.eclipse.ui.edit.copy
User performs Copy
9. How to detect C&P in Eclipse
UDC
User ID   What       Kind      …   Description
104526    Executed   Command   …   org.eclipse.ui.edit.copy
104526    Executed   Command   …   org.eclipse.ui.edit.paste
User performs Paste
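The detection idea on these two slides can be sketched in a few lines of Python. This is only an illustration, not the study's actual tooling: the column names `userId`, `what`, `kind`, and `description` are assumptions based on the table headers shown, the elided columns are left out, and the third sample row (a save command) is made up to show that other commands are ignored.

```python
import csv
import io
from collections import Counter, defaultdict

# Command identifiers shown on the slides.
COPY_COMMAND = "org.eclipse.ui.edit.copy"
PASTE_COMMAND = "org.eclipse.ui.edit.paste"

# Assumed column names and layout; the real UDC export may differ.
# The last row is an invented example of an unrelated command.
SAMPLE_CSV = """userId,what,kind,description
104526,executed,command,org.eclipse.ui.edit.copy
104526,executed,command,org.eclipse.ui.edit.paste
104526,executed,command,org.eclipse.ui.file.save
"""


def count_copy_paste(rows):
    """Count executed copy and paste commands per user ID."""
    counts = defaultdict(Counter)
    for row in rows:
        # Only executed commands are relevant for C&P detection.
        if row["what"].lower() != "executed" or row["kind"].lower() != "command":
            continue
        if row["description"] == COPY_COMMAND:
            counts[row["userId"]]["copy"] += 1
        elif row["description"] == PASTE_COMMAND:
            counts[row["userId"]]["paste"] += 1
    return {user: dict(counter) for user, counter in counts.items()}


if __name__ == "__main__":
    reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
    print(count_copy_paste(reader))
```

Matching on the full command identifier rather than a substring avoids counting unrelated commands whose names merely contain "copy" or "paste".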
10. Our study focuses on users
who frequently and actively use
Eclipse
Create Development Sessions → Find Active Sessions → Find Frequent Users
13. Average number of C&P per hour is
different from recent studies
Our finding: 2.73
Previous finding: 16
Heavy Editing Sessions (#Commands > Average #Commands + 1 Standard Deviation): 11.39
Very Heavy Editing Sessions (#Commands > Average #Commands + 2 Standard Deviations): 13.18
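The two session thresholds above (average number of commands plus one or two standard deviations) can be illustrated with a short sketch. The function name, the use of the population standard deviation, and the sample counts are all assumptions made for illustration; the slides do not say which standard deviation variant was used.

```python
from statistics import mean, pstdev


def find_heavy_sessions(command_counts):
    """Return indices of heavy and very heavy editing sessions.

    A session is heavy if its command count exceeds the average plus one
    standard deviation, and very heavy if it exceeds the average plus two.
    Very heavy sessions are therefore also heavy sessions.
    """
    average = mean(command_counts)
    # Assumption: population standard deviation over all sessions.
    deviation = pstdev(command_counts)
    heavy = [i for i, n in enumerate(command_counts) if n > average + deviation]
    very_heavy = [i for i, n in enumerate(command_counts) if n > average + 2 * deviation]
    return heavy, very_heavy


if __name__ == "__main__":
    # Hypothetical per-session command counts (not data from the study).
    counts = [10] * 16 + [30, 50]
    print(find_heavy_sessions(counts))
```

In this made-up sample the average is about 13.3 commands and the standard deviation is 10, so the session with 30 commands is heavy and the session with 50 commands is both heavy and very heavy.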
14. Do IDE users follow the
same C&P patterns as
regular users?
How do IDE users copy
and paste code across
different file formats?
18. IDE users perform consecutive
C&P
Repeat: A (Copy) → B (Paste); C (Copy) → D (Paste)
19. IDE users perform consecutive
C&P
Distribution: A (Copy) → B (Paste), C (Paste)
20. IDE users perform consecutive
C&P
Relay: A (Copy) → B (Paste, Copy) → C (Paste)
21. IDE users often perform relay
on C&P
[Diagrams: Repeat (A, B, C, D), Distribution (A, B, C), Relay (A, B, C)]
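One way to make the three consecutive C&P patterns concrete is a small classifier over a stream of copy and paste events. The definitions used here are a reading of the slide diagrams, not a confirmed specification: a new copy from a fresh location after a completed C&P is treated as repeat, a second paste of the same copied content as distribution, and a copy taken from the previous paste location as relay. The `(action, location)` event format is hypothetical.

```python
def classify_consecutive_cp(events):
    """Label consecutive C&P operations as repeat, distribution, or relay.

    `events` is a list of (action, location) tuples, where action is
    "copy" or "paste". The location field is an assumed abstraction,
    for example the editor in which the command was executed.
    """
    patterns = []
    current_copy = None      # location of the most recent copy
    previous_paste = None    # location of the most recent paste
    pastes_since_copy = 0    # pastes made from the current copy
    pending = None           # pattern to emit once the new copy is pasted

    for action, location in events:
        if action == "copy":
            if previous_paste is not None:
                # Copying from where we just pasted continues a chain.
                pending = "relay" if location == previous_paste else "repeat"
            current_copy = location
            pastes_since_copy = 0
        elif action == "paste":
            if current_copy is None:
                continue  # paste without an observed copy; ignore
            if pastes_since_copy == 0:
                if pending is not None:
                    patterns.append(pending)
                    pending = None
            else:
                # Same copied content pasted again.
                patterns.append("distribution")
            pastes_since_copy += 1
            previous_paste = location
    return patterns


if __name__ == "__main__":
    relay = [("copy", "A"), ("paste", "B"), ("copy", "B"), ("paste", "C")]
    print(classify_consecutive_cp(relay))
```

The first C&P in a stream produces no label because a pattern only exists between two consecutive operations.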
22. C&P behavior of IDE users is
different from regular users
IDE Users
Higher Within
Higher Relay
Lower Distribution
Regular Users
Higher Between
Lower Relay
Higher Distribution
Eclipse IDE requires tailored C&P support tools that differ
from regular users’ C&P tools
23. Do IDE users follow the
same C&P patterns as
regular users?
How do IDE users copy
and paste code across
different file formats?
There are major differences
between C&P behavior of Eclipse
IDE users and C&P behavior of
regular users.
There is a large number of C&P
operations between editors; hence, clone
detection techniques should
consider detecting clones across
different languages.