Software Evolution and Defects from a Controlled,
Multiple, Industrial Case Study
Aiko Yamashita, S. Amirhossein Abtahizadeh, Foutse Khomh, Yann-Gaël Guéhéneuc
Centrum Wiskunde & Informatica · Oslo and Akershus University College of Applied Sciences · Polytechnique Montréal
Data Showcase - MSR 2017 - Buenos Aires, Argentina
Moderator Factors in Software Engineering
[Figure: a researcher observing a software project. The diagram maps variables of interest (code smells: number of smells**, smell density**; maintainability perception*; maintenance outcomes: defects*, change size**, effort**, maintenance problems**), moderator variables (system, project context, tasks, programming skill, development technology, learning effect), and data sources (source code, Subversion database, daily and open interviews with audio files/notes, Eclipse activity logs, Trac issue tracker, acceptance test reports, think-aloud video files/notes, study diary).]
Task and learning effect

Study 1
• Simula experiment
• Software replicability
• 4 Norwegian firms
• Java applications with nearly the same functionality (Systems A, B, C, D)

Study 2
• Simula multiple case study
• Software maintainability
• 2 European firms
• Control over tasks: within each system, Task 1 (replacing an external data source), Task 2 (new authentication mechanism), Task 3 (new reporting functionality)
• Control over learning effect: developers assigned across the four systems (A, B, C, D)
Control over programming skills

Programming skills
• Measurement instrument based on a combination of speed and correctness (see the sketch below)
• The Rasch measurement model was used
• Sixty-five professional developers from eight countries participated in validating the instrument
• They solved 19 Java programming tasks over two days
• Six of the participants, who scored above average skill, were selected

“Construction and Validation of an Instrument for Measuring Programming Skill” (Bergersen et al., 2014)
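To make the skill measure concrete, here is a minimal sketch of the dichotomous Rasch model on which such instruments rest: a developer with ability θ solves an item of difficulty b with probability exp(θ − b) / (1 + exp(θ − b)). This illustrates the general model only; it is not Bergersen et al.'s scoring code, and the ability and difficulty values below are invented.

```java
// Minimal sketch of the dichotomous Rasch model used for skill measurement.
// P(correct) = exp(theta - b) / (1 + exp(theta - b)),
// where theta is the person's ability and b is the item's difficulty.
// Illustrative only: the actual instrument also folds solution time into item scores.
public class RaschSketch {

    // Probability that a person with ability theta solves an item of difficulty b.
    static double pCorrect(double theta, double b) {
        double z = Math.exp(theta - b);
        return z / (1.0 + z);
    }

    public static void main(String[] args) {
        double[] itemDifficulties = {-1.0, 0.0, 1.5}; // hypothetical task difficulties
        double theta = 0.8;                           // hypothetical developer ability
        for (double b : itemDifficulties) {
            System.out.printf("difficulty %+4.1f -> P(correct) = %.2f%n", b, pCorrect(theta, b));
        }
    }
}
```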
Variables and Data Sources
[Figure from [1]: variables of interest (code smells: number of smells**, smell density**; maintainability perception*; maintenance outcomes: defects*, change size**, effort**, maintenance problems**), moderator variables (system, project context, tasks, programming skill, development technology), and data sources (source code, Subversion database, daily and open interviews with audio files/notes, task progress sheets, Eclipse activity logs, Trac issue tracker, acceptance test reports, think-aloud video files/notes, study diary, task dates).]
** System and file level
* Only at system level
[1] Yamashita, 2012: “Assessing the capability of code smells to support software maintainability assessments: Empirical inquiry and methodological approach”, PhD thesis
Source Code**
• Java, JavaScript, SQL, HTML, XML
• Java applications with nearly the same functionality (Systems A, B, C, D)
• Developed by 4 Norwegian companies from the same specification
• Resulting from the experiment reported by Anda et al. (2008): “Variability and Reproducibility in Software Engineering: A Study of Four Companies that Developed the Same System”
**Available at: opendata.soccerlab.polymtl.ca/git/users/root/projects
Code smells and evolution data**

Code Smells:
• Tools for code smells: Borland Together and InCode
• Smells detected: Data Class, Data Clumps, Duplicated Code in Conditional Branches, Feature Envy, God (Large) Class, God (Long) Method, Misplaced Class, Refused Bequest, Shotgun Surgery, Temporary Variable Used for Several Purposes, Use of Implementation Instead of Interface, and Interface Segregation Principle (ISP) Violation
• Files: InitialSmells.xls (1 version), FinalSmells.xls (12 versions)

Code Evolution:
• Tool for changes: custom code written with SVNKit (see the sketch below)
• Variables: Programmer, Revision No., Date, Full Path, Filename, File Extension, System, Action Type (i.e., Added, Deleted, Modified, Renamed), No. Lines Added, No. Lines Deleted, No. Lines Changed, and Churn
• File: Changes.xls (covers the evolution of all 12 versions)

**Available at https://zenodo.org/record/293719
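The original change-extraction tool is not included in the dataset. Below is a minimal sketch, assuming SVNKit's standard log API and a placeholder repository URL, of how the per-revision change records behind Changes.xls might be re-derived; the line counts and churn would additionally require diffing consecutive revisions.

```java
import org.tmatesoft.svn.core.SVNLogEntry;
import org.tmatesoft.svn.core.SVNLogEntryPath;
import org.tmatesoft.svn.core.SVNURL;
import org.tmatesoft.svn.core.internal.io.dav.DAVRepositoryFactory;
import org.tmatesoft.svn.core.io.SVNRepository;
import org.tmatesoft.svn.core.io.SVNRepositoryFactory;

import java.util.Collection;
import java.util.LinkedList;
import java.util.Map;

// Sketch: walk an SVN history and print, per revision, the author, date,
// and the action type (A/D/M/R) of every changed path -- the raw material
// behind the Changes.xls variables. Line counts and churn would need diffs.
public class ChangeExtractorSketch {
    public static void main(String[] args) throws Exception {
        DAVRepositoryFactory.setup(); // enable http(s):// access
        SVNURL url = SVNURL.parseURIEncoded("https://example.org/svn/systemA"); // placeholder URL
        SVNRepository repo = SVNRepositoryFactory.create(url);

        long head = repo.getLatestRevision();
        Collection<SVNLogEntry> entries = new LinkedList<>();
        // targetPaths = {""} means the whole repository; the last two flags
        // request changed paths and strict node history.
        repo.log(new String[] {""}, entries, 1, head, true, true);

        for (SVNLogEntry entry : entries) {
            Map<String, SVNLogEntryPath> changed = entry.getChangedPaths();
            for (SVNLogEntryPath path : changed.values()) {
                System.out.printf("r%d %s %s %c %s%n",
                        entry.getRevision(), entry.getAuthor(),
                        entry.getDate(), path.getType(), path.getPath());
            }
        }
    }
}
```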
Software Evolution History**
• 3 projects per system, i.e., 6 developers × 2 systems = 12 projects (cases, or evolution histories)
• Technologies involved: MySQL, Apache Tomcat, SVN, Trac, MyEclipse
• Each project took 3–4 weeks, full-time
• The SVN repositories were converted to Git and are hosted at Polytechnique Montréal (a sketch of walking a converted history follows)
[Figure: developers assigned across Systems A–D]
**Available at: opendata.soccerlab.polymtl.ca/git/users/root/projects
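Since the histories are now served as Git repositories, here is a minimal sketch, using JGit and a placeholder clone path (the dataset does not prescribe a client), of how one might tally commits per developer in one of the 12 evolution histories.

```java
import org.eclipse.jgit.api.Git;
import org.eclipse.jgit.revwalk.RevCommit;

import java.io.File;
import java.util.HashMap;
import java.util.Map;

// Sketch: count commits per author in a locally cloned evolution history,
// e.g. to cross-check the per-developer activity recorded in Changes.xls.
public class HistorySketch {
    public static void main(String[] args) throws Exception {
        // Placeholder path to a clone of one of the 12 project repositories.
        try (Git git = Git.open(new File("/path/to/clone"))) {
            Map<String, Integer> commitsPerAuthor = new HashMap<>();
            for (RevCommit commit : git.log().call()) {
                String author = commit.getAuthorIdent().getName();
                commitsPerAuthor.merge(author, 1, Integer::sum);
            }
            commitsPerAuthor.forEach((author, n) ->
                    System.out.printf("%s: %d commits%n", author, n));
        }
    }
}
```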
Defect Data**
• Due to the heterogeneity of the systems, no common unit-testing suite is available :(
• 2 rounds of acceptance testing for each of the 12 projects
• Defects were recorded in Trac after each acceptance-testing round
• Trac was too tightly integrated with SVN to be installed on a public server++
• 12 reports extracted from Trac: Defects_Dev{1/2/3/4/5/6}_Sys{A/B/C/D}.xlsx (a loading sketch follows)

++Original SVN repositories and Trac instances are available upon request
**Available at https://zenodo.org/record/293719
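As an illustration of loading these reports, here is a minimal sketch assuming Apache POI and one concrete instance of the naming scheme (Defects_Dev1_SysA.xlsx). The internal column layout of the reports is not assumed; the sketch only counts data rows.

```java
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

import java.io.FileInputStream;

// Sketch: open one Trac defect report and count its data rows
// (assuming one header row; the exact column layout is not assumed here).
public class DefectReportSketch {
    public static void main(String[] args) throws Exception {
        String file = "Defects_Dev1_SysA.xlsx"; // one instance of Defects_Dev{1..6}_Sys{A..D}.xlsx
        try (FileInputStream in = new FileInputStream(file);
             XSSFWorkbook workbook = new XSSFWorkbook(in)) {
            XSSFSheet sheet = workbook.getSheetAt(0);
            int dataRows = sheet.getLastRowNum(); // zero-based last index == rows minus the header
            System.out.printf("%s: %d recorded defects%n", file, dataRows);
        }
    }
}
```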
Task Dates**
A problem in longitudinal, brown-field studies: the boundaries between tasks become “blurry”.
Examples:
• A developer finishes Task 3 in System 1 in the morning and moves on to Task 1 for System 2 in the afternoon.
• A developer was working on Task 2, but had forgotten to change something in Task 1, and so switched temporarily between tasks.
We used different sources to estimate the dates on which a developer was working on a given system and a given task (a sketch of one such estimate follows).
[Figure: the variables-and-data-sources diagram, highlighting the sources used to estimate task dates: daily interviews (audio files/notes), the Subversion database, think-aloud video files/notes, task progress sheets, Eclipse activity logs, Trac (issue tracker), acceptance test reports, open interviews (audio files/notes), and the study diary.]
**Available at https://zenodo.org/record/293719
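Commit timestamps alone already give a rough bracket per task. The following is a minimal sketch, using a hypothetical commit record type with a task label (not the dataset's actual schema, and with invented example dates), that computes the earliest and latest commit date per task; in practice the task label itself must be recovered from issue IDs, progress sheets, and interviews, since not every commit log carries an issue ID.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: bracket each task by the earliest and latest commit touching it.
// CommitRecord is a hypothetical type, not the dataset's actual schema.
public class TaskDateSketch {

    record CommitRecord(String task, LocalDate date) {}

    record Interval(LocalDate start, LocalDate end) {}

    // Merge each commit's date into the running [start, end] interval of its task.
    static Map<String, Interval> bracket(List<CommitRecord> commits) {
        Map<String, Interval> byTask = new HashMap<>();
        for (CommitRecord c : commits) {
            byTask.merge(c.task(), new Interval(c.date(), c.date()),
                    (a, b) -> new Interval(
                            a.start().isBefore(b.start()) ? a.start() : b.start(),
                            a.end().isAfter(b.end()) ? a.end() : b.end()));
        }
        return byTask;
    }

    public static void main(String[] args) {
        List<CommitRecord> commits = List.of( // invented example data
                new CommitRecord("Task 1", LocalDate.of(2008, 9, 1)),
                new CommitRecord("Task 1", LocalDate.of(2008, 9, 5)),
                new CommitRecord("Task 2", LocalDate.of(2008, 9, 4)));
        bracket(commits).forEach((task, iv) ->
                System.out.printf("%s: %s .. %s%n", task, iv.start(), iv.end()));
    }
}
```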
Potential usage scenarios
a) Analysis of “repeated defects” in a multiple case study
b) Studies on the impact of different metrics/attributes on software evolution
c) Further studies on inter-smell relations
d) Cost-benefit analysis of code smell removal
e) Benchmarking of diverse tools/methodologies
f) Task/context extraction, along the lines of the ideas in [2]

[2] M. Barnett et al., “Helping Developers Help Themselves: Automatic Decomposition of Code Review Change-sets” (ICSE ’15)

What to consider when using the data…
A. The context of the study
B. Tasks were individual
C. The time frame is approx. 1–2 sprints
D. The age of the systems (10+ years)
E. The tools used for code smell detection are not available
F. No explicit corrective tasks
G. Date accuracy for the tasks
H. Not all commit logs were associated with an issue ID
I. The trade-off between the degree of realism and the degree of control in this type of study
Trade-off between realism and control
[Chart: study types plotted by data richness (“thick data”, low to high) against sample size (“big data”, low to high): controlled/lab experiments, case studies, ethnography, repository analysis (OSS), our study?, and mega-cross-project experiments?]
Experimental Replication Applied to Case Study [1]
[Figure: two replication designs.]
Literal replication (e.g., Case 1 vs. Case 2): same system (System A), same tasks, developers with similar skills, same project setting, same technology; similar code smells (≈) are expected to lead to similar maintenance outcomes (≈).
Theoretical replication (e.g., Case 1 vs. Case 3): different systems (System A vs. System B), but same tasks, developers with similar skills, same project setting, and same technology; different code smells (≠) are expected to lead to different maintenance outcomes (≠).
[1] Yamashita, 2012: “Assessing the capability of code smells to support software maintainability assessments: Empirical inquiry and methodological approach”, PhD thesis
