This document describes a controlled, multiple case study of software evolution and defects from industrial projects. It details the data sources used, including source code repositories, issue tracking databases, and interviews. Metrics such as code smells, size, effort, and defects were collected. Programming skills of developers were also measured. Code smell detection tools and custom scripts to analyze code changes were used to extract metrics on a variety of code issues and evolution over time. The data is available online for further analysis.
Msr17a.ppt
1. Software Evolution and Defects from a Controlled, Multiple, Industrial Case Study
Aiko Yamashita, S. Amirhossein Abtahizadeh, Foutse Khomh, Yann-Gaël Guéhéneuc
Centrum Wiskunde & Informatica
Oslo and Akershus University College of Applied Sciences
Polytechnique Montréal
Data Showcase - MSR 2017 - Buenos Aires, Argentina
6. Task and learning effect
Study 1
• Simula Experiment
• Software Replicability
• 4 Norwegian firms
Java applications with nearly the same functionality: A, B, C, D
9. Task and learning effect
Study 2
• Simula multiple case study
• Software Maintainability
• 2 European firms
Control over task:
• Task 1. Replacing external data source
• Task 2. New authentication mechanism
• Task 3. New Reporting functionality
Control over learning effect: assignment of developers to systems (A, B, C, D)
16. Programming skills
• Measurement instrument based on a combination of speed and correctness.
• The Rasch measurement model was used.
• Sixty-five professional developers from eight countries participated in validating the instrument.
• They solved 19 Java programming tasks over two days.
• Six of the participants who scored better than average skill were selected.
“Construction and Validation of an Instrument for Measuring Programming Skill” (Bergersen et al., 2014)
Control over programming skills
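As a rough illustration of the Rasch measurement model mentioned above, the sketch below computes the probability that a person solves a task, given a person ability and a task difficulty on the same logit scale. This is the basic dichotomous form, shown only as a simplification: an instrument that combines speed and correctness needs a richer scoring scheme than a single pass/fail outcome, and the function and parameter names here are hypothetical.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model.

    Both arguments are in logits. Simplified illustration only; it is not
    the scoring procedure of the instrument described in the slides.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


if __name__ == "__main__":
    # A person whose ability equals the task difficulty has a 50% chance.
    print(rasch_probability(0.0, 0.0))
    # Higher ability relative to difficulty raises the probability.
    print(rasch_probability(1.5, 0.0))
```

The model places people and tasks on one scale, so a skill score can be compared across developers who solved different subsets of tasks.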
17. Variables and Data Sources
[Figure from [1]: diagram relating moderator variables and variables of interest to their data sources]
Variables shown: System; Project context; Tasks; Programming Skill; Development Technology; Code smells (num. smells**, smell density**); Maintenance outcomes (Change Size**, Effort**, Defects*, Maintainability perception*, Maintenance problems**)
Data sources shown: Source code; Subversion database; Daily interviews (audio files/notes); Open interviews (audio files/notes); Think aloud (video files/notes); Task progress sheets; Eclipse activity logs; Trac (issue tracker); Acceptance test reports; Study diary
Other label: Task Dates+
** System and file level
* Only at system level
[1] Yamashita, 2012: “Assessing the capability of code smells to support software maintainability assessments: Empirical inquiry and methodological approach”, PhD Thesis
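The slide lists "num. smells" and "smell density" at system and file level without defining them. A minimal sketch follows, assuming that density means smells per thousand lines of code and that detections arrive as (file path, smell name) pairs; both the definition and the record layout are assumptions for illustration, not the layout of the published data.

```python
from collections import Counter
from typing import Iterable, Tuple


def smells_per_file(records: Iterable[Tuple[str, str]]) -> Counter:
    """Count detected smells per file from (file_path, smell_name) pairs."""
    return Counter(path for path, _smell in records)


def smell_density(num_smells: int, loc: int) -> float:
    """Smells per 1000 lines of code (assumed definition of density)."""
    if loc <= 0:
        raise ValueError("loc must be positive")
    return num_smells * 1000 / loc


if __name__ == "__main__":
    # Invented sample records, not taken from the dataset.
    sample = [
        ("src/Foo.java", "Feature Envy"),
        ("src/Foo.java", "God (Large) Class"),
        ("src/Bar.java", "Data Class"),
    ]
    counts = smells_per_file(sample)
    print(counts["src/Foo.java"], smell_density(counts["src/Foo.java"], 800))
```

Normalising by size keeps large files from dominating a comparison purely because they contain more code.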
21. Source Code**
• Java, JavaScript, SQL, HTML, XML.
• Developed by 4 Norwegian companies based on the same specification
• Result from the experiment reported by Anda et al. (2008): “Variability and Reproducibility in Software Engineering: A Study of Four Companies that Developed the Same System”
Java applications with nearly the same functionality: A, B, C, D
**Available at: opendata.soccerlab.polymtl.ca/git/users/root/projects
28. Code smells and evolution data**
Code Smells:
• Tools for Code Smells: Borland Together and InCode
• Code Smells: Detected Data Class, Data Clumps, Duplicated code in conditional
branches, Feature Envy, God (Large) Class, God (Long) Method, Misplaced Class,
Refused Bequest, Shotgun Surgery, Temporary variable used for several purposes,
Use of implementation instead of interface, and Interface Segregation Principle (ISP)
Violation
• Files: InitialSmells.xls (1 version), FinalSmells.xls (12 versions)
Code Evolution:
• Tool for changes: Custom written code with SVNKit
**Available at https://zenodo.org/record/293719
29. Code smells and evolution data**
Code Smells:
• Tools for Code Smells: Borland Together and InCode
• Code Smells: Detected Data Class, Data Clumps, Duplicated code in conditional
branches, Feature Envy, God (Large) Class, God (Long) Method, Misplaced Class,
Refused Bequest, Shotgun Surgery, Temporary variable used for several purposes,
Use of implementation instead of interface, and Interface Segregation Principle (ISP)
Violation
• Files: InitialSmells.xls (1 version), FinalSmells.xls (12 versions)
Code Evolution:
• Tool for changes: Custom written code with SVNKit
• Variables: Programmer, Revision No., Date, Full path, Filename, File extension, System,
Action Type (i.e. Added, Deleted, Modified, Renamed), No. lines added, No. lines
deleted, No. lines changed, and Churn
**Available at https://zenodo.org/record/293719
30. Code smells and evolution data**
Code Smells:
• Tools for Code Smells: Borland Together and InCode
• Code smells detected: Data Class, Data Clumps, Duplicated code in conditional
branches, Feature Envy, God (Large) Class, God (Long) Method, Misplaced Class,
Refused Bequest, Shotgun Surgery, Temporary variable used for several purposes,
Use of implementation instead of interface, and Interface Segregation Principle (ISP)
Violation
• Files: InitialSmells.xls (1 version), FinalSmells.xls (12 versions)
Code Evolution:
• Tool for changes: custom-written code using SVNKit
• Variables: Programmer, Revision No., Date, Full path, Filename, File extension, System,
Action Type (i.e. Added, Deleted, Modified, Renamed), No. lines added, No. lines
deleted, No. lines changed, and Churn
• File: Changes.xls (includes evolution of all 12 versions)
**Available at https://zenodo.org/record/293719
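The change variables listed above lend themselves to simple aggregation. The following Python sketch sums the recorded churn per system and file. It assumes the rows of Changes.xls have already been loaded as dictionaries; the key names (`system`, `full_path`, `churn`) and the sample rows are hypothetical, and the real column names may differ.

```python
from collections import defaultdict


def churn_per_file(rows):
    """Sum the recorded churn for each (system, full path) pair."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row["system"], row["full_path"])] += row["churn"]
    return dict(totals)


# Hypothetical rows for illustration only; not taken from the dataset.
sample_rows = [
    {"system": "A", "full_path": "src/Foo.java", "churn": 12},
    {"system": "A", "full_path": "src/Foo.java", "churn": 5},
    {"system": "B", "full_path": "src/Bar.java", "churn": 7},
]

print(churn_per_file(sample_rows))
```

Grouping by a different variable (for example Programmer or Revision No.) only requires changing the key tuple.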
35. Software Evolution History**
• 3 projects per system (4 systems), i.e., 6 developers x 2 systems each =
12 projects (cases or evolution histories)
• Technologies involved: MySQL, Apache Tomcat, SVN,
Trac, MyEclipse
• Each project took 3-4 weeks, full-time.
• The SVN repositories were converted to Git and are hosted at Polytechnic of
Montreal.
[Figure: assignment of developers to Systems A, B, C, and D]
**Available at: opendata.soccerlab.polymtl.ca/git/users/root/projects
43. Defect Data**
• Due to the heterogeneity of the systems, no common unit test suite is
available :(
• 2 rounds of acceptance testing for each of the 12 projects
• Defects were recorded in Trac after each round of acceptance testing
• Trac was too tightly integrated with SVN, so it was not possible to
install it on a server++
• 12 reports extracted from Trac:
Defects_Dev{1/2/3/4/5/6}_Sys{A/B/C/D}.xlsx
++The original SVN repo and Trac instances are
available upon request
**Available at https://zenodo.org/record/293719
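When loading the 12 defect reports, the developer and system can be recovered from each file name. The sketch below assumes the brace notation in the pattern above means one developer number (1 to 6) and one system letter (A to D) per file; the function name is hypothetical.

```python
import re

# Matches names such as Defects_Dev3_SysB.xlsx (assumed naming scheme).
DEFECT_FILE = re.compile(r"^Defects_Dev([1-6])_Sys([A-D])\.xlsx$")


def parse_defect_filename(name):
    """Return (developer number, system letter), or None if the name does not match."""
    match = DEFECT_FILE.match(name)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


print(parse_defect_filename("Defects_Dev3_SysB.xlsx"))
```

Returning None for non-matching names makes it easy to skip unrelated files when scanning a download directory.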
52. Task Dates**
A problem in longitudinal, brown-field studies: the boundaries between tasks become “blurry”
Examples:
A developer finishes Task 3 in System 1 in the morning, and moves on to
Task 1 for System 2 in the afternoon.
A developer was working on Task 2, but then realized they had forgotten to change
something in Task 1, so they switched temporarily between tasks.
We used different sources to estimate the dates on which a developer was
working on a given System and a given Task.
[Figure: data sources used to estimate task dates: daily interviews (audio files/notes), think-aloud sessions (video files/notes), task progress sheets, Eclipse activity logs, the Subversion database, Trac (issue tracker), acceptance test reports, open interviews (audio files/notes), and the study diary. Other labels shown: Project context, Development, Technology, Change, Size**, Effort**, Maintenance outcomes, Defects*, Maintainability perception*, Maintenance problems**, Task Dates.]
**Available at https://zenodo.org/record/293719
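Because task boundaries are estimates, a given day can fall into more than one task period. A minimal sketch of how a user of the data might detect such ambiguous days follows; the period table, its dates, and the function name are hypothetical and not taken from the dataset.

```python
from datetime import date


def tasks_on(day, task_periods):
    """Return the names of all tasks whose estimated period includes `day`."""
    return [
        name
        for name, (start, end) in task_periods.items()
        if start <= day <= end
    ]


# Hypothetical estimated periods; Task 1 and Task 2 overlap on one day.
periods = {
    "Task 1": (date(2000, 1, 1), date(2000, 1, 3)),
    "Task 2": (date(2000, 1, 3), date(2000, 1, 7)),
}

print(tasks_on(date(2000, 1, 3), periods))
```

A result with more than one task marks a day on which changes cannot be attributed to a single task without consulting the other sources.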
59. Potential usage scenarios
a) Analysis of “repeated defects” in a
multiple case study
b) Studies on the impact of different
metrics/attributes on software evolution
c) Further studies on inter-smell relations
d) Cost-benefit analysis of code smell
removal
e) Benchmarking of diverse tools/
methodologies
f) Task/context extraction, along the lines of
the ideas in [2]
[2] M. Barnett, et al., “Helping Developers Help Themselves: Automatic Decomposition
of Code Review Change-sets,” (ICSE ’15)
69. What to consider when using the data
A. Context of the study
B. Tasks were individual
C. Time frame is approx. 1-2 sprints
D. The age of the systems (10+ years)
E. Tool for code smells not available
F. No explicit corrective tasks
G. Date accuracy for the tasks
H. Not all the commit logs were associated with an issue ID
I. Consider the trade-off between the degree of realism and the
degree of control in this type of study
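Point H can be checked before linking commits to defects. The sketch below separates commit messages that reference an issue from those that do not, assuming references use Trac's link syntax (`#123` or `ticket:123`); the messages in this dataset may follow a different convention, and the function names are hypothetical.

```python
import re

# Trac-style ticket links: "#12" or "ticket:12" (assumed convention).
TICKET_REF = re.compile(r"(?:#|ticket:)(\d+)")


def ticket_ids(message):
    """Return the issue IDs referenced in a commit message, in order of appearance."""
    return [int(number) for number in TICKET_REF.findall(message)]


def split_by_linkage(messages):
    """Split messages into those with and without an issue reference."""
    linked = [m for m in messages if ticket_ids(m)]
    unlinked = [m for m in messages if not ticket_ids(m)]
    return linked, unlinked


print(ticket_ids("Fix login form, see #12 and ticket:7"))
```

The share of unlinked messages gives a rough idea of how much of the change history can be tied to recorded issues.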
76. Trade-off between realism and control
[Figure: study types placed on two axes, sample size (Big Data) and data richness (Thick Data), each ranging from low to high: Case studies, Repository Analysis (OSS), Controlled/Lab Experiments, Ethnography, “Our study?”, and “Mega-cross-project experiments?”]
80. Experimental Replication Applied to Case Study [1]
Literal Replication (Case 1 ≈ Case 2): Same Systems
• Context in both cases: same tasks, developers with similar skills, same project setting, same technology
• Code smells: System A in both cases (≈)
• Maintenance outcomes: ≈
Theoretical Replication (Case 1 vs. Case 3): Different Systems
• Context in both cases: same tasks, developers with similar skills, same project setting, same technology
• Code smells: System A vs. System B (≠)
• Maintenance outcomes: ≠
[1] Yamashita, 2012: “Assessing the capability of code smells to support software maintainability
assessments: Empirical inquiry and methodological approach” PhD Thesis