The document discusses how task granularity at different levels (e.g., commits, pull requests, work items) can affect analyses of co-evolution in software projects. It finds that commit-level analysis can overlook relationships within tasks that span multiple commits. Work-item-level analysis is recommended for a more complete view of co-evolution: a median of 29% of work items consist of multiple commits, and analyzing at the commit level would miss 24% of co-changed files and fail to group 83% of related commits.
The Impact of Task Granularity on Co-evolution Analyses
1. The Impact of Task Granularity on Co-evolution Analyses
Yasutaka Kamei, Keisuke Miura, Shane McIntosh, Naoyasu Ubayashi, Ahmed E. Hassan
2. Software evolution aims to recover knowledge about development
[Diagram: Repositories, Knowledge; logos: Gerrit, Git, GitHub, RHSA, Mylyn]
3. Co-evolution of production & test code
Growth history view of ArgoUML [1]
[1] A. Zaidman, B. Van Rompaey, S. Demeyer, and A. Van Deursen. Mining software repositories to study co-evolution of production & test code. In Proc. Int’l Conf. on Software Testing, Verification, and Validation (ICST’08), pages 220–229, 2008.
11. Some issues may require several commits
Fix #1000: A, Test A
Fix #1100: B
Fix #1100 (1 day later): Test B
Commit-level analysis would miss the co-change relationship between them
12. Work items can be used to study software evolution
Jira: #1000, #1100
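The idea of lifting commits to work items can be sketched in a few lines of Python. This is only an illustration, not the authors' tooling: the commit records, the file names, and the `#<number>` ID pattern are assumptions chosen to mirror the "Fix #1000" / "Fix #1100" example on the slides.

```python
import re
from collections import defaultdict

# Hypothetical commit records: (commit id, message, files changed).
# File names are illustrative stand-ins for A, Test A, B, and Test B.
commits = [
    ("c1", "Fix #1000", {"A.java", "ATest.java"}),
    ("c2", "Fix #1100", {"B.java"}),
    ("c3", "Fix #1100", {"BTest.java"}),
]

# Assumed convention: a work item is referenced as "#<digits>" in the message.
ISSUE_ID = re.compile(r"#(\d+)")


def group_by_work_item(commits):
    """Map each issue ID found in a commit message to the union of files changed."""
    items = defaultdict(set)
    for _commit_id, message, files in commits:
        for issue in ISSUE_ID.findall(message):
            items[issue] |= files
    return dict(items)


work_items = group_by_work_item(commits)
```

At commit level, B.java and BTest.java never appear together; at work item level they are both attributed to #1100, so the co-change becomes visible.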
23. Two important criteria that needed to be satisfied to qualify for our analysis: system size and ITS usage
[Plot: x-axis # of commits (0 to 20,000); y-axis 0 to 100; ITS shown: Jira]
40. QPID-4575: adds support for Visual Studio 2012
Git: 1st commit (.cpp, .h), 5th commit (.cpproj)
This co-change activity of production code and build system would be missed
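One way to see what commit-level analysis overlooks is to compare the co-changed file pairs visible per commit with those visible per work item. The sketch below is only an illustration under assumptions: the file names are hypothetical, and the pair-based measure is not necessarily the one used in the study.

```python
from itertools import combinations

def co_changed_pairs(change_sets):
    """Return the unordered file pairs that appear together in any change set."""
    pairs = set()
    for files in change_sets:
        pairs.update(frozenset(p) for p in combinations(sorted(set(files)), 2))
    return pairs

# Hypothetical work item with two commits, loosely modelled on the slide.
commits_of_item = [
    ["broker.cpp", "broker.h"],  # earlier commit: production code
    ["broker.cpproj"],           # later commit: build system
]

# Commit level: each commit is its own change set.
commit_level = co_changed_pairs(commits_of_item)
# Work item level: all files of the item form one change set.
item_level = co_changed_pairs([[f for files in commits_of_item for f in files]])
# Pairs that only become visible at the work item level.
missed = item_level - commit_level
```

Here the pairs linking the production files to the build file only show up once the commits are merged into their work item.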
41. The impact of the work item granularity
File Spread, Time Spread, Developer Spread
File spread: 24% of the co-changed files are overlooked
42. Time spread: How much time elapses between the commits of work items?
43. Sliding time window technique
A common setting in software evolution studies
Same commit message
Same developer
Similar time (300 secs)
44. Git: two commits, A and Test A, both with the message "Fix #1000"
45. The two commits are < 300 secs apart
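The three criteria can be sketched as a grouping routine. This is a minimal illustration, not the implementation used in prior studies: the Commit record, the function name, and the strict "less than" comparison are assumptions, and only the 300-second default comes from the slide.

```python
from dataclasses import dataclass

@dataclass
class Commit:
    sha: str
    author: str
    message: str
    timestamp: int  # seconds since the epoch

def sliding_window_groups(commits, window=300):
    """Group commits with the same author and message whose gaps are under `window` seconds."""
    groups = []
    open_groups = {}  # (author, message) -> most recent group for that key
    for commit in sorted(commits, key=lambda c: c.timestamp):
        key = (commit.author, commit.message)
        group = open_groups.get(key)
        # The window slides: compare with the latest commit already in the group.
        if group is not None and commit.timestamp - group[-1].timestamp < window:
            group.append(commit)
        else:
            group = [commit]
            groups.append(group)
            open_groups[key] = group
    return groups

# Hypothetical commits: c3 is too late and c4 has a different developer.
history = [
    Commit("c1", "alice", "Fix #1000", 0),
    Commit("c2", "alice", "Fix #1000", 200),
    Commit("c3", "alice", "Fix #1000", 1000),
    Commit("c4", "bob", "Fix #1000", 250),
]
grouped = [[c.sha for c in g] for g in sliding_window_groups(history)]
```

Although all four commits mention the same work item, only c1 and c2 end up together, since the technique never looks at the work item ID itself.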
50. ACCUMULO-1890: recovers from a failure due to limited resources
Commit: "ACCUMULO-1890 Clean up the test to avoid spinning up a MAC"
52. 11 minutes later, commit: "ACCUMULO-1890 Forgot to re-add changes before commit"
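As a rough illustration of time spread, the gaps between consecutive commits of a work item can be compared against the 300-second window. The function name and the timestamps are hypothetical; only the 11-minute gap is taken from the example above.

```python
from datetime import datetime, timedelta

def time_gaps(commit_times):
    """Return the gaps between consecutive commits of one work item."""
    ordered = sorted(commit_times)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]

# Hypothetical timestamps; only the 11-minute gap comes from the slide.
first = datetime(2013, 11, 14, 10, 0, 0)
gaps = time_gaps([first, first + timedelta(minutes=11)])

# True where a gap is too large for a 300-second sliding window.
exceeds_window = [gap > timedelta(seconds=300) for gap in gaps]
```

An 11-minute gap already exceeds the window, so a sliding time window would keep these two commits apart even if their messages matched.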
53. The impact of the work item granularity
Time spread: 83% of related commits cannot be grouped
54. Developer spread: How many developers are involved across revisions of a work item?
61. The impact of the work item granularity
Developer spread: 25% of work items involve multiple developers
62. Synchronous development [2]
A set of commits where one file is modified by multiple developers within a time window
[2] Q. Xuan and V. Filkov. Building it together: Synchronous development in OSS. In Proc. Int’l Conf. on Software Engineering (ICSE’14), pages 222–233, 2014.
64. Collaborative development
A set of commits where different files are modified by multiple developers under the same work item
Example: work item #1000 with changes to A and Test A
65. We investigate collaborative work items that cannot be detected as synchronous ones
66. This type of collaboration is not rare
27%–83% of collaborative work items involve developers modifying different files
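The distinction between the two kinds of collaboration can be sketched as a simple check per work item. This is a simplified illustration under assumptions: it ignores the time window that is part of synchronous development, and the function names, developer names, and file names are hypothetical.

```python
from collections import defaultdict

def developers_per_file(changes):
    """Map each file to the set of developers who modified it."""
    touched = defaultdict(set)
    for developer, files in changes:
        for path in files:
            touched[path].add(developer)
    return touched

def is_collaborative_only(changes):
    """True if several developers worked on the item but never on the same file."""
    developers = {developer for developer, _ in changes}
    shared = any(len(devs) > 1 for devs in developers_per_file(changes).values())
    return len(developers) > 1 and not shared

# Hypothetical work items as (developer, files) entries, one per commit.
item_1000 = [("alice", ["A.java"]), ("bob", ["ATest.java"])]  # different files
item_2000 = [("alice", ["A.java"]), ("bob", ["A.java"])]      # same file

only_collaborative = [is_collaborative_only(item_1000), is_collaborative_only(item_2000)]
```

Work items like the first one are invisible to a file-based notion of synchronous development, because no single file is shared between the developers.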
69. A median of 29% of work items consist of two or more commits
Granularity may have a considerable impact on co-evolution analyses
70. Studied systems
71. The impact of the work item granularity
File spread: 24% of the co-changed files are overlooked
Time spread: 83% of related commits cannot be grouped
Developer spread: 25% of work items involve multiple developers
72. Given the impact of work item grouping, we recommend that future software evolution studies be performed at the work item level.