This document discusses research on partitioning composite code changes to facilitate code review. The researchers investigated the prevalence of composite changes, proposed an approach to partition such changes based on formatting, dependencies and similarities, and evaluated their approach on changes from four Java projects. Their approach was able to correctly partition 69% of composite changes. A preliminary user study also found that partitioning helped developers better review changes compared to reviewing by file. The researchers' goal is to improve the review process for composite changes which are common but more difficult to review.
Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)Sung Kim
Yida's presentation at MSR 2015!
Abstract—Developers expend significant effort on reviewing source code changes, hence the comprehensibility of code changes directly affects development productivity. Our prior study has suggested that composite code changes, which mix multiple development issues together, are typically difficult to review. Unfortunately, our manual inspection of 453 open source code changes reveals a non-trivial occurrence (up to 29%) of such composite changes.
In this paper, we propose a heuristic-based approach to automatically partition composite changes, such that each sub-change in the partition is more cohesive and self-contained. Our quantitative and qualitative evaluation results are promising in demonstrating the potential benefits of our approach for facilitating code review of composite code changes.
Partitioning Composite Code Changes to Facilitate Code Review (MSR2015)Sung Kim
Yida's presentation at MSR 2015!
Abstract—Developers expend significant effort on reviewing source code changes, hence the comprehensibility of code changes directly affects development productivity. Our prior study has suggested that composite code changes, which mix multiple development issues together, are typically difficult to review. Unfortunately, our manual inspection of 453 open source code changes reveals a non-trivial occurrence (up to 29%) of such composite changes.
In this paper, we propose a heuristic-based approach to automatically partition composite changes, such that each sub-change in the partition is more cohesive and self-contained. Our quantitative and qualitative evaluation results are promising in demonstrating the potential benefits of our approach for facilitating code review of composite code changes.
The Impact of Test Ownership and Team Structure on the Reliability and Effect...Kim Herzig
Context: Software testing is a crucial step in most software development processes. Testing software is a key component to manage and assess the risk of shipping quality products to customers. But testing is also an expensive process and changes to the system need to be tested thoroughly which may take time. Thus, the quality of a software product depends on the quality of its underlying testing process and on the effectiveness and reliability of individual test cases.
Goal: In this paper, we investigate the impact of the organizational structure of test owners on the reliability and effectiveness of the corresponding test cases. Prior empirical research on organizational structure has focused only on developer activity. We expand the scope of empirical knowledge by assessing the impact of organizational structure on testing activities.
Method: We performed an empirical study on the Windows build verification test suites (BVT) and relate effectiveness and reliability measures of each test run to the complexity and size of the organizational sub-structure that enclose all owners of test cases executed.
Results: Our results show, that organizational structure impacts both test effectiveness and test execution reliability. We are also able to predict effectiveness and reliability with fairly high precision and recall values.
Conclusion: We suggest to review test suites with respect to their organizational composition. As indicated by the results of this study, this would increase the effectiveness and reliability, development speed and developer satisfaction.
More details:
ESEM 2014 presentation for paper "The Impact of Test Ownership and Team Structure on the Reliability and Effectiveness of Quality Test Runs". For more details please see http://dl.acm.org/citation.cfm?id=2652524.2652535&coll=DL&dl=GUIDE&CFID=569962862&CFTOKEN=20804180.
Review Participation in Modern Code Review: An Empirical Study of the Android...The University of Adelaide
This work empirically investigates the factors influence review participation in the MCR process. Through a case study of the Android, Qt, and OpenStack open source projects, we find that the amount of review participation in the past is a significant indicator of patches that will suffer from poor review participation. Moreover, the description length of a patch and the purpose of introducing new features also share a relationship with the likelihood of receiving poor review participation.
This full article of this work is published in the Empirical Software Engineering journal. Available online at http://dx.doi.org/10.1007/s10664-016-9452-6
Who Should Review My Code? A file-location based code-reviewer recommendation approach for modern code review.
This research study is presented at the 22nd IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER2015)
Find more information and preprint at patanamon.com
This research study is presented at the 7th International Workshop on Cooperative and Human Aspects of Software Engineering (CHASE2014)
Abstract: Effectively performing code review increases the quality of software and reduces occurrence of defects. However, this requires reviewers with experiences and deep understandings of system code. Manual selection of such reviewers can be a costly and time-consuming task. To reduce this cost, we propose a reviewer recommendation algorithm determining file path similarity called FPS algorithm. Using three OSS projects as case studies, FPS algorithm was accurate up to 77.97%, which significantly outperformed the previous
approach.
Find more information and preprint at patanamon.com
These slides show an example of a presentation produced by students following the course LINGI2252 "Software Engineering: Measures and Maintenance", taught by Prof. Kim Mens to Master-level students in computer science and computer science engineering at the Louvain School of Engineering, UCL, Belgium. Throughout this course, students have to analyze the quality of an existing software system (and more specifically, its maintainability), understand the nature of some of the problems encountered when maintaining complex software systems, suggest appropriate solutions to improve reusability and maintainability of a software system, measure its quality and support its evolution and program in Smalltalk, a pure object-oriented programming language.
Exploring neXtProt data and beyond: A SPARQLing solutionneXtProt
neXtProt (www.nextprot.org) is a knowledge base for human proteins. At neXtProt, we use SPARQL, a semantic query language for databases to check, explore, and visualize our data. SPARQL queries are also used to check the quality and consistency of the data loaded at each release.
Revisiting Code Ownership and Its Relationship with Software Quality in the S...The University of Adelaide
This work was presented at The 38th International Conference on Software Engineering (ICSE2016).
Abstract: Code ownership establishes a chain of responsibility for modules in large software systems. Although prior work uncovers a link between code ownership heuristics and software quality, these heuristics rely solely on the authorship of code changes. In addition to authoring code changes, developers also make important contributions to a module by reviewing code changes. Indeed, recent work shows that reviewers are highly active in modern code review processes, often suggesting alternative solutions or providing updates to the code changes. In this paper, we complement traditional code ownership heuristics using code review activity. Through a case study of six releases of the large Qt and OpenStack systems, we find that: (1) 67%-86% of developers did not author any code changes for a module, but still actively contributed by reviewing 21%-39% of the code changes, (2) code ownership heuristics that are aware of reviewing activity share a relationship with software quality, and (3) the proportion of reviewers without expertise shares a strong, increasing relationship with the likelihood of having post-release defects. Our results suggest that reviewing activity captures an important aspect of code ownership, and should be included in approximations of it in future studies.
The Impact of Test Ownership and Team Structure on the Reliability and Effect...Kim Herzig
Context: Software testing is a crucial step in most software development processes. Testing software is a key component to manage and assess the risk of shipping quality products to customers. But testing is also an expensive process and changes to the system need to be tested thoroughly which may take time. Thus, the quality of a software product depends on the quality of its underlying testing process and on the effectiveness and reliability of individual test cases.
Goal: In this paper, we investigate the impact of the organizational structure of test owners on the reliability and effectiveness of the corresponding test cases. Prior empirical research on organizational structure has focused only on developer activity. We expand the scope of empirical knowledge by assessing the impact of organizational structure on testing activities.
Method: We performed an empirical study on the Windows build verification test suites (BVT) and relate effectiveness and reliability measures of each test run to the complexity and size of the organizational sub-structure that enclose all owners of test cases executed.
Results: Our results show, that organizational structure impacts both test effectiveness and test execution reliability. We are also able to predict effectiveness and reliability with fairly high precision and recall values.
Conclusion: We suggest to review test suites with respect to their organizational composition. As indicated by the results of this study, this would increase the effectiveness and reliability, development speed and developer satisfaction.
More details:
ESEM 2014 presentation for paper "The Impact of Test Ownership and Team Structure on the Reliability and Effectiveness of Quality Test Runs". For more details please see http://dl.acm.org/citation.cfm?id=2652524.2652535&coll=DL&dl=GUIDE&CFID=569962862&CFTOKEN=20804180.
Review Participation in Modern Code Review: An Empirical Study of the Android...The University of Adelaide
This work empirically investigates the factors influence review participation in the MCR process. Through a case study of the Android, Qt, and OpenStack open source projects, we find that the amount of review participation in the past is a significant indicator of patches that will suffer from poor review participation. Moreover, the description length of a patch and the purpose of introducing new features also share a relationship with the likelihood of receiving poor review participation.
This full article of this work is published in the Empirical Software Engineering journal. Available online at http://dx.doi.org/10.1007/s10664-016-9452-6
Who Should Review My Code? A file-location based code-reviewer recommendation approach for modern code review.
This research study is presented at the 22nd IEEE International Conference on Software Analysis, Evolution, and Reengineering (SANER2015)
Find more information and preprint at patanamon.com
This research study is presented at the 7th International Workshop on Cooperative and Human Aspects of Software Engineering (CHASE2014)
Abstract: Effectively performing code review increases the quality of software and reduces occurrence of defects. However, this requires reviewers with experiences and deep understandings of system code. Manual selection of such reviewers can be a costly and time-consuming task. To reduce this cost, we propose a reviewer recommendation algorithm determining file path similarity called FPS algorithm. Using three OSS projects as case studies, FPS algorithm was accurate up to 77.97%, which significantly outperformed the previous
approach.
Find more information and preprint at patanamon.com
These slides show an example of a presentation produced by students following the course LINGI2252 "Software Engineering: Measures and Maintenance", taught by Prof. Kim Mens to Master-level students in computer science and computer science engineering at the Louvain School of Engineering, UCL, Belgium. Throughout this course, students have to analyze the quality of an existing software system (and more specifically, its maintainability), understand the nature of some of the problems encountered when maintaining complex software systems, suggest appropriate solutions to improve reusability and maintainability of a software system, measure its quality and support its evolution and program in Smalltalk, a pure object-oriented programming language.
Exploring neXtProt data and beyond: A SPARQLing solutionneXtProt
neXtProt (www.nextprot.org) is a knowledge base for human proteins. At neXtProt, we use SPARQL, a semantic query language for databases to check, explore, and visualize our data. SPARQL queries are also used to check the quality and consistency of the data loaded at each release.
Revisiting Code Ownership and Its Relationship with Software Quality in the S...The University of Adelaide
This work was presented at The 38th International Conference on Software Engineering (ICSE2016).
Abstract: Code ownership establishes a chain of responsibility for modules in large software systems. Although prior work uncovers a link between code ownership heuristics and software quality, these heuristics rely solely on the authorship of code changes. In addition to authoring code changes, developers also make important contributions to a module by reviewing code changes. Indeed, recent work shows that reviewers are highly active in modern code review processes, often suggesting alternative solutions or providing updates to the code changes. In this paper, we complement traditional code ownership heuristics using code review activity. Through a case study of six releases of the large Qt and OpenStack systems, we find that: (1) 67%-86% of developers did not author any code changes for a module, but still actively contributed by reviewing 21%-39% of the code changes, (2) code ownership heuristics that are aware of reviewing activity share a relationship with software quality, and (3) the proportion of reviewers without expertise shares a strong, increasing relationship with the likelihood of having post-release defects. Our results suggest that reviewing activity captures an important aspect of code ownership, and should be included in approximations of it in future studies.
Holding a conversation is quite a useful skill that some people do naturally but the rest of us need to work at. Here are some methods and ideas you can use to initiate and sustain a sparkling conversation! Hope this one helps!
Code review is one of the crucial software activities where developers and stakeholders collaborate with each other in order to assess software changes. Since code review processes act as a final gate for new software changes to be integrated into the software product, an intense collaboration is necessary in order to prevent defects and produce a high quality of software products. Recently, code review analytics has been implemented in projects (for example, StackAnalytics4 of the OpenStack project) to monitor the collaboration activities between developers and stakeholders in the code review processes. Yet, due to the large volume of software data, code review analytics can only report a static summary (e.g., counting), while neither insights nor instant suggestions are provided. Hence, to better gain valuable insights from software data and help software projects make a better decision, we conduct an empirical investigation using statistical approaches. In particular, we use the large-scale data of 196,712 reviews spread across the Android, Qt, and OpenStack open source projects to train a prediction model in order to uncover the relationship between the characteristics of software changes and the likelihood of having poor code review collaborations. We extract 20 patch characteristics which are grouped along five dimensions, i.e., software changes properties, review participation history, past involvement of a code author, past involvement of reviewers, and review environment dimensions. To validate our findings, we use the bootstrap technique which repeats the experiment 1,000 times. Due to the large volume of studied data, and an intensive computation of characteristic extraction and find- ing validation, the use of the High-Performance-Computing (HPC) re- sources is mandatory to expedite the analysis and generate insights in a timely manner. Through our case study, we find that the amount of review participation in the past and the description length of software changes are a significant indicator that new software changes will suffer from poor code review collaborations [2017]. Moreover, we find that the purpose of introducing new features can increase the likelihood that new software changes will receive late collaboration from reviewers. Our findings highlight the need for the policies of software change submission that monitor these characteristics in order to help software projects improve the quality of code reviews processes. Moreover, based on our findings, future work should develop real-time code review analytics implemented on HPC resources in order to instantly provide insights and suggestions to software projects
A Tool for Optimizing Java 8 Stream Software via Automated RefactoringRaffi Khatchadourian
Streaming APIs are pervasive in mainstream Object-Oriented languages and platforms. For example, the Java 8 Stream API allows for functional-like, MapReduce-style operations in processing both finite, e.g., collections, and infinite data structures. However, using this API efficiently involves subtle considerations like determining when it is best for stream operations to run in parallel, when running operations in parallel can actually be less efficient, and when it is safe to run in parallel due to possible lambda expression side-effects. In this paper, we describe the engineering aspects of an open source automated refactoring tool called Optimize Streams that assists developers in writing optimal stream software in a semantics-preserving fashion. Based on a novel ordering and typestate analysis, the tool is implemented as a plug-in to the popular Eclipse IDE, using both the WALA and SAFE frameworks. The tool was evaluated on 11 Java projects consisting of ~642 thousand lines of code, where we found that 36.31% of candidate streams were refactorable, and an average speedup of 1.55 on a performance suite was observed. We also describe experiences gained from integrating three very different static analysis frameworks to provide developers with an easy-to-use interface for optimizing their stream code to its full potential.
TMPA-2015: The Application of Parameterized Hierarchy Templates for Automated...Iosif Itkin
The Application of Parameterized Hierarchy Templates for Automated Program Code Defect-Fixing
Artyom Aleksyuk, Vladimir Itsykson,Peter The Great Saint Petersburg Polytechnic University, Saint Petersburg
12 - 14 November 2015
Tools and Methods of Program Analysis in St. Petersburg
Synthesizing Knowledge from Software Development ArtifactsJeongwhan Choi
The content was created from "The Art and Science of Analyzing Software Data"
O Baysal, Kononenko, O. (Oleksii), Holmes, R. (Reid), and Godfrey, M.W. (Michael W.), “Synthesizing Knowledge from Software Development Artifacts”, 2015.
IncQuery-D: Distributed Incremental Model Queries over the Cloud: Engineerin...Daniel Varro
In model-driven software engineering (MDE), model queries are core technologies of many tool and transformation-specific challenges such as design rule validation, model synchronization, view maintenance, simulation and many more. As software models are rapidly increasing in size and complexity, traditional MDE tools frequently face scalability issues that decrease productivity of engineers and increase development costs. Incremental graph queries offer a graph pattern based language for capturing queries. Furthermore, the result set of a query is cached and incrementally maintained upon model changes to provide instantaneous query response time. In this talk, first a brief overview is given on the EMF-IncQuery framework (which is an official Eclipse subproject). Then we discuss how to incorporate incremental queries over a distributed cloud infrastructure (to scale up from a single-node tool to a cluster of nodes) deployed over popular database back-ends (such as Cassandra. 4store, Neo4J, etc). We present our first benchmarking experiments with IncQuery-D to highlight that distributed incremental model queries can perform significantly better than the native query technologies of the underlying database back-end, especially, for complex queries.
Working in teams is more effective than individual work, But the main obstacle that any corporate faces is the synchronization between each team, One of the functions that is affected by this obstacle is 'Coding', Working on massive and multidisciplinary projects which need the contribution of several teams specially at the coding phase is opposed by the miss coordination when running the mother code.
So corporate developed some tools to overcome this situation using code version control and Tracker System.
The talk is about a possibility of creation non-static code analyzer which will point a dev team to some unobvious bugs in the system based on machine learning technics.
We will become acquainted with Weka — the machine learning tool for exploratory development of the algorithm, also, we will take a closer look on basic algorithms for classification: Decision Trees, SVM and Naive Bayes classifier.
Real life examples will show you how the development life cycle can be improved with AI technics
Graph-Based Source Code Analysis of JavaScript Repositories Dániel Stein
A graph-based approach to analyze JavaScript source codes, using Neo4j as the graph database backend and ShapeSecurity Shift as the parser.
Hungarian version (presented at a Neo4j meetup): http://www.slideshare.net/steindani/forrskdtrak-grfalap-statikus-analzise
Presentation given at Javantura - 23 February 2019, Zagreb / Croatia:
Do your customers keep complaining about bugs in your software application? Does it take you too much time to implement new features? If yes, then you probably have issues with the quality of your application. Watch my presentation to find out what practical steps you can follow to improve the quality of your application!
Similar to Partitioning composite code changes to facilitate code review (20)
This presentation by Morris Kleiner (University of Minnesota), was made during the discussion “Competition and Regulation in Professions and Occupations” held at the Working Party No. 2 on Competition and Regulation on 10 June 2024. More papers and presentations on the topic can be found out at oe.cd/crps.
This presentation was uploaded with the author’s consent.
This presentation, created by Syed Faiz ul Hassan, explores the profound influence of media on public perception and behavior. It delves into the evolution of media from oral traditions to modern digital and social media platforms. Key topics include the role of media in information propagation, socialization, crisis awareness, globalization, and education. The presentation also examines media influence through agenda setting, propaganda, and manipulative techniques used by advertisers and marketers. Furthermore, it highlights the impact of surveillance enabled by media technologies on personal behavior and preferences. Through this comprehensive overview, the presentation aims to shed light on how media shapes collective consciousness and public opinion.
0x01 - Newton's Third Law: Static vs. Dynamic AbusersOWASP Beja
f you offer a service on the web, odds are that someone will abuse it. Be it an API, a SaaS, a PaaS, or even a static website, someone somewhere will try to figure out a way to use it to their own needs. In this talk we'll compare measures that are effective against static attackers and how to battle a dynamic attacker who adapts to your counter-measures.
About the Speaker
===============
Diogo Sousa, Engineering Manager @ Canonical
An opinionated individual with an interest in cryptography and its intersection with secure software development.
Acorn Recovery: Restore IT infra within minutesIP ServerOne
Introducing Acorn Recovery as a Service, a simple, fast, and secure managed disaster recovery (DRaaS) by IP ServerOne. A DR solution that helps restore your IT infra within minutes.
4. Atomic Code Change
Fixed bug #12, #34, #56
Removed duplicate code
Add a feature
Javadoc updated
Composite Code Change
Fixed bug #123
Difficult to review
Likely be rejected
5. Research Questions
• RQ1: Are composite code changes prevalent?
• RQ2: Can we propose an approach to improve the semantic atomicity
of composite code changes?
• RQ3: Can our approach help developers better review composite code
changes?
6. RQ1: Occurrence of composite code changes
• Data source
• 4 open-source Java projects
• Revisions that changed >= 2 lines of code
• Commit logs and source code were manually inspected
6
Time period Revisions Avg. cLOC Avg. files
Ant 2010/04/27 -- 2012/03/05 137 26.1 2.0
Commons Math 2011/11/28 -- 2012/04/12 107 84.7 3.5
Xerces 2008/11/03 -- 2012/03/13 116 63.6 3.0
JFreeChart 2008/07/02 -- 2010/03/30 93 144.9 4.1
Total revisions 453
10. Approach
Partition of the patch
Unix diff
(text differencing)
ChangeDistiller*
(AST differencing)
*http://www.ifi.uzh.ch/seal/research
/tools/changeDistiller.html
Formatting
changes
10
Formatting
Dependency
Similarity
A set of changed
statements
A composite
code change
11. Approach
Partition of the patch
IBM T.J. Watson
Libraries for Analysis
(WALA)*
Inter-procedural
Backward static slicing
*http://wala.sourceforge.net/wiki/in
dex.php/Main_Page
11
Formatting
Dependency
Similarity
A set of changed
statements
A composite
code change
12. Approach
Partition of the patch“protect array entries against
corruption by returning a clone”
Same change type
Similar delta
12
Formatting
Dependency
Similarity
A set of changed
statements
A composite
code change
P
P’
13.
14. Evaluation
• 78 composite code changes from the previously inspected data
• 3 human evaluators establish manual partitions for these changes
• Automatic partition results are compared to manual partitions
• Considered acceptableif it exactly matched the manual partition
82%
13%
5%
Xerces
92%
7%
1%
Ant
82%
12%
6%
Commons Math
71%
18%
11%
JFreeChart
1 issue
2 issues
> 2 issues
15. Evaluation
• 78 composite code changes from the previously inspected data
• 3 human evaluators establish manual partitions for these changes
• Automatic partition results are compared to manual partitions
• Considered acceptableif it exactly matched the manual partition
82%
13%
5%
Xerces
92%
7%
1%
Ant
82%
12%
6%
Commons Math
71%
18%
11%
JFreeChart
1 issue
2 issues
> 2 issues
16. Evaluation
• 78 composite code changes from the previously inspected data
• 3 human evaluators establish manual partitions for these changes
• Automatic partition results are compared to manual partitions
• Considered acceptableif it exactly matched the manual partition
Acceptable # / Total #
Ant 8 / 11
Commons Math 10 / 19
Xerces 16 / 21
JFreeChart 20 / 27
54 / 78 (69%)
17. Ant revision 943068 (24 changed LOC)
“Wrong assignment after I renamed the parameter. Unfortunately
there doesn’t seem to be a testcase that catches the error.”
Later fixed in revision 943070
17
One of the two change-slices after partitioning
18. Preliminary User Study
• RQ3
• Can our automatic partition help developers better review composite changes?
• Participants
• 18 CS graduate students
• Task
• Participants review 12 composite code changes
• Answer a series of code review questions [1], e.g.,
• “What is the consequence of removing the schemaType field?”
• “What do changes in these files have in common?”
18
[1]“Questions programmers ask during software evolution tasks” Sillito et al. FSE 2006
23. Related Work
• Helping developers help themselves: Automatic decomposition of
code review changesets. Barnett et al. ICSE 2015
• The industrial perspective of composite code changes
• The impact of tangled code changes. Kim Herzig and Andreas Zeller.
MSR 2013
• Filtering Noise in Mixed-Purpose Fixing Commits to Improve Defect
Prediction and Localization. Nguyen et al. ISSRE 2013
• How composite code changes affect defect prediction
24. Acknowledgment
• We thank all the students that participated in our user study.
• We especially thank Tao He and Hai Wan for their kind help of
arranging the user study.
• We also thank Ananya Kanjilal for her comments on the paper draft.
Editor's Notes
The focus of this session, and of course this work, is code review, which in modern software development typically practiced on lightweight code changes.
Characteristics of changes directly affect how code review is practice
Considering two types of code changes/commits/revisions.
The second change fixed 3 bugs, did some refactoring….
Which one is more acceptable from a code reviewers’ perspective?
From our previous study, we found evidences that CCC are … compared to atomic changes.
And this is the particular problem we try to tackle with in this work
Specifically, we want to answer these 3 questions.
We first investigate whether the problem of CCC is really prominent by taking a look the revision history of these 4 OS projects.
We inspected revisions that changed at least 2 lines of code within this time period, that results in 453 revisions in total
We mainly inspect the commit log to see how many logical issues are addressed.
Filter criteria: multiple lines of source code changes, compilable,
Here is what we observed. Blue means the revision is atomic, that is, it addresses one single issue. And orange and grey mean that the revision addresses 2 or more than 2 issues.
Among the revisions that changed multiple source code lines, from 8% - 29% , on average 17% patches are composite, that is, they address 2 or more logical issues.
non-trivial
So next, we propose an approach to partition a composite code change, so that in the end each subset is more semantically cohesive and hopefully easier to review.
First, a code change is modeled as a set of changed statements.
Basically, we identify relations between statements to derive change partition. Statements in the same partition (change-slice) are related while statements in different subsets are not
Now the core of our approach is this black box: how do we …
We use 3 heuristics here
First, we believe that formatting changes should be on their own instead of mingled with other type of changes, such as bug fixes.
So we compute textual and AST differencing results of the change.
Changed lines that are not identified in AST differencing but in the textual differencing are considered as formatting changes
Second, we believe that changed statements that have static dependency should belong to the same subset.
We use WALA to perform … on changed lines so that we know whether two changes have static dependency.
Third, we believe that changes with similar patterns are also related.
For example, these two functions both change the returning of a variable the same way. Although they don’t have static dependency, it’s obvious that they are serving the same purpose…
So, as the 3rd heuristic, we check if two changes have same change type (e.g., if they both update the return statement, which we can tell from its AST node type)
If so, whether they have string delta is above a threshold (0.8 by python library default)
Here is an example of how our approach works. Before the change and after the change. Some code is added with the plus sign, some is deleted with the minus sign, others are modified with the caret sign
Based on our heuristics, this change is partitioned into 3 change-slices, marked with different colors.
First, the orange part is formatting changes
Another change-slice is the green part of the change, …
Finally, these two added methods also form a partition cuz their method names are similar.
Remember in RQ1, we inspected over 400 revisions and some of them address multiple logical issues?
We used these 78 revisions, marked in yellow and grey, in our evaluation.
We compare the automatic partitions produced by our approach to the manual partitions. An automatic partition for a change is considered acceptable if …
We are not claiming that the manual partition by human evaluators is the only way of partitioning a change, because code semantics are subject to …. In other words, the solution space for partitioning is potentially infinite.
However, our goal is to facilitate code review. So if our automatic partition results match a common way of manual partition, then we say it’s acceptable.
Among the 78 CCC, 54 are automatically partitioned the same way as human evaluators did.
Yet, this brings up our third research question: does this automatic partition really help code review?
We’ve observed several cases that indicate the benefits of our approach. Here is one of them.
Our approach partitions a Ant revision, which changes 24 LOC, into 2 subsets. One subset pinpoints a buggy area, which was later fixed by developers.
Specifically, developers said “no testcase…”
We think that our patch partition may help make this bug more obvious and identifiable from a big, composite code change.
We also conduct a user study to investigate…?
We recruit … to review 12 composite … from our evaluation data and answer
There are 2 treatments.
The control condition is that participants review … by file, as they usually do
The experimental condition is that .. by partition
We analyze the correctness of their answers and the time they spent on coming up the answers.
We found that the group using partition to review patches performs significantly better than the control group in answering questions correctly, although they spend a slightly, but not significantly longer time.
This result is also promising in showing that our approach potentially help the patch review process
One-tailed Paired-samples t-test
Time no significant different. Participants even spent a slightly longer time under the by slice treatment. We conjecture that such a time increase mainly resulted from participants’ additional learning cost for using the unfamiliar “partitions” to understand patches.
69% composite changes are automatically partitioned the same way as human evaluators did
Our user study shows Participants’ code review effectiveness is significantly improved
Finally, I would like to bring up these issues that are highly relevant and important, yet we haven’t been able to address them in this work yet.
For example ….
I think these would to interesting to discussion and I’ll leave it to the discussion slot.
Cost: learning cost, execution cost
Hypothesis: size and complexity