Slides of the paper:
Muhammad Ilyas Azeem, Sebastiano Panichella, Andrea Di Sorbo, Alexander Serebrenik, and Qing Wang: Action-based Recommendation in Pull-request Development. International Conference on Software and System Processes (ICSSP 2020).
Action-based Recommendation in Pull-request Development
1. Action-based Recommendation in Pull-request Development
Muhammad Ilyas Azeem, Sebastiano Panichella, Andrea Di Sorbo, Alexander Serebrenik, Qing Wang
Institute of Software, Chinese Academy of Sciences
2. Popular GitHub Open-Source Projects
Popular open-source projects receive numerous pull requests daily
E.g., Kubernetes receives more than 500 pull requests daily
3. Issues for Integrators
The job of the integrator is critical:
Ensuring software quality
Communicating with contributors
Manual selection of PRs:
Requires more effort & time
Especially when integrators have a large workload & limited resources
4. Proposed Solution
CARTESIAN (aCceptance And Response classificaTion-based requESt IdentificAtioN)
CARTESIAN recommends one of three actions on PRs:
Accept, Respond, and Reject
To implement CARTESIAN we followed two steps:
Feature extraction process
Classification model
5. Feature Extraction Process
PRs were crawled from 19 popular GitHub projects
Features were extracted along the following four dimensions:
Pull request, Project, Contributor, and Integrator
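To make the four dimensions concrete, here is a minimal sketch of what assembling a per-PR feature vector along them could look like. This is not the authors' implementation: every feature name and input field below is a hypothetical example of the kind of signal each dimension may contain.

```python
def extract_features(pr):
    """Flatten a crawled PR record into one feature vector,
    grouped by the four dimensions named on the slide.
    All field/feature names are illustrative assumptions."""
    return {
        # Pull request dimension: properties of the change itself
        "num_commits": pr["commits"],
        "lines_changed": pr["additions"] + pr["deletions"],
        # Project dimension: properties of the target project
        "project_age_days": pr["project_age_days"],
        # Contributor dimension: track record of the submitter
        "prev_prs_accepted": pr["contributor_accepted"],
        # Integrator dimension: current load of the reviewer
        "integrator_workload": pr["open_prs_assigned"],
    }

sample_pr = {
    "commits": 3, "additions": 120, "deletions": 40,
    "project_age_days": 900, "contributor_accepted": 7,
    "open_prs_assigned": 25,
}
features = extract_features(sample_pr)
```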
7. Classification Model
CARTESIAN models PR recommendation as a multi-class classification problem.
CARTESIAN recommends one of three actions on PRs:
Accept: PRs accepted without any discussion
Respond: PRs accepted after discussion with the contributors
Reject: PRs that have not been accepted
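The three classes above can be derived from a PR's observed outcome. A hedged sketch of such a labeling rule follows (the function name and its two inputs are assumptions, not the paper's code):

```python
def label_pr(merged: bool, num_discussion_comments: int) -> str:
    """Map a PR's observed outcome to one of the three
    CARTESIAN classes described on the slide."""
    if merged and num_discussion_comments == 0:
        return "accept"    # accepted without any discussion
    if merged:
        return "respond"   # accepted after discussion with the contributors
    return "reject"        # not accepted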
8. Experimental Design
Dataset Overview
Crawled popular GitHub projects belonging to various domains and programming languages
Pull request time span: from each project's creation to February 2018
Data collected via the GitHub REST API v3
10. Experiment I (RQ1)
Seven classifiers were trained:
Logistic Regression, SVM, Random Forest, Decision Trees, Naive Bayes, K-Nearest Neighbor, and XGBoost
Feature selection: via feature importance analysis
Evaluation metrics: Accuracy, Recall, Precision, F-Measure
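The evaluation metrics named above can be computed per class and macro-averaged across the three classes. The sketch below shows one standard way to do that from true and predicted labels; it is illustrative, not the study's evaluation code.

```python
from collections import Counter

def macro_scores(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F-measure
    for a multi-class prediction task."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted p, but true class was t
            fn[t] += 1  # true class t was missed
    prec, rec = [], []
    for c in labels:
        prec.append(tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0)
        rec.append(tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0)
    precision = sum(prec) / len(labels)
    recall = sum(rec) / len(labels)
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return accuracy, precision, recall, f1
```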
11. Experiment II (RQ2)
CARTESIAN Assessment:
1. First, we compared CARTESIAN with baseline models based on the prioritizing criteria studied by Gousios et al.:
the FIFO model and the Size-Based Model (SBM)
2. Second, we performed a qualitative analysis of the top-20 recommended PRs
Evaluation metrics: Mean Average Precision (MAP) and Average Recall (AR)
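Since MAP is the key ranking metric here, a short sketch of how it is computed over ranked PR lists may help; the relevance judgments in the test are toy data, not the study's.

```python
def average_precision(ranked_relevant):
    """AP for one ranked list; ranked_relevant is a list of booleans,
    top of the ranking first."""
    hits, score = 0, 0.0
    for i, rel in enumerate(ranked_relevant, start=1):
        if rel:
            hits += 1
            score += hits / i  # precision at this cut-off
    return score / hits if hits else 0.0

def mean_average_precision(rankings):
    """MAP: mean of AP over several ranked lists
    (e.g., one per project)."""
    return sum(average_precision(r) for r in rankings) / len(rankings)
```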
12. Results for RQ1
XGBoost outperformed the rest of the classifiers
XGBoost was selected as the final classifier for CARTESIAN
CARTESIAN achieved an average precision and recall of 86%
13. Features Importance Analysis
The number of review & discussion comments, the role of the submitter, and the number of participants in the discussion are the most relevant features
The classification accuracy is largely driven by features in the Contributor and Integrator dimensions
15. Results for RQ2
Qualitative analysis shows that CARTESIAN recommends useful PRs to the integrator, e.g., bug fixes and new feature requests
16. Conclusion
CARTESIAN can be helpful for integrators of popular GitHub projects
It achieved good results: an average precision and recall of about 86%
Besides, CARTESIAN prioritizes useful PRs at the top of the list
17. Future Work
Our plan is to:
Integrate CARTESIAN into GitHub
Evaluate its usefulness, and
Discover additional factors (quality metrics) that can be used to improve its performance