AN INSIGHT INTO THE PULL
REQUESTS OF GITHUB
Mohammad Masudur Rahman, Chanchal K. Roy
Department of Computer Science
University of Saskatchewan
11th Working Conference on Mining Software Repositories (MSR 2014), Challenge Track
Hyderabad, India
RESEARCH PROBLEM: HIGHER RATE OF PULL
REQUEST FAILURE IN GITHUB
[Diagram: base repo., forked repos., pull requests]
• 88 base repositories
• 103,192+ forked repositories
• 20,142 developers
• 78,955 pull requests made in 4+ years
• Only 42.95% of pull request commits were merged
• About 57.05% of pull requests failed
RQ: Why and how did those pull requests fail?
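As a rough check on the figures above, the reported merge percentage can be converted into approximate counts. This is a minimal sketch: it assumes the 42.95% applies to all 78,955 pull requests, and the derived counts are rounded estimates, not numbers stated on the slide. The function name `split_counts` is illustrative.

```python
# Figures from the slide; the derived counts are estimates, not reported values.
TOTAL_PULL_REQUESTS = 78_955
MERGED_PERCENT = 42.95


def split_counts(total: int, merged_percent: float) -> tuple[int, int]:
    """Return the (merged, failed) counts implied by a merge percentage."""
    merged = round(total * merged_percent / 100)
    return merged, total - merged


merged, failed = split_counts(TOTAL_PULL_REQUESTS, MERGED_PERCENT)
print(f"merged ~ {merged}, failed ~ {failed}")
```

Under that assumption, roughly 34 thousand requests were merged and roughly 45 thousand failed.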
ASPECTS OF STUDY
• Studied and analyzed 7 aspects related to technical problems in the pull requests, programming languages, projects and developers:
• Technical issues in pull request commits
• Programming language
• Application domain
• Age of project
• Maturity of project
• Number of developers
• Experience of developers
WHICH TECHNICAL PROBLEMS HINDERED THE
SUCCESS OF THE PULL REQUESTS?
2. Recursion & Refactoring (7.57%, 10.78%)
3. Database query execution (6.98%, 9.18%)
16. Arrays & functions (14.40%, 17.29%)
29. Actor model (7.11%, 5.11%)
31. OOP paradigm (7.12%, 9.17%)
33. Space & indentation (3.07%, 7.32%)
Arrays & functions (31.69%)
Recursion & Refactoring (18.35%)
Database query execution (16.16%)
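The speaker notes for this slide say the topics came from LDA topic modeling with Gibbs sampling over pull request commit comments. The sketch below shows what a collapsed Gibbs sampler for LDA can look like on a toy corpus. It is not the authors' implementation or tooling: the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, the iteration count, and the toy comments are all assumptions made for illustration.

```python
import random
from collections import Counter


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized documents.

    Returns (doc_topic, topic_word): per-document topic counts and
    per-topic word counts after the final sweep.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assignments = []

    # Assign a random initial topic to every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Unnormalized full conditional p(z = k | all other tokens).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word


# Hypothetical, pre-tokenized comments; not taken from the dataset.
toy_comments = [
    "fix array index in function".split(),
    "function returns wrong array".split(),
    "database query fails on execution".split(),
    "slow query execution in database".split(),
]
doc_topic, topic_word = lda_gibbs(toy_comments, n_topics=2)
for k, counts in enumerate(topic_word):
    print(k, [w for w, _ in counts.most_common(3)])
```

In a study like this one, the most frequent words of each topic would then be inspected and labeled by hand, which matches the notes' remark that only some of the retrieved topics could be labeled.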
DID AN AVERAGE PROJECT FROM DIFFERENT
PROGRAMMING LANGUAGES SHOW DIFFERENT
BEHAVIOUR IN TERMS OF PULL REQUEST?
Pairs are (merged/m, failed/m):
• Ruby (16.92/m, 40.11/m)
• PHP (21.72/m, 21.21/m)
• Java (2.75/m, 13.21/m)
• Scala (10.39/m, 4.08/m)
• C (11.89/m, 6.72/m)
• JavaScript (5.92/m, 14.87/m)
Highlighted: Ruby (57.03/m), PHP (42.93/m), Java (15.96/m)
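The per-language figures can be combined into a monthly total and a failed share. This sketch assumes each pair on the slide is (merged per month, failed per month), which agrees with the speaker notes and with the highlighted values (for example, 16.92 + 40.11 = 57.03 for Ruby). The names `RATES` and `summarize` are illustrative.

```python
# (merged per month, failed per month), as read from the slide.
RATES = {
    "Ruby": (16.92, 40.11),
    "PHP": (21.72, 21.21),
    "Java": (2.75, 13.21),
    "Scala": (10.39, 4.08),
    "C": (11.89, 6.72),
    "JavaScript": (5.92, 14.87),
}


def summarize(rates):
    """Map each language to (total per month, failed share of the total)."""
    summary = {}
    for lang, (merged, failed) in rates.items():
        total = merged + failed
        summary[lang] = (round(total, 2), round(failed / total, 3))
    return summary


summary = summarize(RATES)
# Print languages from the highest to the lowest monthly total.
for lang, (total, failed_share) in sorted(summary.items(), key=lambda kv: -kv[1][0]):
    print(f"{lang:<10} total={total:>6.2f}/m  failed share={failed_share:.1%}")
```

The failed share makes the contrast easy to see: it sits just under one half for PHP and well above one half for Ruby, Java and JavaScript.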
WAS THERE A DOMAIN-SPECIFIC TREND IN
PULL REQUESTS?
Pairs are (merged/m, failed/m):
• Framework (20.67/m, 15.49/m)
• IDE (19.43/m, 9.31/m)
• Client Apps (10.27/m, 6.37/m)
• Database (1.40/m, 3.94/m)
• Statistics (1.15/m, 0.80/m)
• Library (6.59/m, 9.18/m)
Highlighted: Framework (36.16/m), IDE (28.84/m)
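If the highlighted values are read as merged plus failed, which is an assumption based on how the numbers add up on the other slides, the two domain callouts can be checked against their components. The sketch below flags any callout that differs from the sum by more than a small tolerance. The tolerance of 0.02 and the names `DOMAINS` and `inconsistent` are illustrative choices.

```python
# (merged/m, failed/m, highlighted value/m), as read from the slide.
DOMAINS = {
    "Framework": (20.67, 15.49, 36.16),
    "IDE": (19.43, 9.31, 28.84),
}


def inconsistent(rows, tolerance=0.02):
    """Return the names whose merged + failed differs from the stated value."""
    return [
        name
        for name, (merged, failed, stated) in rows.items()
        if abs(merged + failed - stated) > tolerance
    ]


flagged = inconsistent(DOMAINS)
print(flagged)
```

The IDE components add up to 28.74 rather than 28.84, so one of the three IDE figures may contain a typo. The slides alone do not show which one.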
HOW DID PROJECT AGE AFFECT PULL
REQUEST RATE?
2012-2013 (43.12/m)
2009-2010 (19.34/m)
HOW DID PROJECT MATURITY AFFECT THE
PULL REQUEST RATE?
Pairs are (merged/m, failed/m):
• Forks: 500-1000 (7.43/m, 8.12/m)
• Forks: 2000-2500 (25.02/m, 35.29/m)
• Forks: 2500-3000 (18.01/m, 23.85/m)
• Forks: 3000-6800 (17.61/m, 61.02/m)
Highlighted: Forks: 3000-6800 (78.63/m), Forks: 1500-2000 (18/m)
HOW DID NO. OF DEVELOPERS OF A PROJECT
MATTER IN PULL REQUEST RATE?
Pairs are (merged/m, failed/m):
• #Developers: 400-500 (12.17/m, 15.39/m)
• #Developers: 500-1000 (25.14/m, 43.39/m)
• #Developers: 1000-2000 (73.23/m, 48.11/m)
• #Developers: 4000-4500 (2.23/m, 244.88/m)
Highlighted: #Developers: 4000-4500 (247.1/m)
DID DEVELOPER EXPERIENCE MATTER IN
PULL REQUEST RATE?
Pairs are (merged/m, failed/m):
• Dev. Experience: 30-40 months (297.56/m, 222.14/m)
• Dev. Experience: 50-60 months (55.75/m, 334.79/m)
• Dev. Experience: 60-70 months (84.17/m, 65.63/m)
Highlighted: Dev. Experience: 50-60 months (390.54/m)
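The speaker notes for this slide describe averaging the experience of all developers of a project and grouping projects into six ranges between 10 and 70 months. A minimal sketch of that grouping follows. The 10-month bucket width, the half-open boundaries, the function names, and the sample projects are assumptions for illustration, not details confirmed by the slides.

```python
from collections import defaultdict
from statistics import mean


def experience_range(avg_months, low=10, high=70, width=10):
    """Return the (start, end) bucket for an average, or None outside [low, high)."""
    if not low <= avg_months < high:
        return None
    start = low + int((avg_months - low) // width) * width
    return (start, start + width)


def group_projects(projects):
    """Group project names by the bucket of their mean developer experience."""
    groups = defaultdict(list)
    for name, dev_months in projects.items():
        groups[experience_range(mean(dev_months))].append(name)
    return dict(groups)


# Hypothetical projects with per-developer experience in months.
projects = {
    "project-a": [12, 30, 48],  # mean 30 -> bucket (30, 40)
    "project-b": [55, 58, 52],  # mean 55 -> bucket (50, 60)
    "project-c": [61, 69],      # mean 65 -> bucket (60, 70)
}
groups = group_projects(projects)
print(groups)
```

Once projects are grouped this way, the merged and failed pull request rates can be averaged within each bucket to produce per-range figures like those on the slide.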
TAKE-AWAY MESSAGES
 57.05% of the pull requests failed. The issues that
failed the requests to merge are related to a limited
number of topics—recursion & refactoring,
database query execution, arrays & functions and
so on.
 Projects written in Java, JavaScript and Ruby
received exceptionally higher no. of failed pull
requests. PHP projects received almost equal no.
of successful and unsuccessful pull requests on
average per month.
 Projects from IDE and Framework domain showed
the maximum activities in terms of pull requests.
TAKE-AWAY MESSAGES
• As the age of a project increases, both merged and failed pull request rates increase almost proportionally.
• As the number of forks increased, the average number of pull requests per month did not increase regularly. However, projects with 2000+ forks received a higher number of failed pull requests.
• As new developers joined a project, the number of pull requests per month did not increase regularly. However, a project with 4000+ developers received an excessive number of failed pull requests.
• Projects whose developers had 20-50 months of experience showed the most pull request activity.
THANK YOU!!


Editor's Notes

  1. (Introduce yourself.) Today, I am going to talk about the findings from our mining of GitHub pull requests.
  2. The challenge dataset contains data about 88 base repositories and about 103,192 forked repositories, involving about 20 thousand developers. Developers usually create forks and submit their code to the base repository as pull requests. Statistics show that about 79 thousand pull requests were made to those 88 base projects within a time span of three years. Only 42.95% of the requests were accepted and had their commits successfully merged. The remaining 57.05% of the requests failed, which is a matter of concern. In this research, we investigate why pull requests succeed or fail on GitHub.
  3. We identify an intuitive list of 7 factors and investigate whether they have any notable influence on the success or failure of pull requests.
  4. Question we asked: Which types of technical issues hinder developers from getting their pull requests merged? We collected the commit comments of 9,421 pull requests made to 78 base repositories. We applied LDA topic modeling with Gibbs sampling, retrieved 100 topics, and labeled 64 of them. We found 8 dominant topics, six of which could be labeled; these are shown here. In the commit discussions of pull requests, certain topics come up frequently, such as recursion and refactoring, database query execution, and arrays and functions. Even space and indentation is among the frequently discussed topics.
  5. Question we asked: Does an average project show different pull request behaviour depending on its programming language? We chose 10 programming languages with a reasonable number of base projects (maximum 10, minimum 3). We then computed the average number of pull requests made to a base project each month. The behaviour differs across languages. For example, Ruby projects received the maximum number of pull requests each month, and R and Java projects received the minimum. We also note an interesting pattern for PHP projects: their successful and failed pull requests are almost equal. Ruby projects, on the other hand, have a relatively higher number of failed pull requests.
  6. Question we asked: Does the application domain matter for the success or failure of pull requests? We identified seven major domains by consulting the README descriptions of the projects, and determined the average number of pull requests received per month by a project from each domain. We found that framework and reusable-library projects are dominant in frequency. We note that framework and IDE projects received a higher rate of pull requests each month than projects of other domains. For example, IDE projects received 19.43 successful requests per month, whereas database projects received only 1.40 successful requests per month.
  7. We also investigate how the age of a base project (i.e., how long it has been on GitHub) affects the number of pull requests it receives per month. We found that the average pull request rate of a project increases regularly over time. This finding is intuitive: forks and the developer pool grow over time, and so does the number of pull requests. However, the dataset shows that the earliest pull requests date from October 2010, although the project had existed since February 2008. It is notable that both successful and failed pull requests grew over time, which means that both developers and project management need to pay attention to the issue of failed pull requests.
  8. We consider the number of forks of a base project as a heuristic estimate of its maturity. Question we asked: How does the number of forks of a base project contribute to its pull request rate? We did not find a regular change in pull request rate as new forks were added to the base project. However, projects with more than 2000 forks show a higher rate of failed pull requests. For example, we found 19 base projects with more than 2000 forks; 7 of them have more than 3000 forks, and they show an extremely high rate of unsuccessful pull requests.
  9. It is intuitive that a base project with a higher number of developers is likely to receive a higher number of pull requests. However, our findings do not support that intuition much. We did not find any regular increase in pull request rate as the number of developers increased. Moreover, we note that 8 projects with more than 500 developers received an increasing number of unsuccessful pull requests. For example, one project with 4000+ developers received an extreme number of failed pull requests each month.
  10. We consider the working experience of the developers of a project as an important factor that is likely to contribute to the pull request rates. We average the experience of all the developers of a project, and determine six ranges from 10 months to 70 months. We note that 53 projects whose developers have an average experience of 20-50 months received the maximum number of pull requests on average each month. However, 10 projects whose developers have 60-70 months of experience showed relatively lower activity. More interestingly, 10 projects with 50-60 months of developer experience showed extremely high unsuccessful pull request rates.
  11. From the mining, we find the following take-away messages.