SlideShare a Scribd company logo
1 of 12
BIASES
IN MSR
ALEXANDER SEREBRENIK
PROJECTS
PEOPLE
PROBLEMS
GitHub
20%
Eclipse
8%
Closed
source
8%
Apache
7%
Mozilla
5%
Stack
Overflow
4%
Other OSS
49%
Samuel W. Flint, Jigyasa Chauhan, Robert Dyer: Escaping the Time Pit: Pitfalls and Guidelines for Using Time-Based
Git Data. MSR 2021: 85-96
2004-2020
GitHub
20%
Eclipse
8%
Closed
source
8%
Apache
7%
Mozilla
5%
Stack
Overflow
4%
Other
OSS
49%
Samuel W. Flint, Jigyasa Chauhan, Robert Dyer: Escaping the Time Pit: Pitfalls and Guidelines for Using Time-Based
Git Data. MSR 2021: 85-96
GitHub
35%
Eclipse
2%
Closed
source
5%
Apache
2%
Mozilla
0%
Stack
Overflow
19%
Other OSS
21%
Non-SW
16%
2004-2020 2021
https://cyberhoot.com/cybrary/closed-source/
0
45
90
135
180
1-5 11-20 51-100 501-1000
https://www.dutchsoftware.nl/isvs/
Adyen
https://github.com/bacchin/chemical_engineering_python/blob/master/M2_Colloid_DLVO.ipynb
https://www.flickr.com/photos/poughkeepsiedayschool/17892462450
Science Engineering
Science Engineering
Science Engineering
BIASES
IN MSR
@ASEREBRENIK
PROJECTS
PEOPLE
PROBLEMS

More Related Content

Similar to Bias in MSR Research

Semantic Web: In Quest for the Next Generation Killer Apps
Semantic Web: In Quest for the Next Generation Killer AppsSemantic Web: In Quest for the Next Generation Killer Apps
Semantic Web: In Quest for the Next Generation Killer AppsJie Bao
 
Structured Data and Schema.org: The Hair Metal Edition
Structured Data and Schema.org: The Hair Metal Edition Structured Data and Schema.org: The Hair Metal Edition
Structured Data and Schema.org: The Hair Metal Edition Matthew Brown
 
2016 - IGNITE - Real Heroes Draw Pictures
2016 - IGNITE - Real Heroes Draw Pictures2016 - IGNITE - Real Heroes Draw Pictures
2016 - IGNITE - Real Heroes Draw Picturesdevopsdaysaustin
 
分享無名小站 API
分享無名小站 API分享無名小站 API
分享無名小站 APIJoseph Chiang
 
Preserving a Web of Linked Data: Lessons and challenges from a fading web
Preserving a Web of Linked Data: Lessons and challenges from a fading webPreserving a Web of Linked Data: Lessons and challenges from a fading web
Preserving a Web of Linked Data: Lessons and challenges from a fading webMiel Vander Sande
 
Open source-secret-sauce-rit-2010
Open source-secret-sauce-rit-2010Open source-secret-sauce-rit-2010
Open source-secret-sauce-rit-2010Ted Husted
 
Augury and Omens Aside, Part 1:
 The Business Case for Apache Mesos
Augury and Omens Aside, Part 1:
 The Business Case for Apache MesosAugury and Omens Aside, Part 1:
 The Business Case for Apache Mesos
Augury and Omens Aside, Part 1:
 The Business Case for Apache MesosPaco Nathan
 
EclipseCon France 2018 report
EclipseCon France 2018 reportEclipseCon France 2018 report
EclipseCon France 2018 reportAkira Tanaka
 
GitHub Data and Insights
GitHub Data and InsightsGitHub Data and Insights
GitHub Data and InsightsJeff McAffer
 
Mining Social Web APIs with IPython Notebook (Data Day Texas 2015)
Mining Social Web APIs with IPython Notebook (Data Day Texas 2015)Mining Social Web APIs with IPython Notebook (Data Day Texas 2015)
Mining Social Web APIs with IPython Notebook (Data Day Texas 2015)Matthew Russell
 
FOSDEM 2017: Making Your Own Open Source Raspberry Pi HAT
FOSDEM 2017: Making Your Own Open Source Raspberry Pi HATFOSDEM 2017: Making Your Own Open Source Raspberry Pi HAT
FOSDEM 2017: Making Your Own Open Source Raspberry Pi HATLeon Anavi
 
"Will Git Be Around Forever? A List of Possible Successors" at UtrechtJUG
"Will Git Be Around Forever? A List of Possible Successors" at UtrechtJUG"Will Git Be Around Forever? A List of Possible Successors" at UtrechtJUG
"Will Git Be Around Forever? A List of Possible Successors" at UtrechtJUG🎤 Hanno Embregts 🎸
 
Keypoints html5
Keypoints html5Keypoints html5
Keypoints html5dynamis
 
Hacktoberfest Kolkata 2022.pdf
Hacktoberfest Kolkata 2022.pdfHacktoberfest Kolkata 2022.pdf
Hacktoberfest Kolkata 2022.pdfSawanBhattacharya
 
A "lofiAPI": Using open source applications and simple XML to build a library...
A "lofiAPI": Using open source applications and simple XML to build a library...A "lofiAPI": Using open source applications and simple XML to build a library...
A "lofiAPI": Using open source applications and simple XML to build a library...jason clark
 
CoC23_ Looking at the New Features of Apache NiFi
CoC23_ Looking at the New Features of Apache NiFiCoC23_ Looking at the New Features of Apache NiFi
CoC23_ Looking at the New Features of Apache NiFiTimothy Spann
 
CoC23_ Looking at the New Features of Apache NiFi
CoC23_ Looking at the New Features of Apache NiFiCoC23_ Looking at the New Features of Apache NiFi
CoC23_ Looking at the New Features of Apache NiFissuser73434e
 

Similar to Bias in MSR Research (20)

Semantic Web: In Quest for the Next Generation Killer Apps
Semantic Web: In Quest for the Next Generation Killer AppsSemantic Web: In Quest for the Next Generation Killer Apps
Semantic Web: In Quest for the Next Generation Killer Apps
 
Structured Data and Schema.org: The Hair Metal Edition
Structured Data and Schema.org: The Hair Metal Edition Structured Data and Schema.org: The Hair Metal Edition
Structured Data and Schema.org: The Hair Metal Edition
 
2016 - IGNITE - Real Heroes Draw Pictures
2016 - IGNITE - Real Heroes Draw Pictures2016 - IGNITE - Real Heroes Draw Pictures
2016 - IGNITE - Real Heroes Draw Pictures
 
Walter api
Walter apiWalter api
Walter api
 
Open POWER Cores
Open POWER Cores Open POWER Cores
Open POWER Cores
 
分享無名小站 API
分享無名小站 API分享無名小站 API
分享無名小站 API
 
Preserving a Web of Linked Data: Lessons and challenges from a fading web
Preserving a Web of Linked Data: Lessons and challenges from a fading webPreserving a Web of Linked Data: Lessons and challenges from a fading web
Preserving a Web of Linked Data: Lessons and challenges from a fading web
 
Open source-secret-sauce-rit-2010
Open source-secret-sauce-rit-2010Open source-secret-sauce-rit-2010
Open source-secret-sauce-rit-2010
 
Augury and Omens Aside, Part 1:
 The Business Case for Apache Mesos
Augury and Omens Aside, Part 1:
 The Business Case for Apache MesosAugury and Omens Aside, Part 1:
 The Business Case for Apache Mesos
Augury and Omens Aside, Part 1:
 The Business Case for Apache Mesos
 
EclipseCon France 2018 report
EclipseCon France 2018 reportEclipseCon France 2018 report
EclipseCon France 2018 report
 
GitHub Data and Insights
GitHub Data and InsightsGitHub Data and Insights
GitHub Data and Insights
 
Mining Social Web APIs with IPython Notebook (Data Day Texas 2015)
Mining Social Web APIs with IPython Notebook (Data Day Texas 2015)Mining Social Web APIs with IPython Notebook (Data Day Texas 2015)
Mining Social Web APIs with IPython Notebook (Data Day Texas 2015)
 
FOSDEM 2017: Making Your Own Open Source Raspberry Pi HAT
FOSDEM 2017: Making Your Own Open Source Raspberry Pi HATFOSDEM 2017: Making Your Own Open Source Raspberry Pi HAT
FOSDEM 2017: Making Your Own Open Source Raspberry Pi HAT
 
"Will Git Be Around Forever? A List of Possible Successors" at UtrechtJUG
"Will Git Be Around Forever? A List of Possible Successors" at UtrechtJUG"Will Git Be Around Forever? A List of Possible Successors" at UtrechtJUG
"Will Git Be Around Forever? A List of Possible Successors" at UtrechtJUG
 
Keypoints html5
Keypoints html5Keypoints html5
Keypoints html5
 
Hacktoberfest Kolkata 2022.pdf
Hacktoberfest Kolkata 2022.pdfHacktoberfest Kolkata 2022.pdf
Hacktoberfest Kolkata 2022.pdf
 
A "lofiAPI": Using open source applications and simple XML to build a library...
A "lofiAPI": Using open source applications and simple XML to build a library...A "lofiAPI": Using open source applications and simple XML to build a library...
A "lofiAPI": Using open source applications and simple XML to build a library...
 
CoC23_ Looking at the New Features of Apache NiFi
CoC23_ Looking at the New Features of Apache NiFiCoC23_ Looking at the New Features of Apache NiFi
CoC23_ Looking at the New Features of Apache NiFi
 
CoC23_ Looking at the New Features of Apache NiFi
CoC23_ Looking at the New Features of Apache NiFiCoC23_ Looking at the New Features of Apache NiFi
CoC23_ Looking at the New Features of Apache NiFi
 
Hacking 101
Hacking 101Hacking 101
Hacking 101
 

More from Alexander Serebrenik

Software development is a human activity: understanding software requires und...
Software development is a human activity: understanding software requires und...Software development is a human activity: understanding software requires und...
Software development is a human activity: understanding software requires und...Alexander Serebrenik
 
Towards Continuous Performance Assessment of Java Applications With PerfBot
Towards Continuous Performance Assessment of Java Applications With PerfBotTowards Continuous Performance Assessment of Java Applications With PerfBot
Towards Continuous Performance Assessment of Java Applications With PerfBotAlexander Serebrenik
 
“STILL AROUND”: Experiences and Survival Strategies of Veteran Women Software...
“STILL AROUND”: Experiences and Survival Strategies of Veteran Women Software...“STILL AROUND”: Experiences and Survival Strategies of Veteran Women Software...
“STILL AROUND”: Experiences and Survival Strategies of Veteran Women Software...Alexander Serebrenik
 
A Qualitative Study of Developers’ Discussions of Their Problems and Joys Dur...
A Qualitative Study of Developers’ Discussions of Their Problems and Joys Dur...A Qualitative Study of Developers’ Discussions of Their Problems and Joys Dur...
A Qualitative Study of Developers’ Discussions of Their Problems and Joys Dur...Alexander Serebrenik
 
Emotion Analysis in Software Ecosystems
Emotion Analysis in Software EcosystemsEmotion Analysis in Software Ecosystems
Emotion Analysis in Software EcosystemsAlexander Serebrenik
 
Investigating the Resolution of Vulnerable Dependencies with Dependabot Secur...
Investigating the Resolution of Vulnerable Dependencies with Dependabot Secur...Investigating the Resolution of Vulnerable Dependencies with Dependabot Secur...
Investigating the Resolution of Vulnerable Dependencies with Dependabot Secur...Alexander Serebrenik
 
Gender and Age in Software Engineering
Gender and Age in Software EngineeringGender and Age in Software Engineering
Gender and Age in Software EngineeringAlexander Serebrenik
 
Diversity and inclusion in a CS classroom
Diversity and inclusion in a CS classroomDiversity and inclusion in a CS classroom
Diversity and inclusion in a CS classroomAlexander Serebrenik
 
An Empirical Assessment on Merging and Repositioning of Static Analysis Alarms
An Empirical Assessment on Merging and Repositioning of Static Analysis AlarmsAn Empirical Assessment on Merging and Repositioning of Static Analysis Alarms
An Empirical Assessment on Merging and Repositioning of Static Analysis AlarmsAlexander Serebrenik
 
Classification and Ranking of Delta Static Analysis Alarms
Classification and Ranking of Delta Static Analysis AlarmsClassification and Ranking of Delta Static Analysis Alarms
Classification and Ranking of Delta Static Analysis AlarmsAlexander Serebrenik
 
What Is an AI Engineer? An Empirical Analysis of Job Ads in The Netherlands
What Is an AI Engineer? An Empirical Analysis of Job Ads in The NetherlandsWhat Is an AI Engineer? An Empirical Analysis of Job Ads in The Netherlands
What Is an AI Engineer? An Empirical Analysis of Job Ads in The NetherlandsAlexander Serebrenik
 
From team organisation to software quality
From team organisation to software qualityFrom team organisation to software quality
From team organisation to software qualityAlexander Serebrenik
 
Women in Dutch Computer Science: Best Practices for Recruitment, Onboarding a...
Women in Dutch Computer Science: Best Practices for Recruitment, Onboarding a...Women in Dutch Computer Science: Best Practices for Recruitment, Onboarding a...
Women in Dutch Computer Science: Best Practices for Recruitment, Onboarding a...Alexander Serebrenik
 
My research story (presentation at ICSE 2021 New Faculty Symposium)
My research story (presentation at ICSE 2021 New Faculty Symposium)My research story (presentation at ICSE 2021 New Faculty Symposium)
My research story (presentation at ICSE 2021 New Faculty Symposium)Alexander Serebrenik
 
Opinion Mining for Software Engineering
Opinion Mining for Software EngineeringOpinion Mining for Software Engineering
Opinion Mining for Software EngineeringAlexander Serebrenik
 
Removing Self Admitted Technical Debt
Removing Self Admitted Technical DebtRemoving Self Admitted Technical Debt
Removing Self Admitted Technical DebtAlexander Serebrenik
 
Gender Diversity and Inclusion and Software Engineering
Gender Diversity and Inclusion and Software EngineeringGender Diversity and Inclusion and Software Engineering
Gender Diversity and Inclusion and Software EngineeringAlexander Serebrenik
 
Identifying Developers’ Gender: State of the Art
Identifying Developers’ Gender: State of the ArtIdentifying Developers’ Gender: State of the Art
Identifying Developers’ Gender: State of the ArtAlexander Serebrenik
 

More from Alexander Serebrenik (20)

Software development is a human activity: understanding software requires und...
Software development is a human activity: understanding software requires und...Software development is a human activity: understanding software requires und...
Software development is a human activity: understanding software requires und...
 
Towards Continuous Performance Assessment of Java Applications With PerfBot
Towards Continuous Performance Assessment of Java Applications With PerfBotTowards Continuous Performance Assessment of Java Applications With PerfBot
Towards Continuous Performance Assessment of Java Applications With PerfBot
 
“STILL AROUND”: Experiences and Survival Strategies of Veteran Women Software...
“STILL AROUND”: Experiences and Survival Strategies of Veteran Women Software...“STILL AROUND”: Experiences and Survival Strategies of Veteran Women Software...
“STILL AROUND”: Experiences and Survival Strategies of Veteran Women Software...
 
A Qualitative Study of Developers’ Discussions of Their Problems and Joys Dur...
A Qualitative Study of Developers’ Discussions of Their Problems and Joys Dur...A Qualitative Study of Developers’ Discussions of Their Problems and Joys Dur...
A Qualitative Study of Developers’ Discussions of Their Problems and Joys Dur...
 
Emotion Analysis in Software Ecosystems
Emotion Analysis in Software EcosystemsEmotion Analysis in Software Ecosystems
Emotion Analysis in Software Ecosystems
 
Investigating the Resolution of Vulnerable Dependencies with Dependabot Secur...
Investigating the Resolution of Vulnerable Dependencies with Dependabot Secur...Investigating the Resolution of Vulnerable Dependencies with Dependabot Secur...
Investigating the Resolution of Vulnerable Dependencies with Dependabot Secur...
 
Gender and Age in Software Engineering
Gender and Age in Software EngineeringGender and Age in Software Engineering
Gender and Age in Software Engineering
 
Alexander - intro
Alexander - introAlexander - intro
Alexander - intro
 
Diversity and inclusion in a CS classroom
Diversity and inclusion in a CS classroomDiversity and inclusion in a CS classroom
Diversity and inclusion in a CS classroom
 
An Empirical Assessment on Merging and Repositioning of Static Analysis Alarms
An Empirical Assessment on Merging and Repositioning of Static Analysis AlarmsAn Empirical Assessment on Merging and Repositioning of Static Analysis Alarms
An Empirical Assessment on Merging and Repositioning of Static Analysis Alarms
 
Classification and Ranking of Delta Static Analysis Alarms
Classification and Ranking of Delta Static Analysis AlarmsClassification and Ranking of Delta Static Analysis Alarms
Classification and Ranking of Delta Static Analysis Alarms
 
What Is an AI Engineer? An Empirical Analysis of Job Ads in The Netherlands
What Is an AI Engineer? An Empirical Analysis of Job Ads in The NetherlandsWhat Is an AI Engineer? An Empirical Analysis of Job Ads in The Netherlands
What Is an AI Engineer? An Empirical Analysis of Job Ads in The Netherlands
 
Gender and Community Smells
Gender and Community SmellsGender and Community Smells
Gender and Community Smells
 
From team organisation to software quality
From team organisation to software qualityFrom team organisation to software quality
From team organisation to software quality
 
Women in Dutch Computer Science: Best Practices for Recruitment, Onboarding a...
Women in Dutch Computer Science: Best Practices for Recruitment, Onboarding a...Women in Dutch Computer Science: Best Practices for Recruitment, Onboarding a...
Women in Dutch Computer Science: Best Practices for Recruitment, Onboarding a...
 
My research story (presentation at ICSE 2021 New Faculty Symposium)
My research story (presentation at ICSE 2021 New Faculty Symposium)My research story (presentation at ICSE 2021 New Faculty Symposium)
My research story (presentation at ICSE 2021 New Faculty Symposium)
 
Opinion Mining for Software Engineering
Opinion Mining for Software EngineeringOpinion Mining for Software Engineering
Opinion Mining for Software Engineering
 
Removing Self Admitted Technical Debt
Removing Self Admitted Technical DebtRemoving Self Admitted Technical Debt
Removing Self Admitted Technical Debt
 
Gender Diversity and Inclusion and Software Engineering
Gender Diversity and Inclusion and Software EngineeringGender Diversity and Inclusion and Software Engineering
Gender Diversity and Inclusion and Software Engineering
 
Identifying Developers’ Gender: State of the Art
Identifying Developers’ Gender: State of the ArtIdentifying Developers’ Gender: State of the Art
Identifying Developers’ Gender: State of the Art
 

Recently uploaded

PODOCARPUS...........................pptx
PODOCARPUS...........................pptxPODOCARPUS...........................pptx
PODOCARPUS...........................pptxCherry
 
Cyanide resistant respiration pathway.pptx
Cyanide resistant respiration pathway.pptxCyanide resistant respiration pathway.pptx
Cyanide resistant respiration pathway.pptxCherry
 
Terpineol and it's characterization pptx
Terpineol and it's characterization pptxTerpineol and it's characterization pptx
Terpineol and it's characterization pptxMuhammadRazzaq31
 
X-rays from a Central “Exhaust Vent” of the Galactic Center Chimney
X-rays from a Central “Exhaust Vent” of the Galactic Center ChimneyX-rays from a Central “Exhaust Vent” of the Galactic Center Chimney
X-rays from a Central “Exhaust Vent” of the Galactic Center ChimneySérgio Sacani
 
Role of AI in seed science Predictive modelling and Beyond.pptx
Role of AI in seed science  Predictive modelling and  Beyond.pptxRole of AI in seed science  Predictive modelling and  Beyond.pptx
Role of AI in seed science Predictive modelling and Beyond.pptxArvind Kumar
 
Pteris : features, anatomy, morphology and lifecycle
Pteris : features, anatomy, morphology and lifecyclePteris : features, anatomy, morphology and lifecycle
Pteris : features, anatomy, morphology and lifecycleCherry
 
GBSN - Biochemistry (Unit 2) Basic concept of organic chemistry
GBSN - Biochemistry (Unit 2) Basic concept of organic chemistry GBSN - Biochemistry (Unit 2) Basic concept of organic chemistry
GBSN - Biochemistry (Unit 2) Basic concept of organic chemistry Areesha Ahmad
 
ONLINE VOTING SYSTEM SE Project for vote
ONLINE VOTING SYSTEM SE Project for voteONLINE VOTING SYSTEM SE Project for vote
ONLINE VOTING SYSTEM SE Project for voteRaunakRastogi4
 
development of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virusdevelopment of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virusNazaninKarimi6
 
Understanding Partial Differential Equations: Types and Solution Methods
Understanding Partial Differential Equations: Types and Solution MethodsUnderstanding Partial Differential Equations: Types and Solution Methods
Understanding Partial Differential Equations: Types and Solution Methodsimroshankoirala
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learninglevieagacer
 
Cot curve, melting temperature, unique and repetitive DNA
Cot curve, melting temperature, unique and repetitive DNACot curve, melting temperature, unique and repetitive DNA
Cot curve, melting temperature, unique and repetitive DNACherry
 
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryFAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryAlex Henderson
 
Site specific recombination and transposition.........pdf
Site specific recombination and transposition.........pdfSite specific recombination and transposition.........pdf
Site specific recombination and transposition.........pdfCherry
 
Genome Projects : Human, Rice,Wheat,E coli and Arabidopsis.
Genome Projects : Human, Rice,Wheat,E coli and Arabidopsis.Genome Projects : Human, Rice,Wheat,E coli and Arabidopsis.
Genome Projects : Human, Rice,Wheat,E coli and Arabidopsis.Cherry
 
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptxClimate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptxDiariAli
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceAlex Henderson
 
Reboulia: features, anatomy, morphology etc.
Reboulia: features, anatomy, morphology etc.Reboulia: features, anatomy, morphology etc.
Reboulia: features, anatomy, morphology etc.Cherry
 
Human genetics..........................pptx
Human genetics..........................pptxHuman genetics..........................pptx
Human genetics..........................pptxCherry
 
TransientOffsetin14CAftertheCarringtonEventRecordedbyPolarTreeRings
TransientOffsetin14CAftertheCarringtonEventRecordedbyPolarTreeRingsTransientOffsetin14CAftertheCarringtonEventRecordedbyPolarTreeRings
TransientOffsetin14CAftertheCarringtonEventRecordedbyPolarTreeRingsSérgio Sacani
 

Recently uploaded (20)

PODOCARPUS...........................pptx
PODOCARPUS...........................pptxPODOCARPUS...........................pptx
PODOCARPUS...........................pptx
 
Cyanide resistant respiration pathway.pptx
Cyanide resistant respiration pathway.pptxCyanide resistant respiration pathway.pptx
Cyanide resistant respiration pathway.pptx
 
Terpineol and it's characterization pptx
Terpineol and it's characterization pptxTerpineol and it's characterization pptx
Terpineol and it's characterization pptx
 
X-rays from a Central “Exhaust Vent” of the Galactic Center Chimney
X-rays from a Central “Exhaust Vent” of the Galactic Center ChimneyX-rays from a Central “Exhaust Vent” of the Galactic Center Chimney
X-rays from a Central “Exhaust Vent” of the Galactic Center Chimney
 
Role of AI in seed science Predictive modelling and Beyond.pptx
Role of AI in seed science  Predictive modelling and  Beyond.pptxRole of AI in seed science  Predictive modelling and  Beyond.pptx
Role of AI in seed science Predictive modelling and Beyond.pptx
 
Pteris : features, anatomy, morphology and lifecycle
Pteris : features, anatomy, morphology and lifecyclePteris : features, anatomy, morphology and lifecycle
Pteris : features, anatomy, morphology and lifecycle
 
GBSN - Biochemistry (Unit 2) Basic concept of organic chemistry
GBSN - Biochemistry (Unit 2) Basic concept of organic chemistry GBSN - Biochemistry (Unit 2) Basic concept of organic chemistry
GBSN - Biochemistry (Unit 2) Basic concept of organic chemistry
 
ONLINE VOTING SYSTEM SE Project for vote
ONLINE VOTING SYSTEM SE Project for voteONLINE VOTING SYSTEM SE Project for vote
ONLINE VOTING SYSTEM SE Project for vote
 
development of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virusdevelopment of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virus
 
Understanding Partial Differential Equations: Types and Solution Methods
Understanding Partial Differential Equations: Types and Solution MethodsUnderstanding Partial Differential Equations: Types and Solution Methods
Understanding Partial Differential Equations: Types and Solution Methods
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learning
 
Cot curve, melting temperature, unique and repetitive DNA
Cot curve, melting temperature, unique and repetitive DNACot curve, melting temperature, unique and repetitive DNA
Cot curve, melting temperature, unique and repetitive DNA
 
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryFAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
 
Site specific recombination and transposition.........pdf
Site specific recombination and transposition.........pdfSite specific recombination and transposition.........pdf
Site specific recombination and transposition.........pdf
 
Genome Projects : Human, Rice,Wheat,E coli and Arabidopsis.
Genome Projects : Human, Rice,Wheat,E coli and Arabidopsis.Genome Projects : Human, Rice,Wheat,E coli and Arabidopsis.
Genome Projects : Human, Rice,Wheat,E coli and Arabidopsis.
 
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptxClimate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical Science
 
Reboulia: features, anatomy, morphology etc.
Reboulia: features, anatomy, morphology etc.Reboulia: features, anatomy, morphology etc.
Reboulia: features, anatomy, morphology etc.
 
Human genetics..........................pptx
Human genetics..........................pptxHuman genetics..........................pptx
Human genetics..........................pptx
 
TransientOffsetin14CAftertheCarringtonEventRecordedbyPolarTreeRings
TransientOffsetin14CAftertheCarringtonEventRecordedbyPolarTreeRingsTransientOffsetin14CAftertheCarringtonEventRecordedbyPolarTreeRings
TransientOffsetin14CAftertheCarringtonEventRecordedbyPolarTreeRings
 

Bias in MSR Research

Editor's Notes

  1. First of all, I would like to thank Shaowei and Bram for inviting me to talk about biases in MSR. Rather then talking about the topics we study as it is customary in scientific presentations, in this talk I would like to reflect on what we as the MSR research community usually do not study, what kind of stories we rarely hear.
  2. There can be many different biases of course, but today I would like to focus on the 3P: Projects we select, People we talk to and Problems we study.
  3. Let us start with discussing the projects we select. The figure comes from the analysis performed by Flint et al. based on the publications for MSR 2004-2020. The green wedge “Closed source” corresponds to merely 8% of the papers. While different reports refer to 89% or even 97% of companies *using* OSS, TechBeacon reports ~35% of the code of an “average” application to be OSS. This suggests that closed source software has traditionally been understudied in MSR. However, the data in the study of Flint et al is historical, so maybe more recent studies are doing it better? https://techbeacon.com/security/state-open-source-commercial-apps-youre-using-more-you-think#:~:text=Open%20source%20code%20comprised%20more,as%20high%20as%2075%20percent.
  4. This is why I have checked the full papers of the 2021 edition of MSR. Closed source projects have still only been considered in 2 papers and in fact both of them used data from the same Dutch company called Adyen! GitHub is even more prominent than before, and additional data sources have been studied such as communication logs or security vulnerabilities.
  5. The problem seems to be that for many of us it is more difficult to get access to closed source software as it requires establishing contacts with companies, while OSS is “just there”, and in particular “just there on GitHub”. Ultimately most company-based studies seem to report data from a limited number of usually large companies such as Microsoft, Google or Adyen.
  6. However, at least in the Netherlands many software development companies are not that big and focussing on large companies leaves out lots of small and medium-size companies. Moreover, not all companies developing software will typically consider themselves as software companies: on the right we see a couple of examples of companies that develop software but are not considered typical software companies - a bank, a company producing lithography machines and a company building tracks. Companies building software are not the same as software development companies.
  7. By the same token persons developing software are not the same as software developers, and while MSR has traditionally focussed on the latter we have mostly overlooked the former. First of all, there are people developing software in a very different context, e.g., computational scientists or kids learning how to program. Moreover, one of the people I have interviewed indicated that they only had a bootcamp training and their university-trained peers made a point that only university-trained people deserve the title of an engineer.
  8. Moreover, even when restricting our attention to software engineers, we tend to bias our results by conducting surveys and interviews solely in English. On the left we see the map of software developers population highlighting the importance of South America and Asia; these are developers we need to reach. On the right we see the map of the English language proficiency. Looking at the two maps calls for multi-lingual surveys, including for example, Spanish, Portuguese, Chinese and Japanese… The only example of a large-scale multi-lingual survey I am aware of is the Pandemic Programming work by Paul Ralph and his co-authors. Moreover, what about developers not active on social media, or from countries we do not see (e.g., from countries under US sanctions that have no access to GitHub)? And, of course, location or language are not the only demographic aspects we need to take into account: age, gender, disability, sexual orientation, socio-economic background, ethnicity influence how people experience software development and we should be more aware of these differences rather than assuming that the opinions of young Caucasian abled straight men from the middle class can be generalised to other developers. Deaggregation is, on the one hand, a must, but on the other hand threatens deanonymisation on small samples.
  9. Finally, let us talk about the problems we study. Since software engineering is an applied discipline, we are supposed to propose solutions that can benefit practitioners and most of our studies refer to “implications for developers”.
  10. However, several researchers have argued that the problems we study are “not real”, that “real” software is much more messy and complex and by studying weird or simplified problems we do not really help the industry. Picture from Lionel’s Facebook.
  11. At the same time, this makes me wonder: if our goal is to help the industry, why should we be doing it on the taxpayers’ money?
  12. To summarise…