BIASES
IN MSR
ALEXANDER SEREBRENIK
PROJECTS
PEOPLE
PROBLEMS
Share of MSR papers by data source, 2004-2020: GitHub 20%, Eclipse 8%, Closed source 8%, Apache 7%, Mozilla 5%, Stack Overflow 4%, Other OSS 49%
Samuel W. Flint, Jigyasa Chauhan, Robert Dyer: Escaping the Time Pit: Pitfalls and Guidelines for Using Time-Based Git Data. MSR 2021: 85-96
Share of MSR papers by data source, 2021: GitHub 35%, Eclipse 2%, Closed source 5%, Apache 2%, Mozilla 0%, Stack Overflow 19%, Other OSS 21%, Non-SW 16%
https://cyberhoot.com/cybrary/closed-source/
Bar chart of company sizes (x-axis bins include 1-5, 11-20, 51-100, 501-1000; y-axis 0 to 180)
https://www.dutchsoftware.nl/isvs/
Adyen
https://github.com/bacchin/chemical_engineering_python/blob/master/M2_Colloid_DLVO.ipynb
https://www.flickr.com/photos/poughkeepsiedayschool/17892462450
Science Engineering
BIASES
IN MSR
@ASEREBRENIK
PROJECTS
PEOPLE
PROBLEMS

Bias in MSR Research

Editor's Notes

  1. First of all, I would like to thank Shaowei and Bram for inviting me to talk about biases in MSR. Rather than talking about the topics we study, as is customary in scientific presentations, in this talk I would like to reflect on what we as the MSR research community usually do not study and what kind of stories we rarely hear.
  2. There can be many different biases of course, but today I would like to focus on the three Ps: the Projects we select, the People we talk to and the Problems we study.
  3. Let us start by discussing the projects we select. The figure comes from the analysis performed by Flint et al. of the publications at MSR 2004-2020. The green wedge “Closed source” corresponds to merely 8% of the papers. While different reports state that 89% or even 97% of companies *use* OSS, TechBeacon reports that only ~35% of the code of an “average” application is OSS. This suggests that closed source software has traditionally been understudied in MSR. However, the data in the study of Flint et al. is historical, so maybe more recent studies are doing better? https://techbeacon.com/security/state-open-source-commercial-apps-youre-using-more-you-think#:~:text=Open%20source%20code%20comprised%20more,as%20high%20as%2075%20percent.
  4. This is why I have checked the full papers of the 2021 edition of MSR. Closed source projects were still considered in only 2 papers, and in fact both of them used data from the same Dutch company, Adyen! GitHub is even more prominent than before, and additional data sources, such as communication logs or security vulnerabilities, have been studied.
  5. The problem seems to be that for many of us it is more difficult to get access to closed source software, as it requires establishing contacts with companies, while OSS is “just there”, and in particular “just there on GitHub”. Ultimately, most company-based studies seem to report data from a limited number of usually large companies such as Microsoft, Google or Adyen.
  6. However, at least in the Netherlands, many software development companies are not that big, and focussing on large companies leaves out lots of small and medium-size companies. Moreover, not all companies developing software consider themselves software companies: on the right we see a couple of examples of companies that develop software but are not considered typical software companies - a bank, a company producing lithography machines and a company building tracks. Companies building software are not the same as software development companies.
  7. By the same token, people developing software are not the same as software developers, and while MSR has traditionally focussed on the latter, we have mostly overlooked the former. First of all, there are people developing software in a very different context, e.g., computational scientists or kids learning how to program. Moreover, one of the people I have interviewed indicated that they only had bootcamp training, and their university-trained peers insisted that only university-trained people deserve the title of engineer.
  8. Moreover, even when restricting our attention to software engineers, we tend to bias our results by conducting surveys and interviews solely in English. On the left we see a map of the software developer population, highlighting the importance of South America and Asia; these are developers we need to reach. On the right we see a map of English language proficiency. Comparing the two maps calls for multi-lingual surveys, including, for example, Spanish, Portuguese, Chinese and Japanese… The only example of a large-scale multi-lingual survey I am aware of is the Pandemic Programming work by Paul Ralph and his co-authors. Moreover, what about developers not active on social media, or from countries we do not see (e.g., from countries under US sanctions that have no access to GitHub)? And, of course, location and language are not the only demographic aspects we need to take into account: age, gender, disability, sexual orientation, socio-economic background and ethnicity influence how people experience software development, and we should be more aware of these differences rather than assuming that the opinions of young, Caucasian, able-bodied, straight, middle-class men can be generalised to other developers. Disaggregating the data is, on the one hand, a must, but on the other hand it risks de-anonymising participants in small samples.
  9. Finally, let us talk about the problems we study. Since software engineering is an applied discipline, we are supposed to propose solutions that can benefit practitioners and most of our studies refer to “implications for developers”.
  10. However, several researchers have argued that the problems we study are “not real”, that “real” software is much messier and more complex, and that by studying weird or simplified problems we do not really help the industry. Picture from Lionel’s Facebook.
  11. At the same time, this makes me wonder: if our goal is to help the industry, why should we be doing it with taxpayers’ money?
  12. To summarise…