This document discusses diversity and inclusion in open source communities. It begins by looking at metrics for measuring diversity, mentioning dimensions like gender, age, location and tenure. It then discusses challenges in identifying subpopulations and collaborations within a population. The document also discusses how diversity alone is not enough, and that inclusion, attraction and retention of diverse groups over time are also important. It provides examples of codes of conduct and how they are used in open source projects to improve inclusion.
Presentation at the Big Software in the Run summer school: Diversity and inclusion in open source software communities. Topics include impact of gender and tenure diversity on productivity, diversity measurement, code of conduct, inclusion.
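As a minimal sketch of what measuring diversity along a categorical dimension such as gender can look like, the snippet below computes the Blau index, 1 - sum(p_i^2), where p_i is the share of group members in category i. The choice of this index and the sample team data are assumptions made for illustration; the presentation itself may rely on other measures.

```python
from collections import Counter


def blau_index(labels):
    """Return the Blau index 1 - sum(p_i^2) for a list of category labels.

    The value is 0.0 when everyone falls in the same category and grows
    as members spread more evenly over more categories.
    """
    if not labels:
        raise ValueError("labels must be non-empty")
    counts = Counter(labels)
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Hypothetical team data, invented for illustration only.
team_gender = ["woman", "man", "man", "non-binary"]
print(blau_index(team_gender))
```

Numerical dimensions such as age or tenure would first have to be bucketed into categories for this index, or be summarised with a different statistic.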
This is the slideshow for the Growing Your Next Generation of Patrons presentation, by Lexie Robinson and Beth Locy of the Madison Public Library. Presented at the Alabama Library Association Conference, April 18, 2007.
Proactive Displays: Bridging the Gaps between Online Social Networks and Shar... (Joe McCarthy)
Presentation by Joe McCarthy on February 13, 2008, to the Social Networks class (TCSS 590, http://courses.washington.edu/amtgrade/courses/socialnets/Home.html) at the University of Washington, Tacoma, taught by Ankur Teredesai.
Ambient Informatics in Urban Cafes, a CoCollage presentation at the Digital Cities 6 workshop - "Concepts, Methods and Systems of Urban Informatics" - at the 4th International Conference on Communities & Technologies (C&T 2009). Notes from the workshop can be found here: http://gumption.typepad.com/blog/2009/06/digital-cities-6.html
Situated Community Technology C&T 2009 (Joe McCarthy)
Presentation at a panel on "Community technology to support geographically-based communities" at the 4th International Conference on Communities and Technologies (C&T 2009)
Who talks to whom? What communication channels do they use and why? What emotions are involved? Summer School on Software Engineering. Oct 9, 2018. Oulu, Finland.
Presented at the Google diversity workshop.
Studying gender diversity in software development teams/communities requires understanding gender of individual developers. In this talk I will provide an overview of different ways of asking developers about their gender as well as inferring gender information from the ways they present themselves and artefacts they create. We conclude by discussing limitations of the inference techniques and surveying concerns related to their application.
Community smells are patterns indicating suboptimal organization and communication of software development teams that have been shown to be related to suboptimal organization of the source code. Given a long-standing association of women with communication mediation, we have conducted a series of studies relating gender diversity to community smells, as well as comparing the results of the data analysis with developers' perception. To gain further insight into the relation between gender and community smells, we replicate our study focusing on Brazilian software teams; indeed, culture-specific expectations on the behavior of people of different genders might have affected the perception of the importance of gender diversity and of refactoring strategies when mitigating community smells. Finally, we extend the prediction model by including variables related to national diversity and examine how the interplay between national diversity and gender diversity influences the presence of community smells.
This talk is based on a series of papers published in 2019-2022 and co-authored with Gemma Catolino, Filomena Ferrucci, Stefano Lambiase, Tiago Massoni, Fabio Palomba, Camila Sarmento, and Damian Andrew Tamburri.
Software engineering is inherently a collaborative venture, involving many stakeholders who coordinate their efforts to produce large software systems. While the importance of human aspects in software engineering was recognised as early as the 1970s, the emergence of open source software (late 1990s) and of platforms such as Stack Overflow and GitHub (late 2000s) enabled the application of empirical methods to the study of human aspects of software engineering.
In the first part of the talk we present a selection of recent results pertaining to two main questions: who are the software developers and in what kind of activities they engage. The second part of the talk focuses on tools and techniques that have been used to obtain the aforementioned results.
An overview of our research on gender (diversity) in GitHub teams. In collaboration with colleagues from TU Eindhoven, CMU, UC Davis, TU Delft, U Salerno, UZH. Presented at the Dutch National Software Engineering Symposium (Amsterdam; Feb 1 2019). Based on papers published/accepted for publication at ICSE 2019, CHI 2015.
Software evolution research is a thriving area of software engineering research. Recent years have seen a growing interest in a variety of evolution topics, as witnessed by the growing number of publications dedicated to the subject. Without attempting to be complete, in this talk we provide an overview of emerging trends in software evolution research, such as the extension of the traditional boundaries of software, growing attention to social and socio-technical aspects of software development processes, and interdisciplinary research that applies techniques from other research areas to the study of software evolution, and software evolution research techniques to other research areas. As a large body of software evolution research is empirical in nature, we are confronted by important challenges pertaining to the reproducibility of the research and its generalizability.
Beyond Buzz - Web 2.0 Expo - K.Niederhoffer & M.Smith (kategn)
A framework to measure a conversation based on approaches from social psychology and sociology. Beyond quantity of buzz, we propose measuring the context of conversation: the signal, person, role, and ecosystem.
Case Study Enspiral Foundation (Enspiral) Enspiral is a... (MaximaSheffield592)
Case Study: Enspiral Foundation (Enspiral)
Enspiral is a New Zealand-based social enterprise started in 2012 by Joshua Vial, a computer programmer who was alarmed at the rapid pace of change, especially around social and environmental degradation. Leaving corporate life behind, he opened a co-working space in Wellington, New Zealand, to channel his skills and expertise into “stuff that matters”. Before he knew it, computer coders, accountants, lawyers and designers who were also driven to work on things with a social purpose had joined him. To his surprise and the surprise of others, the space became much more than a shared working space: it became a place where start-up businesses would congregate, sharing their skills, expertise and experience to seed new ideas.
However, more was happening in the space: it wasn’t just a co-working space or an incubator. The power of Enspiral is the community ethos it fostered. Instead of power and influence residing in a select few, people began to help each other by negotiating work in exchange for revenue-share arrangements. Enspiral had created a new form of working, in which revenues were negotiated, shared and agreed transparently by all participating parties. This also created a new operating model in which power and influence were distributed across the whole network rather than residing in a select few. It meant that everyone involved in a start-up focused on the success of the business, as everyone had ‘skin in the game’.
Enspiral support has resulted in the establishment of new start-ups such as:
· Loomio – an online tool for collaborative decision-making
· Bucky Box – online software to power local food distribution
· Chalkle – reinventing community education
· Volunteer Impact – a software application making it easy for environmental volunteers to track their impact
· LIFEHACK – supports the creation of breakthrough digital solutions for youth wellbeing in New Zealand
· Dev Academy – qualifies digital developers in an intensive 9-week course
· Metric Engine – a software application that enables similar organisations to benchmark and compare their performance
Now Enspiral is described as a social business rather than a co-working space. Enspiral promotes itself as a social innovation ecosystem providing “a virtual and physical network of companies and professionals working together to create a thriving society” (Enspiral Network 2014).
The network is not without structure; there are three key sections: (1) the Enspiral Foundation, which holds the intellectual property and provides the infrastructure that enables members to communicate and make collective decisions. The Foundation is what holds the people and companies within the network together; (2) service companies, which bring together teams of professionals that offer a range of business services under one roof. These include web design, communication, accounting, legal and financial services; (3) Start-up ventures and potential collabo ...
OpenThreads: The Community of Mailing Lists presented at FOSS4G-NA (Alyssa Wright)
OpenThreads is a platform for analysis and visualization of mailing lists. The tools included here make it possible to parse the conversations from pipermail and mailman lists into participants, messages, and threads for visualization and analysis. Our goal is to create an open platform that everyone can use to analyze their communities, with a goal of provoking conversation around how open our communities are and how to continue to improve upon the quality and diversity of that openness. See https://github.com/elationfoundation/openThreads for more info!
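The parsing step described above, turning raw list messages into participants, messages, and threads, can be sketched as follows. This is not OpenThreads code: it is a minimal illustration that uses only Python's standard `email` package and assumes each message carries `Message-ID`, `From`, and (for replies) `In-Reply-To` headers. The function name `build_threads` and the sample messages are hypothetical.

```python
from collections import defaultdict
from email import message_from_string


def build_threads(raw_messages):
    """Group raw messages into threads keyed by the root Message-ID.

    Returns (threads, participants): threads maps a root id to the ids of
    all messages in that thread, and participants maps a root id to the
    sorted list of distinct From headers seen in it.
    """
    parent = {}  # message id -> id it replies to (None for a thread start)
    sender = {}  # message id -> From header
    for raw in raw_messages:
        msg = message_from_string(raw)
        mid = msg["Message-ID"]
        if mid is None:
            continue  # skip messages that cannot be identified
        parent[mid] = msg["In-Reply-To"]
        sender[mid] = msg["From"]

    def root(mid):
        # Walk up the reply chain until the parent is missing or unknown.
        seen = set()
        while parent.get(mid) in parent and mid not in seen:
            seen.add(mid)
            mid = parent[mid]
        return mid

    threads = defaultdict(list)
    for mid in parent:
        threads[root(mid)].append(mid)
    participants = {r: sorted({sender[m] for m in ms}) for r, ms in threads.items()}
    return dict(threads), participants


# Hypothetical messages, invented for illustration only.
sample = [
    "Message-ID: <1@x>\nFrom: ana@example.org\nSubject: Hi\n\nHello",
    "Message-ID: <2@x>\nIn-Reply-To: <1@x>\nFrom: bo@example.org\nSubject: Re: Hi\n\nHey",
    "Message-ID: <3@x>\nFrom: ana@example.org\nSubject: Other\n\nText",
]
threads, participants = build_threads(sample)
print(threads)
print(participants)
```

A real archive would first have to be split into individual messages, for example with the standard library's `mailbox.mbox`, and replies whose parent is missing from the archive simply start their own thread in this sketch.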
Does being female make a difference to the way people use software? Can the software industry change the way we do things to make our software more useful for women? Would that be sexist? Would any men want to buy our software afterwards?
These slides can be viewed in tandem with the podcast of a live event at the ESL Educators Conference. (8/10/07) Podcast at http://michaelc.podomatic.com/entry/2007-10-08T07_51_33-07_00
Towards Continuous Performance Assessment of Java Applications With PerfBot (Alexander Serebrenik)
Bots for continuous performance assessment are gaining use as a productivity tool. We discuss how and why open source projects use them and present an in-depth case study of the Nanosoldier bot used by the team behind the Julia programming language. Based on analysing the history of bot usage and interviews with developers we identify lack of a shared platform for performance measurement as an obstacle to wider adoption of performance measurement bots. To address this, we propose a prototype implementation of such a platform called PerfBot.
Joint work with Florian Markusse and Philipp Leitner, presented at the 5th International Workshop on Bots in Software Engineering, collocated with ICSE 2023, Melbourne, Australia.
“STILL AROUND”: Experiences and Survival Strategies of Veteran Women Software... (Alexander Serebrenik)
The intersection of ageism and sexism can create a hostile environment for veteran software developers belonging to marginalized genders. In this study, we conducted 14 interviews to examine the experiences of people at this intersection, primarily women, and to discover the strategies they employed to remain in the field successfully. We identified 283 codes, which fell into three main categories: Strategies, Experiences, and Perception. Several strategies we identified, such as (Deliberately) Not Trying to Look Younger, were not previously described in the software engineering literature. We found that, in some companies, older women developers are recognized as having particular value, further strengthening the known benefits of diversity in the workforce. Based on the experiences and strategies, we suggest that organizations employing software developers consider the benefits of hiring veteran women software developers. For example, companies can draw upon the life experiences of older women developers in order to better understand the needs of customers from a similar demographic. While we recognize that many of the strategies employed by our study participants are a response to systemic issues, we still consider that, in the short term, there is benefit in describing these strategies for developers who are experiencing such issues today.
This paper is joint work with Sterre van Breukelen, Ann Barcomb and Sebastian Baltes.
Preprint https://arxiv.org/abs/2302.03723
A Qualitative Study of Developers’ Discussions of Their Problems and Joys Dur... (Alexander Serebrenik)
Many software developers started to work from home on short notice during the early periods of COVID-19. A number of previous papers have studied the wellbeing and productivity of software developers during COVID-19. The studies mainly use surveys based on predefined questionnaires. In this paper, we investigate the problems and joys that software developers experienced during the early months of COVID-19 by analyzing their discussions on the online forum devRant, where discussions can be open and are not bound by predefined survey questionnaires. The devRant platform is designed for developers to share their joys and frustrations of life. We manually analyze 825 devRant posts created between January and April 12, 2020, in which developers discuss their situation during COVID-19. WHO declared COVID-19 a pandemic on March 11, 2020. As such, our data offers insights into the early months of COVID-19. We manually label each post along two dimensions: the topics of the discussion and the expressed sentiment polarity (positive, negative, neutral). We observed 19 topics that we group into six categories: Workplace & Professional aspects, Personal & Family well-being, Technical Aspects, Lockdown preparedness, Financial concerns, and Societal and Educational concerns. Around 49% of the discussions are negative and 26% are positive. We find evidence of developers’ struggles with the lack of documentation needed to work remotely and with their loneliness while working from home. We find stories of job loss with little or no savings to fall back on. The analysis of developer discussions in the early months of a pandemic will help various stakeholders (e.g., software companies) make important decisions early to alleviate developer problems if such a pandemic or a similar emergency situation occurs in the near future. Software engineering research can make further efforts to develop automated tools for remote work (e.g., automated documentation).
Empirical Software Engineering 27(5): 117 (2022), presented at ICSE 2023 as part of the Journal First program.
Software developers are known to experience a wide range of emotions while performing development tasks. Emotions expressed in developer communication might reflect openness of the ecosystem to newcomers, presence of conflicts, problems in the software development process or source code itself. In this talk, based on a recent work with Nicole Novielli, I present an overview of the state-of-the-art research on analysis of emotions in software engineering focusing on the studies of emotion in context of software ecosystems. To encourage further applications of emotion analysis in the industry and research we also discuss currently available emotion analysis tools and datasets as well as outline directions for future research.
This is a keynote talk given at the 11th International Workshop on Software Engineering for Systems-of-Systems and Software Ecosystems (SESoS 2023), collocated with ICSE 2023 in Melbourne, Australia.
Investigating the Resolution of Vulnerable Dependencies with Dependabot Secur... (Alexander Serebrenik)
Modern software development practices increasingly rely on third-party libraries due to the inherent benefits of reuse. However, libraries may contain security vulnerabilities that can propagate to the dependent applications. To counter this, maintainers of dependent projects should monitor their dependencies and security reports to ensure that only patched releases of the upstream applications are in use. As manual maintenance of dependencies has been shown to be ineffective, several automated tools (aka bots) have been proposed to assist developers in rapidly identifying and resolving vulnerable dependencies.
In this work, we focus on Dependabot, a popular bot providing security and version updates, and study developers' receptivity to its security updates in engineered and actively maintained JavaScript projects. Moreover, we carry out a fine-grained analysis of the lifecycle of every vulnerability to show how it is dealt with in the presence of Dependabot.
Our findings show that the task of fixing vulnerable dependencies is, to a large extent, delegated to Dependabot and that developers merge the majority of security updates within several days. On the other hand, when developers do not merge a security update, they usually address the identified vulnerability manually. This approach, however, often takes up to several months which in turn could expose the projects to security issues.
This paper won the ACM Distinguished Paper Award at MSR 2023.
An Empirical Assessment on Merging and Repositioning of Static Analysis Alarms (Alexander Serebrenik)
Static analysis tools generate a large number of alarms that require manual inspection. In prior work, repositioning of alarms is proposed to (1) merge multiple similar alarms together and replace them with fewer alarms, and (2) report alarms as close as possible to the causes for their generation. The premise is that the proposed merging and repositioning of alarms will reduce the manual inspection effort. To evaluate the premise, this paper presents an empirical study with 249 developers on the proposed merging and repositioning of static alarms. The study is conducted using static analysis alarms generated on C programs, where the alarms are representative of the merging vs. non-merging and repositioning vs. non-repositioning situations in real-life code. Developers were asked to manually inspect and determine whether assertions added corresponding to alarms in C code hold. Additionally, two spatial cognitive tests were administered to determine their relationship to performance. The empirical evaluation results indicate that, in contrast to expectations, there was no evidence that merging and repositioning of alarms reduces manual inspection effort or improves the inspection accuracy (at times a negative impact was found). Results on cognitive abilities correlated with comprehension and alarm inspection accuracy.
Static analysis tools help to detect common programming errors but generate a large number of false positives. Moreover, when applied to evolving software systems, around 95% of alarms generated on a version are repeated, i.e., they have also been generated on the previous version. Version-aware static analysis techniques (VSATs) have been proposed to suppress the repeated alarms that are not impacted by the code changes between the two versions. The alarms reported by VSATs after the suppression, called delta alarms, still constitute 63% of the tool-generated alarms.
We observe that delta alarms can be further postprocessed using their corresponding code changes: the code changes due to which VSATs identify them as delta alarms. However, none of the existing VSATs or alarms postprocessing techniques postprocesses delta alarms using the corresponding code changes. Based on this observation, we use the code changes to classify delta alarms into six classes that have different priorities assigned to them. The assignment of priorities is based on the type of code changes and their likelihood of actually impacting the delta alarms. The ranking of alarms, obtained by prioritizing the classes, can help suppress alarms that are ranked lower, when resources to inspect all the tool-generated alarms are limited.
We performed an empirical evaluation using 9789 alarms generated on 59 versions of seven open source C applications. The evaluation results indicate that the proposed classification and ranking of delta alarms help to identify, on average, 53% of delta alarms as more likely to be false positives than the others.
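The idea of ranking delta alarms by class priority and suppressing the lower-ranked ones under a limited inspection budget can be sketched as below. The abstract does not name the six classes or describe how they are derived from code changes, so the integer class labels, the `DeltaAlarm` type, and the `rank_and_cut` helper are all hypothetical. The sketch illustrates only the ranking step, not the authors' classification technique.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeltaAlarm:
    alarm_id: str
    change_class: int  # hypothetical label 1..6; 1 = highest priority


def rank_and_cut(alarms, budget):
    """Order alarms by class priority and split them at the inspection budget.

    Returns (to_inspect, suppressed). Python's sort is stable, so alarms in
    the same class keep their original relative order.
    """
    ranked = sorted(alarms, key=lambda a: a.change_class)
    return ranked[:budget], ranked[budget:]


# Hypothetical alarms, invented for illustration only.
alarms = [
    DeltaAlarm("a", 3),
    DeltaAlarm("b", 1),
    DeltaAlarm("c", 6),
    DeltaAlarm("d", 1),
]
to_inspect, suppressed = rank_and_cut(alarms, budget=2)
print([a.alarm_id for a in to_inspect])
print([a.alarm_id for a in suppressed])
```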
What Is an AI Engineer? An Empirical Analysis of Job Ads in The Netherlands (Alexander Serebrenik)
Recently, the job market for Artificial Intelligence (AI) engineers has exploded. Since the role of AI engineer is relatively new, limited research has been done on the requirements as set by the industry. Moreover, the definition of an AI engineer is less established than for a data scientist or a software engineer. In this study we explore, based on job ads, the requirements from the job market for the position of AI engineer in The Netherlands. We retrieved job ad data between April 2018 and April 2021 from a large job ad database, Jobfeed from TextKernel. The job ads were selected with a process similar to the selection of primary studies in a literature review. We characterize the 367 resulting job ads based on meta-data such as publication date, industry/sector, educational background and job titles. To answer our research questions we have further coded 125 job ads manually.
The job tasks of AI engineers are concentrated in five categories: business understanding, data engineering, modeling, software development and operations engineering. Companies ask for AI engineers with different profiles: 1) data science engineer with focus on modeling, 2) AI software engineer with focus on software development, 3) generalist AI engineer with focus on both models and software. Furthermore, we present the tools and technologies mentioned in the selected job ads, and the soft skills.
Our research helps to understand the expectations companies have for professionals building AI-enabled systems. Understanding these expectations is crucial both for prospective AI engineers and educational institutions in charge of training those prospective engineers. Our research also helps to better define the profession of AI engineering. We do this by proposing an extended AI engineering life-cycle that includes a business understanding phase.
Joint work with Marcel Meesters and Petra Heck.
Overview of a series of papers published in 2019-2021 on community smells, and their relation to code smells and gender, as well as resolution strategies.
Women in Dutch Computer Science: Best Practices for Recruitment, Onboarding a...
Alexander Serebrenik
Women are underrepresented at all levels in computer science (CS) faculties of Dutch
universities. In this report we focus on experiences related to hiring and promoting women as assistant, associate and full professors (or equivalent at NWO-I CWI).
In 2003 Dave et al. coined the term “opinion mining” to refer to “processing a set of search results for a given item, generating a list of product attributes (quality, features, etc.) and aggregating opinions about each of them (poor, mixed, good)”. Nine years later, in 2012, Brooks and Swigger applied sentiment analysis in the context of software engineering. Today another nine years have passed and it is time to look back: what have we achieved as a research community and where should we go next?
To answer this question we conducted a systematic literature review involving 185 papers. Based on the literature review we present 1) well-defined categories of opinion mining-related software development activities, 2) available opinion mining approaches, whether they are evaluated when adopted in other studies, and how their performance is compared, 3) available datasets for performance evaluation and tool customization, and 4) concerns or limitations SE researchers might need to take into account when applying/customizing these opinion mining techniques. The results of our study serve as references to choose suitable opinion mining tools for SE tasks, and provide critical insights for the further development of opinion mining techniques in the SE domain.
This work has been done together with Bin Lin, Gabriele Bavota and Michele Lanza from Università della Svizzera italiana, Switzerland, Nathan Cassee from Eindhoven University of Technology, The Netherlands and Nicole Novielli from University of Bari, Italy.
In this talk I will present results obtained on removing self-admitted technical debt. Self-admitted technical debt is an indication in the source code, usually in the source code comments, that the code is not in the right shape yet. Joint work with Emad Shihab, Everton Maldonado, Rabe Abdelkareem, Fiorella Zampetti, Massimiliano Di Penta and Gianmarco Fucci.
What is social software engineering? How do we collect the data? What kind of data do we collect? How do we analyse it? What challenges are we facing when collecting and analysing social software engineering data?
Peer review is often seen as a cornerstone of modern science. We are going to discuss the current peer review practices in software engineering research, their strengths and limitations. Next we will discuss tips and tricks for writing code reviews, as well as implications for writing papers. I will also share some insights into my own reviewing practices.
My presentation at the Programming Contact Day for the mixed audience of PhD students from Mathematics and Chemical Technology departments. How can we develop software collaboratively?
3. MS Windows post-release fault prediction
Precision: predicted & correct / predicted
Recall: predicted & correct / correct
Predictor                  Precision   Recall
Code churn                 79%         80%
Code complexity            79%         66%
Code coverage              84%         55%
Code dependencies          74%         70%
Organizational structure   86%         84%
Socio-technical network    77%         71%
Nagappan, N., Ball, T.: Evidence-based failure prediction.
In: Oram, A., Wilson, G. (eds.) Making Software: What Really Works and Why We Believe it? O’Reilly 2011
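The precision and recall definitions on slide 3 can be illustrated with a minimal sketch. The function name `precision_recall` and the module names below are hypothetical and chosen only for illustration; they do not come from the cited study.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two collections of module names.

    predicted: modules a model flags as fault-prone (hypothetical input)
    actual:    modules that really had post-release faults (hypothetical input)
    """
    predicted, actual = set(predicted), set(actual)
    hits = len(predicted & actual)  # predicted & correct
    # Precision: predicted & correct / predicted
    precision = hits / len(predicted) if predicted else 0.0
    # Recall: predicted & correct / correct
    recall = hits / len(actual) if actual else 0.0
    return precision, recall


# Made-up example data, not taken from the MS Windows study.
flagged = {"a.dll", "b.dll", "c.dll", "d.dll"}
faulty = {"b.dll", "c.dll", "e.dll"}
p, r = precision_recall(flagged, faulty)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this made-up example two of the four flagged modules are really faulty, and two of the three faulty modules were flagged, so precision is lower than recall.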
4. Fabio Palomba, Damian Andrew Tamburri, Alexander Serebrenik, Andy Zaidman, Francesca Arcelli Fontana, Rocco Oliveto,
How Do Community Smells Influence Code Smells? ICSE (Companion Volume) 2018: 240-241
“Community smells”, problems
in team communication
“Code smells”, problems in
source code
6. I know a lot of female
programmers […] But I don’t recall
ever having one of my questions
answered by, nor have I ever
answered a question by a female
programmer here at StackOverflow.
Why aren’t there more of them
participating, both
with questions and answers?
there is NO appeal for me in
answering questions.[…]
it doesn’t entertain me and I
don’t find much fulfilment in it
12. Update: Multiple Tools
Josh Terrell
et al.
gender
Computer
Bin Lin, Alexander Serebrenik: Recognizing gender of stack overflow users. MSR 2016: 425-429
Adding GitHub helps
Different data sets require different techniques
15. [Bar chart, axis 0 to 100: Women, Men and Unknown in Open Source, StackOverflow, Drupal and Commercial settings]
Bogdan Vasilescu, Andrea Capiluppi, Alexander Serebrenik:
Gender, Representation and Online Participation: A Quantitative Study. Interacting with Computers 26(5): 488-511 (2014)
FLOSS 2003 sample 4144 multiple mailing
lists
NSF survey
“math + CS”
16.
                                                 Drupal / Wordpress   StackOverflow
Duration of engagement                           Comparable           Men engage for longer
Questions (relative to duration of engagement)   Comparable           Women ask more questions
Answers (relative to duration of engagement)     Comparable           Comparable
Bogdan Vasilescu, Andrea Capiluppi, Alexander Serebrenik:
Gender, Representation and Online Participation: A Quantitative Study. Interacting with Computers 26(5): 488-511 (2014)
17. Women are less effective than men in
competitive environments.
Perform similarly in non-competitive
environments.
Loss of effectiveness stronger when
women compete against men than in
single-sex competitive environments.
Women shy away from competition and
men embrace it
Uri Gneezy, Muriel Niederle, and Aldo Rustichini. Performance in competitive environments: Gender differences.
The Quarterly Journal of Economics, 118(3):1049–1074, 2003.
Muriel Niederle and Lise Vesterlund. Do women shy away from competition? do men compete too much?
The Quarterly Journal of Economics, 122(3):1067–1101, 2007.
20. Jeff Atwood
“putting the information
that works for me, right
personally, like things
I’m interested in but also
at the level when it is
helping other people and
they can contribute”
28. People prefer working with others
similar to them in terms of values,
beliefs, and attitudes [Byrne]
People categorise themselves
into specific groups. Members of
own group are treated better
than outsiders [Tajfel]
Diversity is bad
29. Multicultural social networks
promote creativity
[Harvard Business School]
Diversity is good
Diverse problem solvers
outperform high ability problem
solvers [Hong & Page]
44. Bogdan Vasilescu, Daryl Posnett, Baishakhi Ray, Mark G. J. van den Brand, Alexander Serebrenik, Premkumar T. Devanbu,
Vladimir Filkov: Gender and Tenure Diversity in GitHub Teams. CHI 2015: 3789-3798
45. Bogdan Vasilescu, Daryl Posnett, Baishakhi Ray, Mark G. J. van den Brand, Alexander Serebrenik, Premkumar T. Devanbu,
Vladimir Filkov: Gender and Tenure Diversity in GitHub Teams. CHI 2015: 3789-3798
46. code sees no color or gender
I have used a fake GitHub handle (my
normal GitHub handle is my first name,
which is a distinctly female name) so that
people would assume I was male
interactions are usually
positive too, with occasional
sexism, but nothing more then
one encounters in the rest of life
I'm the only female developer,
as well as the youngest, which
can sometimes be frustrating.
Mostly positive. A few collaborators were difficult to collaborate
with, hard to discern the real cause. Only one or two were
gender related, but one caused me to leave a project.
Bogdan Vasilescu, Vladimir Filkov, Alexander Serebrenik:
Perceptions of Diversity on GitHub: A User Survey. CHASE@ICSE 2015: 50-56
47. Summary: Studying Diversity
• Population consists of multiple subpopulations
• Challenges:
• Identification of subpopulations
• Identification of collaborations
• Measuring diversity (we have already talked about it)
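As a reminder of what measuring diversity over subpopulations can look like, here is a minimal sketch of the Blau index, one commonly used measure for categorical attributes. The slides in this section do not state which measure is meant, so the choice of index, the function name `blau_index` and the example labels are assumptions made for illustration.

```python
from collections import Counter


def blau_index(labels):
    """Blau index 1 - sum(p_i^2), where p_i is the share of category i.

    0.0 means every member falls in the same category; higher values
    mean the members are spread more evenly over more categories.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one label is required")
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Hypothetical team: one label per member for a single dimension.
team = ["group A", "group A", "group B", "group B"]
print(blau_index(team))
```

For a dimension with only two categories the index cannot exceed 0.5, so values are only comparable between teams when the same set of categories is used.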
48. Diversity Dimensions: Subpopulations
• Gender identity
• Sexual orientation
• Age
• Location/Region/Country
• Tenure (project/community)
• Socio-economic status
• Self-identification as underrepresented
• Race/Ethnicity
• First language
• Confidence with English
• Dis/Ability
• Neurodiversity
• Caregiving (children/elderly)
• …
https://github.com/chaoss/wg-diversity-inclusion/blob/master/di_metrics.md
52. W. Stevens. Teamwork builds ships. United States shipping board
emergency fleet corporation. Museum Vleeshuis. Antwerp.
53. Collaborations
Bogdan Vasilescu, Alexander Serebrenik, Mathieu Goeminne, Tom Mens: On the variation and specialisation of workload—A
case study of the GNOME ecosystem community. Empirical Software Engineering 19, 4 (2013), 955–1008.
Not only technical
54. Collaborations
Nicole Huesman, Daniel Izquierdo Cortázar, Allison Price. Gender Diversity Analysis in the OpenStack Community. Bitergia.
Work commissioned by Intel Corporation. November 2017
Not only technical Not only repositories
55. Arthurian Romances, French (ca. 1275-1300). Beinecke MS 229, Yale University Library, USA.
Diversity is not
enough!
58. Given the choice, I would never send another patch, bug
report, or suggestion to a Linux kernel mailing list again. My
personal boxes have oopsed with recent kernels, and I ignore
it. My current work on userspace graphics enabling may
require me to send an occasional quirks kernel patch, but I
know I will spend at least a day dreading the potential toxic
background radiation of interacting with the kernel
community before I send anything.
I am no longer a part of the Linux kernel community.
Sarah Sharp, http://sarah.thesharps.us/2015/10/05/closing-a-door/
59. To answer the obvious "so now that the bug is fixed and the account
is unblocked and Duncan is doing something different you're
coming back, right?": no, that's not why I left. I left because the
response I got to the bug was indicative of a severe problem
with how dispute resolution and handling of this type of issue
works.
I've heard a lot of suggestions from individual commentators which
seem to boil down to "in the future, email Person X or Person Y" but
what I need is the confidence that the system will work not just
for me, who knows some of the R Foundation and Core folks in
a passing way, but for people who don’t.
Olivier Keyes, https://ironholds.org/blog/an-r-update/
Photo by Guillaume Paumier
62. What is a Code of Conduct?
"Principles, values, standards, or rules of
behaviour that guide the decisions,
procedures and systems of an organization
in a way that (a) contributes to the welfare of
its key stakeholders, and (b) respects the
rights of all constituents affected by its
operations.”
International Federation of Accountants, 2007
64. [Chart: GitHub hits (axis ticks 1, 100, 10000) for Contributor Covenant, Open Code of Conduct, Python, Citizen, Ubuntu, Django, Geek Feminism]
7 Common Codes of Conduct have >500 hits across GitHub Projects
Parastou Tourani, Alexander Serebrenik, Bram Adams: Code of Conduct in Open Source Projects. 24th IEEE International
Conference on Software Analysis, Evolution, and Reengineering, pp. 24-33, 2017
79. Nicolaas Pietersz. Berchem (1620 - 1683). Harvest, KMSKB, Brussel
• People of different genders might have different needs. Involve them early in your design.
• Encourage diversity in your teams
• Diversity in ideas and not only in numbers
• More than diversity: inclusion, attraction, retention