This document summarizes a quantitative study on gender representation and online participation. The study analyzed data from StackOverflow, WordPress, and Drupal to investigate gender ratios, levels of engagement, and other metrics. It found that women comprised 7-10% of participants compared to 1-5% for open source communities generally. While women asked more questions, there were no significant differences found in other engagement metrics between genders. The study hypothesized that competitive elements and anonymity may discourage some women from greater participation.
Software evolution research is a thriving area of software engineering research. Recent years have seen a growing interest in a variety of evolution topics, as witnessed by the growing number of publications dedicated to the subject. Without attempting to be complete, in this talk we provide an overview of emerging trends in software evolution research, such as extension of the traditional boundaries of software, growing attention for social and socio-technical aspects of software development processes, and interdisciplinary research applying research techniques from other research areas to study software evolution, and software evolution research techniques to other research areas. As a large body of software evolution research is empirical in nature, we are confronted by important challenges pertaining to reproducibility of the research, and its generalizability.
Software engineering is inherently a collaborative venture, involving many stakeholders that coordinate their efforts to produce large software systems. While the importance of human aspects in software engineering was already recognised in the 1970s, the emergence of open source software (late 1990s) and platforms such as Stack Overflow and GitHub (late 2000s) enabled the application of empirical methods to the study of human aspects of software engineering.
In the first part of the talk we present a selection of recent results pertaining to two main questions: who are the software developers, and in what kinds of activities do they engage? The second part of the talk focuses on tools and techniques that have been used to obtain the aforementioned results.
Assessing the available and accessible evidence: How personal reputations are...Frances Ryan
Slides for the conference paper 'Assessing the available and accessible evidence: How personal reputations are determined and managed online' presented at Information: interactions and impact 2015, Aberdeen, 23-26 June 2015. Abstract available at http://www.iidi.napier.ac.uk/c/publications/publicationid/13382473
From ICT to Computing. Presentation for the inaugural meeting of the Calderdal...Pete Bell
@petejbell
Getting the balance right:
Computing > Computer Science > Programming
Don't reinvent the wheel:
What you may already be doing in ICT that fits the new Computing Programme of Study
Assessment
Progression pathways
Warning that classifying people into too many levels = lower accuracy of assessment
SOLO
Resources
Algorithms Unplugged!
Sharing Practice
Lecture to SIPA students on basics of creating data visualisations in multi-language, very-diverse-datasets developing-world / emerging-economy environments.
NL-Graphs: A Hybrid Approach toward Interactively Querying Semantic DataSuvodeep Mazumdar
A variety of query approaches have been proposed by the semantic web community to explore and query semantic data. Each was developed for a specific task and employed its own interaction mechanism; each query mechanism has its own set of advantages and drawbacks. Most semantic web search systems employ only one approach, thus being unable to exploit the benefits of alternative approaches. Motivated by a usability and interactivity perspective, we propose to combine two query approaches (graph-based and natural language) as a hybrid query approach. In this paper, we present NL-Graphs which aims to exploit the strengths of both approaches, while ameliorating their weaknesses. NL-Graphs was conceptualised and developed from observations, and lessons learned, in several evaluations with expert and casual users. The results of evaluating our approach with expert and casual users on a large semantic dataset are very encouraging; both types of users were highly satisfied and could effortlessly use the hybrid approach to formulate and answer queries. Indeed, success rates showed they were able to successfully answer all the evaluation questions.
Lies, Damned Lies and Software Analytics: Why Big Data Needs Rich DataMargaret-Anne Storey
(Abstract and video links below)
ACM SIGSOFT Webinar May 4th, 2016
Distinguished lecture at ISR, UCI, April 2016.
UCI Video is available at: https://www.youtube.com/watch?v=Ujm4G7ayRQQ
Webinar link will be available shortly.
This talk is based on a short chapter to appear in a forthcoming book, "Perspectives on Data Science for Software Engineering", which can be preordered here:
http://goo.gl/Wi30Ra
Abstract:
Software analytics and the use of computational methods on "big" data in software engineering is transforming the ways software is developed, used, improved and deployed. Software engineering researchers and practitioners are witnessing an increasing trend in the availability of diverse trace and operational data and the methods to analyze it. This information is being used to paint a picture of how software is engineered and suggest ways it may be improved. But we have to remember that software engineering is inherently a socio-technical endeavour, with complex practices, activities and cultural aspects that cannot be externalized or captured by tools alone---in fact, they may be perturbed when trace data is surfaced and analyzed in a transparent manner.
In this talk, I will ask:
- Are researchers and practitioners adequately considering the unanticipated impacts that software analytics can have on software engineering processes and stakeholders?
- Are there important questions that are not being asked because the answers do not lie in the data that are readily available?
- Can we improve the application of software analytics using other methods that collect insights directly from participants in software engineering (e.g., through observations)?
I will explore these questions through specific examples. I hope to engage the audience in discussing how software analytics that depend on "big data" from tools, as well as methods that collect "thick" data from participants, can be mutually beneficial in improving software engineering research and practice.
Wimmics Research Team 2015 Activity ReportFabien Gandon
Extract of the activity report of the Wimmics joint research team between Inria Sophia Antipolis - Méditerranée and I3S (CNRS and Université Nice Sophia Antipolis). Wimmics stands for web-instrumented man-machine interactions, communities and semantics. The team focuses on bridging social semantics and formal semantics on the web.
Keynote at ICSME 2017, Shanghai, China.
Title: The Elusive Nature of Software Documentation and Why Understanding How Knowledge Flows Matters
Abstract: Many developers consider writing documentation to be a painful and under-appreciated activity, yet the same developers often complain that a lack of documentation significantly hampers their work. Other developers argue that documentation is passé as developers more readily curate and exchange knowledge through networked platforms such as Slack, Twitter, and Stack Overflow. And while the savvy modern developer will know who to follow, who to ask, and where to look when they need software knowledge, finding the right knowledge at the right time remains a serious development bottleneck for many. Recognizing that these platforms contain golden nuggets of useful information, we see tremendous effort being directed at designing methods for capturing, mining, extracting, and distributing software knowledge, but will they succeed if we lack a good understanding of how knowledge flows in software development projects and communities? Through this talk, I will discuss the elusive nature of documentation and why I believe documentation will always be hard to define, capture, distribute, keep up to date, and to find, and I will argue that we should focus more on understanding, supporting, and amplifying knowledge flow in distributed software development.
Gender, Representation and Online Participation: a Quantitative Study
1. Gender, Representation and Online Participation: a Quantitative Study
Dr Andrea Capiluppi
30 Oct 2013
Dept of Information Systems and Computing (DISC)
2. My research background
• Software engineering
– Software maintenance & evolution
– Software architectures, components & reuse
– Effort estimation
– Quantitative studies
• Open processes
– Open source products
– Social networks
• Wikipedia
• Q&A sites
3. The Fastest Q&A Site in the West
• StackOverflow is a “Question & Answer site for programmers”
– Part of the StackExchange network
• Most questions are answered
– StackOverflow (92.6%)
– Yahoo! Answers (88.2%)
– KiN (~66%)
• Median answer time of only 11 minutes!
Mamykina, L., Manoim, B., Mittal, M., Hripcsak, G., & Hartmann, B. (2011, May). Design lessons from the fastest q&a site in the west. In Proceedings of the SIGCHI conference on Human factors in computing systems (pp. 2857-2866). ACM.
4. Game Mechanisms in SO
• SO is based on points
– Reputation points
• Good answer
• Good comment
• Good question
• ...
– Badges
• Popular Question
• Commentator
• Necromancer
• …
– Privileges: more points give access to more features
• Voting
• Commenting
• Editing
5. How this work started
• Major conference, paper painting the awesomeness of StackOverflow
Lotufo, R., Passos, L., & Czarnecki, K. (2012, June). Towards improving bug tracking systems with game mechanisms. In Mining Software Repositories (MSR), 2012 9th IEEE Working Conference on (pp. 2-11). IEEE.
6. How this work started
• Paper was well received
• Questions from the audience:
– is SO attracting a male-only crowd?
• Wider questions:
– Are prizes, badges, reputation creating an unbalanced participation?
– Is “gaming” lethal for a social network? Making it less sustainable?
8. A bit of a touchy topic...
Regarding the FLOSS community as a whole, have you ever observed discriminatory behaviour against women?
FLOSSPOLS Deliverable D16 Gender: Integrated Report of Findings. http://www.flosspols.org/deliverables/D16HTML/FLOSSPOLSD16Gender_Integrated_Report_of_Findings.htm, 2006.
9. Demoted skills
• Online status and reputation: 'pro' and 'rookie'
– Technical skills: coding, debugging, etc.
– Non-technical skills: usability, web design, etc.
• (…) the skill of web design was demoted to a ‘non-technical’ status as it became a way in which women described and approached their work [Kotamraju 2003]
Kotamraju, N. 2003. Art versus Code: The Gendered Evolution of Web Design Skills. In Howard, P. and S. Jones (eds) Society Online: The Internet in Context. London: Sage.
11. Aim of the study
• Provide quantifiable evidence of gender participation and engagement
– Is gender ratio unbalanced?
– Is gender engagement unbalanced?
• Data sampling: Q&A sites
– StackOverflow
– WordPress
– Drupal
13. Research questions:
• RQ1: What are the challenges with identifying gender in online communities?
• RQ2: What is the rate of participation by women in online communities?
• RQ3: What is the level of engagement by women in online communities?
… (trying to) avoid moralistic messages
16. Empirical approach
• Data mining/Name extraction
• Gender resolution
• Detection of activity on
– StackOverflow
– Drupal
– WordPress
• Statistical comparison between genders
17. Data and name extraction
• StackOverflow public data dump
– 1,078,708 registered users
– Too much noise to automatically assign gender
– Random sampling
• 2% margin of error
• 99% confidence level
• Subset of 4,144 SO users
• Manual gender resolution
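The sampling figures on this slide are consistent with the standard sample-size formula for estimating a proportion, with a finite population correction. The slides do not state which formula was used, so the sketch below is only an assumption about how a figure of 4,144 could be reproduced: it assumes the conservative proportion p = 0.5 and a rounded z-score of 2.58 for the 99% level. The function name `sample_size` is illustrative.

```python
def sample_size(population, margin, z, p=0.5):
    """Sample size for estimating a proportion, with finite population correction.

    population: number of registered users in the data dump
    margin:     margin of error as a fraction (0.02 for 2%)
    z:          z-score for the chosen confidence level (assumed 2.58 for 99%)
    p:          assumed proportion; 0.5 is the most conservative choice
    """
    # Sample size for an infinitely large population.
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    # Shrink it slightly because the population is finite.
    return n0 / (1 + (n0 - 1) / population)


# 1,078,708 registered users, 2% margin of error, 99% confidence level.
n = sample_size(population=1_078_708, margin=0.02, z=2.58)
print(round(n))
```

Under these assumptions the result rounds to 4144, matching the subset size on the slide. With the more precise z-score of 2.576 the same formula gives a somewhat smaller sample, so the rounded z value is part of the assumption.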
18. Data and name extraction II
• Drupal and WordPress mailing lists
– Both separate Q&A into various sub-lists
• Consulting
• Development
• Support
• …
– Name, Surname, email address, text of email, <<in_response_to>> tag
– All messages & authors analysed
– Manual gender resolution
24. Heuristics: title + first h1
14/11/13
<title>Ben Kamens</title>
…
<h1>We’re willing to be embarrassed about what we <em>haven’t</em> done…</h1>
Ben Kamens We’re willing to be embarrassed about what we haven’t done…
Stanford Named Entity Tagger
<PERSON>Ben Kamens</PERSON> We’re willing to be embarrassed about what we haven’t done…
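The "title + first h1" heuristic on this slide can be sketched with Python's standard `html.parser` module. This is a minimal illustration, not the study's own tooling: the class and function names are hypothetical, and only the text-extraction step is shown. The resulting string would then be passed to a named entity tagger (the slide names the Stanford Named Entity Tagger), which is not reproduced here.

```python
from html.parser import HTMLParser


class TitleH1Extractor(HTMLParser):
    """Collect the text of <title> and of the first <h1> on a page."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.title_parts = []
        self.h1_parts = []
        self._target = None     # list currently receiving text, if any
        self._h1_seen = False   # True once the first <h1> has closed

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.title_parts:
            self._target = self.title_parts
        elif tag == "h1" and not self._h1_seen:
            self._target = self.h1_parts
        # Inline tags such as <em> leave the current target unchanged,
        # so their text is still collected.

    def handle_endtag(self, tag):
        if tag == "title" and self._target is self.title_parts:
            self._target = None
        elif tag == "h1" and self._target is self.h1_parts:
            self._target = None
            self._h1_seen = True

    def handle_data(self, data):
        if self._target is not None:
            self._target.append(data)


def title_and_first_h1(html):
    """Return '<title text> <first h1 text>' with whitespace collapsed."""
    parser = TitleH1Extractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.title_parts) + " " + "".join(parser.h1_parts)
    return " ".join(text.split())
```

Applied to a page shaped like the example on the slide, the function returns the plain string that the tagger receives as input.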
26. Quality of gender resolution: Survey
Self-identification   As inferred         Total
                      M     F     ?
M                     60    3     43      106
F                     2     5     4       11
+ avatars, other social media sites (manually)
Self-identification   As inferred         Total
                      M     F     ?
M                     90    3     13      106
F                     2     9     0       11
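The two survey tables on this slide can be summarised numerically. The sketch below assumes the counts are read as rows of self-identified gender against inferred gender (M, F, unresolved "?"), a reading under which each row adds up to its stated total (106 and 11). The metric names `coverage` and `accuracy_when_resolved` are labels chosen for this illustration and do not come from the slides.

```python
# Rows: self-identified gender. Columns: inferred M, inferred F, unresolved "?".
# Counts as read from the slide (an assumption about the table layout).
automated = {"M": (60, 3, 43), "F": (2, 5, 4)}
with_manual = {"M": (90, 3, 13), "F": (2, 9, 0)}


def summarize(table):
    """Share of respondents resolved, and share of resolved ones inferred correctly."""
    total = sum(sum(row) for row in table.values())
    correct = table["M"][0] + table["F"][1]      # inferred gender matches self-report
    unresolved = table["M"][2] + table["F"][2]   # the "?" column
    resolved = total - unresolved
    return {
        "total": total,
        "coverage": resolved / total,
        "accuracy_when_resolved": correct / resolved,
    }


for name, table in (("automated", automated), ("with manual checks", with_manual)):
    print(name, summarize(table))
```

Under this reading, adding avatars and other social media sites manually raises the share of resolved respondents from roughly 60% to roughly 89% of the 117 surveyed, while the share of resolved cases that match self-identification stays above 90% in both tables.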
34. Why?
• [Gneezy, Niederle, Rustichini 2003]: women are less effective in mixed-gender competitive environments
• [Niederle, Vesterlund 2007]: women shy away from competition and men embrace it
• To retain women we need different gamification techniques
35. Threats to validity
• Gender inference:
• Automated: Imprecise tooling
• Manual: Errare humanum est
• Gender swapping
• Images of other people as avatars
• Celebrities, children, porn stars…
38. Questions?
● Vasilescu, B., Capiluppi, A., Serebrenik, A. (2012): Gender, Representation and Online Participation: A Quantitative Study of StackOverflow. Social Informatics (SocialInformatics), 2012 International Conference on, pp. 332-338
● Vasilescu, B., Capiluppi, A., Serebrenik, A. (2013): Men at work: the StackOverflow case. Tiny Transactions on Computer Science, 2
● Vasilescu, B., Capiluppi, A., Serebrenik, A. (2013): Gender, Representation and Online Participation: A Quantitative Study. Interacting with Computers 2013; doi: 10.1093/iwc/iwt047
Editor's Notes
Advantages: controlled sample
Disadvantages: representative?
In any case: direction for future work
However, what is common to both Drupal and WordPress is that the differences in gender participation occur mostly between mailing lists focussing on designing technology (development, wp-hackers and wp-xmlrpc) and using technology (consulting, wp-docs and wp-edu).