Identifying Developers’ Gender: State of the Art

Presented at the Google diversity workshop. 

Studying gender diversity in software development teams/communities requires understanding gender of individual developers. In this talk I will provide an overview of different ways of asking developers about their gender as well as inferring gender information from the ways they present themselves and artefacts they create. We conclude by discussing limitations of the inference techniques and surveying concerns related to their application.

Published in: Science

  1. 1. Alexander Serebrenik Eindhoven U of Technology The Netherlands @aserebrenik Identifying Developers’ Gender: State of the Art 1 This talk should be positioned in the broader context of studying gender diversity and its impact on software development process as well as on the outcome of the software.
  2. 2. !2 Careful: gender is a complex social construct. Whatever one-word summary we plan to elicit from developers by whatever means, it will be necessarily an oversimplification of gender identity and gender-related experiences of the individual. Another word of warning is related to me continuously learning about gender and how to talk about gender. I am trying to pay attention to the words I choose but I might make mistakes.
  3. 3. !3 There are two ways of determining the gender of an individual: either by asking them or by using some kind of algorithm based on the artefacts representing the individual (name, profile picture) or created by them.
  4. 4. 4 Whatever technique we use, we should keep in mind that gender is privacy sensitive and should be treated as such. Open source contributors might be hiding their gender on purpose, e.g., many women developers prefer not to disclose their gender due to safety concerns. Some open source projects do not necessarily want us to know the genders of their members (but some do!) and companies might be sensitive to this topic as well.
  5. 5. !5 So let us start with discussing how we can identify gender when talking to people, i.e., conducting interviews or surveys.
  6. 6. 2004 !6 A highly influential guide on questionnaire design (which I do recommend) published in 2004 recommends the two check boxes. Sex vs gender
  7. 7. 0.7-0.9% !7 However, recent surveys of Stack Overflow and GitHub indicate that 0.7-0.9% of software developers do not identify as either men or women. This might not appear much but
  8. 8. 0.7-0.9% x 2 !8 it is twice as much as in the US population in general!
  9. 9. !9 Slightly better. However, this approach has been criticised for othering and for creating a sense of unease in participants. It can leave people feeling disrespected: they might not complete surveys and they might let fellow trans people know the research team doesn’t “get it”.
  10. 10. Bauer GR. Making sure everyone counts: considerations for inclusion, identification, and analysis of transgender and transsexual participants in health surveys. In: Coen S, Banister E, editors. What a difference sex and gender make. Vancouver: Institute of Gender and Health, Canadian Institutes of Health Research; 2012. pp. 59–67. !10 A better approach has been advocated by Greta Bauer in 2012. Unfortunately, my favourite survey platform, Google forms, does not seem to support this combination of an open answer in a multiple choice question unless it is the “other”.
  11. 11. Bauer GR. Making sure everyone counts: considerations for inclusion, identification, and analysis of transgender and transsexual participants in health surveys. In: Coen S, Banister E, editors. What a difference sex and gender make. Vancouver: Institute of Gender and Health, Canadian Institutes of Health Research; 2012. pp. 59–67. Bauer GR, Braimoh J, Scheim AI, Dharma C (2017) Transgender-inclusive measures of sex/gender for population surveys: Mixed-methods evaluation and recommendations. PLoS ONE 12(5): e0178043. [Chart, scale 0-100: answers Male / Female / Other for transfeminine (assigned male at birth, identify as women/non-binary) and transmasculine (assigned female at birth, identify as men/non-binary) participants] !11 However, while this question was clear and easily answered by cisgender participants, it did not clearly identify birth-assigned sex or gender identity. In the interviews this item was cognitively taxing for trans interview participants, who tried to figure out exactly what the researchers were asking, and reached different conclusions.
  12. 12. The GenIUSS Group. Best practices for asking questions to identify transgender and other gender minority respondents on population-based surveys. Herman JL, editor. Los Angeles (CA): The Williams Institute; 2014 p. 1–68. !12 The GenIUSS group has proposed the schema on the slide. Some people would be offended by this phrasing as it implies that trans female respondents are not necessarily female.
  13. 13. Do you consider yourself to be transgender? () Yes () No () Questioning Do you consider yourself to be gender non-conforming, gender diverse, gender variant, or gender expansive? () Yes () No () Questioning Are you intersex? () Yes () No () I don't know Where do you identify on the gender spectrum (check all that apply)? [] Woman [] Demi-girl [] Man [] Demi-boy [] Non-binary [] Demi-non-binary [] Genderqueer [] Genderflux [] Genderfluid [] Demi-fluid [] Demi-gender [] Bigender [] Trigender [] Two-Spirit [] Multigender/polygender [] Pangender/omnigender [] Maxigender [] Aporagender [] Intergender [] Maverique [] Gender confusion/Gender f*ck [] Gender indifferent [] Graygender [] Agender/genderless [] Demi-agender [] Genderless [] Gender neutral [] Neutrois [] Androgynous [] Androgyne [] Prefer not to answer [] Self Identify: _________________ Open demographics https://drnikki.github.io/open-demographics/questions/gender.html !13 Good news: it separates a question about trans* and about gender non-conforming. Even more good news: woman/man instead of female/male; the former puts more stress on identity as opposed to biology. More good news: the question explicitly refers to gender identity and avoids confusion reported for the survey instrument of Bauer. And even more news: “check all that apply”, i.e., someone can be both woman and non-binary. Bad news: there are too many options and we want to keep surveys (and particularly demographic parts of the surveys) short! Even more, some of these notions might be experienced as confusing or taxing. Maverique (pronounced mav-reek) is a specific nonbinary gender identity "characterized by autonomy and inner conviction regarding a sense of self that is entirely independent of male/masculinity, female/femininity or anything which derives from the two while still being neither without gender nor of a neutral gender." 
Maverique is not close to a female or male gender, and is not like a mix of them; the identity goes beyond the entire scope of the gender binary or any identities within and outside of it. Aporagender (from Greek apo, apor "separate" + "gender") is a nonbinary gender identity and umbrella term for "a gender separate from male, female, and anything in between (unlike Androgyne) while still having a very strong and specific gendered feeling" (that is, not an absence of gender or agender). Neutrois is a non-binary gender identity which is often associated with a "neutral" or "null" gender.
  14. 14. https://www.morgan-klaus.com/sigchi-gender-guidelines !14 A much better solution according to the HCI Guidelines for Gender Equity and Inclusivity is to ask an open-ended question. This might be difficult for us as researchers to process (code), but most software engineering surveys are relatively small, with a couple of hundred responses.
  15. 15. https://www.morgan-klaus.com/sigchi-gender-guidelines Where do you identify on the gender spectrum (check all that apply)? [] Woman [] Man [] Non-binary [] Prefer not to disclose [] Self Identify: _________________ !15 If you really want to run a huge survey and manual coding of answers is not an option, then the same HCI Guidelines for Gender Equity and Inclusivity recommend the following phrasing. Also here, notice that *multiple* options are possible, the respondents have means not to disclose their gender or to provide their own response.
  16. 16. 10-20% !16 However, whatever survey techniques we use and however we ask the questions, two open problems remain: scale of the data and lack of response. This is a problem if we want to perform a large scale data analysis to tease out minor effects using traditional statistical techniques, since to apply these techniques we need a lot of data to ensure the power of statistical tests. Our recent study that Bogdan is going to talk about tomorrow has involved ~60K individuals. With response rates of 10-20%, we would need to survey ~300K-600K developers to get this number; if everyone spams 600K respondents, the respondents will be even more fed up with us and will not answer our questions…
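The arithmetic behind the ~300K-600K estimate can be sketched as follows. This is only an illustration: the function name is hypothetical, and the 10-20% figure is read here as the expected survey response rate.

```python
def invitations_needed(target_responses: int, response_rate_pct: int) -> int:
    """Invitations required to expect `target_responses` answers.

    The response rate is given in whole percent so that the computation
    stays in integer arithmetic (ceiling division, no rounding surprises).
    """
    if not 0 < response_rate_pct <= 100:
        raise ValueError("response_rate_pct must be between 1 and 100")
    return -(-target_responses * 100 // response_rate_pct)


target = 60_000  # roughly the number of individuals in the study mentioned above
for pct in (10, 20):  # response rates taken from the slide
    print(f"{pct}% response rate: {invitations_needed(target, pct):,} invitations")
```

At a 10% response rate the sketch yields 600,000 invitations and at 20% it yields 300,000, which matches the range quoted in the notes.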
  17. 17. !17 Enter automatic gender detection mechanisms
  18. 18. Self-presentation + Artefacts created Gender !18 All these tools are based on the main assumption, namely, that gender can be inferred from the way developers present themselves (username, name, avatar) or artefacts they produce (code, comments, etc.)
  19. 19. 19 Basically many of the gender detection techniques look at the names. Many popular names are traditionally associated with a specific gender
  20. 20. https://previews.123rf.com/images/pavalena/pavalena1111/pavalena111100046/11314908-map-kingdom-of-belgium.jpg !20 This practice is well established and in some countries it is even recorded in laws and administrative procedures. This is the case for Belgium, where by law the first name should not be confusing. Most local administrations interpret it as “no girls’ names for boys, no boys’ names for girls”.
  21. 21. Andrea 21 However, the data we analyse comes from a mix of different countries, and certain names are more commonly associated with men in some countries and with women in other countries. For example, Andrea is typically a man’s name in Italy (IT) and a woman’s name in Germany (DE).
  22. 22. genderComputer !22 This is why, for example, the tool that Bogdan has developed in the past considers the location of the developer as the key to interpreting the gender associated with a particular name. I am using my profile not because I am a paradigmatic developer but because I do not have permission of other GH/SO contributors to use their profiles.
  23. 23. genderComputer !23 And of course he has also used heuristics to recognise the location based on zip codes, state abbreviations, top level domains and names of large cities.
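To make the location heuristics more concrete, here is a minimal sketch of how a free-text location and an email address could be mapped to a country. It is not genderComputer's actual code: the lookup tables are hypothetical and deliberately tiny, and the order in which the heuristics are tried is an assumption.

```python
import re

# Hypothetical, deliberately tiny lookup tables; a real tool needs far larger ones.
CITY_TO_COUNTRY = {
    "eindhoven": "Netherlands",
    "london": "United Kingdom",
    "milan": "Italy",
    "berlin": "Germany",
}
US_STATES = {"CA", "NY", "TX", "WA"}
TLD_TO_COUNTRY = {"nl": "Netherlands", "be": "Belgium", "it": "Italy", "de": "Germany"}
# US-style zip code; note that five-digit postal codes also exist elsewhere,
# which is one reason such heuristics remain error-prone.
US_ZIP = re.compile(r"\b\d{5}(?:-\d{4})?\b")


def guess_country(location: str = "", email: str = ""):
    """Return a best-guess country name, or None when no heuristic fires."""
    tokens = re.split(r"[,\s/]+", location.strip())
    # 1. Names of large cities.
    for token in tokens:
        if token.lower() in CITY_TO_COUNTRY:
            return CITY_TO_COUNTRY[token.lower()]
    # 2. State abbreviations such as "Davis, CA".
    if any(token in US_STATES for token in tokens):
        return "United States"
    # 3. Zip codes.
    if US_ZIP.search(location):
        return "United States"
    # 4. Country-code top level domain of the email address.
    if "@" in email:
        tld = email.rsplit(".", 1)[-1].lower()
        return TLD_TO_COUNTRY.get(tld)
    return None


print(guess_country("Eindhoven"))
print(guess_country("Davis, CA"))
print(guess_country("", "someone@example.be"))
```

Returning None when nothing matches keeps the uncertainty visible instead of forcing a guess, which matters because many profiles carry no usable location at all.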
  24. 24. Josh Terrell et al. genderComputer Bin Lin, Alexander Serebrenik: Recognizing gender of stack overflow users. MSR 2016: 425-429 Different data sets require different techniques 24 In 2016 we evaluated several gender detection mechanisms on SO data. The ground truth was obtained by combining information from several surveys conducted earlier. We considered 5 basic techniques and added GH to check whether additional information helps.
  25. 25. 25 However, location as inferred by genderComputer on its own is not enough. Many of us do not live in the countries where we were born. This person’s name is Andrea and they live in London. What do you think about the gender of this individual based on their name?
  26. 26. 26 NamSor takes surnames into account and hence can help with resolving the gender of individuals who no longer live in their countries of origin. Unfortunately, NamSor is a commercial tool using the freemium model. Moreover, NamSor works reasonably well only for “real” names as opposed to display names.
  27. 27. Automatically generated 11% No spaces 37% Three or more spaces 1% Two spaces 5% One space 46% !27 But, of course, viability of these approaches would depend on what share of GH developers or SO contributors provide information that can be interpreted as meaningful first and last names. The grey segment indicates percentage of the SO contributors with automatically generated usernames such as user12345. For these contributors no inference technique can be successful; both the red and the blue segments can be analysed by techniques such as genderComputer and NamSor; the red ones only by genderComputer
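The categories in the chart above can be reproduced with a few lines of code. This is a sketch under assumptions: the `user` followed by digits pattern is inferred from the user12345 example, and the sample names are made up for illustration.

```python
import re
from collections import Counter

# Assumption: automatically generated names look like "user12345".
AUTO_GENERATED = re.compile(r"^user\d+$")


def name_category(display_name: str) -> str:
    """Assign a display name to one of the five categories from the slide."""
    name = display_name.strip()
    if AUTO_GENERATED.match(name):
        return "automatically generated"
    # Count the gaps between whitespace-separated parts of the name.
    spaces = max(len(name.split()) - 1, 0)
    if spaces == 0:
        return "no spaces"
    if spaces == 1:
        return "one space"
    if spaces == 2:
        return "two spaces"
    return "three or more spaces"


# Made-up display names, for illustration only.
names = ["user12345", "jsmith", "Jane Smith", "Jan van Dijk", "A B C D"]
tally = Counter(name_category(n) for n in names)
for category, count in tally.items():
    print(f"{category}: {count}")
```

Running such a tally over a real dump of display names would show what share of contributors can be analysed by name-based techniques at all.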
  28. 28. Lucia Santamaría and Helena Mihaljević (2018), Comparison and benchmark of name-to-gender inference services. PeerJ Comput. Sci. 4:e156; DOI 10.7717/peerj-cs.156 7,076 names 3,811 male, 1,968 female, 1,297 unknown !28 errorCoded: % of individuals coded wrongly or not coded as opposed to the total number of predictions. errorCodedWithoutNA: % of individuals coded wrongly as opposed to the total number of predictions. errorGenderBias: tendency to predict women as men (neg) or men as women (pos).
  29. 29. Lucia Santamaría and Helena Mihaljević (2018), Comparison and benchmark of name-to-gender inference services. PeerJ Comput. Sci. 4:e156; DOI 10.7717/peerj-cs.156 Tends to overpredict women as men Tends to overpredict men as women Tends to overreport unknowns 7,076 names 3,811 male, 1,968 female, 1,297 unknown !29 errorCoded: % of individuals coded wrongly or not coded as opposed to the total number of predictions. errorCodedWithoutNA: % of individuals coded wrongly as opposed to the total number of predictions. errorGenderBias: tendency to predict women as men (neg) or men as women (pos). NamSor seems to do it quite well. Data: different collections of authors of scientific publications (Web of Science, PubMed, etc.).
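The three error measures named above can be sketched from a confusion matrix as follows. The exact denominators are an assumption based on the wording of the definitions (all predictions for errorCoded, only the predictions that received a gender for the other two), and the counts in the example are made up.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts of true gender vs. predicted label (u = left unclassified)."""
    mm: int  # men predicted as men
    mf: int  # men predicted as women
    mu: int  # men left unclassified
    ff: int  # women predicted as women
    fm: int  # women predicted as men
    fu: int  # women left unclassified


def error_coded(c: Confusion) -> float:
    """Share of individuals coded wrongly or not coded at all."""
    total = c.mm + c.mf + c.mu + c.ff + c.fm + c.fu
    return (c.mf + c.fm + c.mu + c.fu) / total


def error_coded_without_na(c: Confusion) -> float:
    """Share coded wrongly among those that received a gender (assumed denominator)."""
    return (c.mf + c.fm) / (c.mm + c.mf + c.ff + c.fm)


def error_gender_bias(c: Confusion) -> float:
    """Negative: women predicted as men more often; positive: the reverse."""
    return (c.mf - c.fm) / (c.mm + c.mf + c.ff + c.fm)


# Made-up counts, for illustration only.
example = Confusion(mm=80, mf=5, mu=15, ff=70, fm=20, fu=10)
print(f"errorCoded:          {error_coded(example):.3f}")
print(f"errorCodedWithoutNA: {error_coded_without_na(example):.3f}")
print(f"errorGenderBias:     {error_gender_bias(example):+.3f}")
```

In this made-up example the bias is negative, i.e. the hypothetical tool labels women as men more often than the other way around.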
  30. 30. Lucia Santamaría and Helena Mihaljević (2018), Comparison and benchmark of name-to-gender inference services. PeerJ Comput. Sci. 4:e156; DOI 10.7717/peerj-cs.156 !30 Closer inspection reveals a different story, however. The confidence of NamSor drops as we move to Asian names, particularly Eastern and South-Eastern Asian names. Half of the East-Asian names have a confidence score of 0!
  31. 31. !31 And this is indeed deeply problematic when trying to apply automatic gender inference techniques to software developers
  32. 32. With special thanks to Huilian Sophie Qiu (CMU) Huilian Sophie Qiu, Alexander Nolte, Anita Brown, Alexander Serebrenik, Bogdan Vasilescu. Going Farther Together: The Impact of Social Capital on Sustained Participation in Open Source 41st International Conference on Software Engineering (ICSE 2019), 2019, pp. 688-699!32 We also extracted features from the name itself, including the last character (e.g., in Spanish, names ending in ‘a’ tend to be female), the last two characters (e.g., in Japan, names ending in ‘ko’ tend to be female), and tri-grams and 4-grams to capture romanized Chinese, Japanese, and Korean names.
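The name-based features described above can be sketched as follows. Only the feature extraction is shown; how the features are fed into a classifier is not specified here, and the function name and lowercasing step are assumptions.

```python
def name_features(first_name: str) -> dict:
    """Extract character-level features from a (romanized) first name."""
    name = first_name.strip().lower()

    def ngrams(n: int) -> list:
        # All contiguous substrings of length n; empty if the name is shorter.
        return [name[i:i + n] for i in range(len(name) - n + 1)]

    return {
        "last_char": name[-1:],   # e.g. a final 'a' in Spanish names
        "last_two": name[-2:],    # e.g. a final 'ko' in Japanese names
        "trigrams": ngrams(3),
        "fourgrams": ngrams(4),
    }


features = name_features("Keiko")
print(features["last_two"])
print(features["trigrams"])
print(features["fourgrams"])
```

Character n-grams keep some signal for romanized Chinese, Japanese, and Korean names even when a name is absent from first-name dictionaries, although information lost during romanization cannot be recovered this way.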
  33. 33. With special thanks to Huilian Sophie Qiu (CMU) Huilian Sophie Qiu, Alexander Nolte, Anita Brown, Alexander Serebrenik, Bogdan Vasilescu. Going Farther Together: The Impact of Social Capital on Sustained Participation in Open Source 41st International Conference on Software Engineering (ICSE 2019), 2019, pp. 688-699!33 We have shared our results with NamSor and they plan on improving their accuracy when it comes to CJK names.
  34. 34. https://www.facelytics.io/en/ !34 Another way developers present themselves on social platforms is through profile pictures, which can be analysed using face recognition techniques; here we see that Facelytics has correctly identified my gender. Age-wise it is a bit off, since I am 43.
  35. 35. !35 But things do not always go that smoothly. Daniela Petruzalek, a transgender software developer.
  36. 36. ~30% autogenerated profile images !36 However, not everybody has a meaningful profile picture. For instance, ca. 30% of the Stack Overflow users only have a default profile picture that is automatically generated from the MD5 hash of the user’s email address.
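Why such default pictures carry no information about the person can be seen from how the hash is derived. The sketch below assumes the normalization (trimming and lowercasing) commonly used by avatar services such as Gravatar; the rendering of the actual image from the hash is not shown.

```python
import hashlib


def email_hash(email: str) -> str:
    """MD5 hex digest of a normalized email address.

    Assumption: the address is trimmed and lowercased before hashing,
    as avatar services such as Gravatar commonly do.
    """
    normalized = email.strip().lower()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


# The same address always yields the same digest, hence the same generated image.
print(email_hash(" User@Example.com "))
print(email_hash("user@example.com"))
```

Because the generated image is a deterministic function of the email address alone, face-based techniques have nothing to work with for these users.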
  37. 37. Bin Lin, Alexander Serebrenik: Recognizing gender of stack overflow users. MSR 2016: 425-429 Sample sizes by reputation and age (age not indicated / 15-25 / 26-31 / ≥32): reputation 1-199: 150 / 50 / 50 / 50; reputation 200-999: 150 / 50 / 50 / 50; reputation ≥1000: 150 / 50 / 50 / 50. !37 Moreover, not all profile images represent faces (rather than logos or cat pictures). This is why we have carefully selected 900 non-generated profile images and classified them manually. Reputation classes are related to different privileges associated with these classes; age intervals to the general distribution of the ages on SO.
  38. 38. Bin Lin, Alexander Serebrenik: Recognizing gender of stack overflow users. MSR 2016: 425-429 53% (479/900) !38
  39. 39. !39 Let us move to the discussion of artefacts created by software developers
  40. 40. Stefan Krüger, Ben Hermann. Can an Online Service Predict Gender? - On the State-of-the-Art in Gender Identification from Texts. Gender Equality Workshop ICSE 2019 !40 When it comes to gender recognition based on the artefacts created most of the approaches consider blog posts and Twitter data.
  41. 41. Stefan Krüger, Ben Hermann. Can an Online Service Predict Gender? - On the State-of-the-Art in Gender Identification from Texts. Gender Equality Workshop ICSE 2019 !41 For example, the work of Company & Wanner was designed in the first place for authorship attribution; similar authorship attribution techniques have been designed for source code. Is this the way to go? Do people of different genders code differently?
  42. 42. However… !42
  43. 43. Accuracy. Krüger and Hermann. Text: 62%-93% for different datasets. Qiu et al. Names: 60%-84% for different kinds of names. !43 The accuracy of our techniques is not perfect. It can be even lower for some subcommunities, e.g., for Chinese names, when some of the gender-specific information is lost during the romanization.
  44. 44. Bogdan Vasilescu, Vladimir Filkov, Alexander Serebrenik: Perceptions of Diversity on Git Hub: A User Survey. CHASE@ICSE 2015: 50-56 “I have used a fake GitHub handle (my normal GitHub handle is my first name, which is a distinctly female name) so that people would assume I was male” Reliability !44
  45. 45. Krüger and Hermann. Text. 100% Keyes. Face. 92.9-96.7% Stefan Krüger, Ben Hermann. Can an Online Service Predict Gender? - On the State-of-the-Art in Gender Identification from Texts. Gender Equality Workshop ICSE 2019 Os Keyes. The Misgendering Machines: Trans/HCI Implications of Automatic Gender Recognition. CSCW 2018 Santamaría and Mihaljević. Names. 20% Gender binary !45 Most automatic techniques we discuss assume gender binary. These are percentages of papers reviewed in two meta-studies. Keyes: the first number corresponds to the % in papers that introduce automatic gender recognition and the second one - to papers that use automatic gender recognition. The situation with names is a bit better since the tools tend to be probabilistic and at least recognise their own lack of confidence.
  46. 46. !46 We have discussed two large groups of techniques for identifying the contributors’ gender: asking questions and applying algorithmic tools. None of the techniques is perfect, and the choice of technique should of course depend on the research questions. However, it might be equally important to discuss the limitations and problems of these techniques (and not only the advantages that made us choose them).
  47. 47. !47 I would like to conclude this talk by the following calls for action
  48. 48. !48 The two points I would like to make are (1) more focus on gender beyond the binary, which would require rethinking how to approach underrepresented communities, and (2) gender in combination with other diversity attributes (age, culture, …). These narratives are missing. Call for action: in the same way as we have adapted NamSor to include East-Asian names we need to be aware of cultural differences and take them into account when analysing developers’ communities.
  49. 49. First steps !49
  50. 50. Denae Ford, Reed Milewicz, Alexander Serebrenik. How Remote Work Can Foster a More Inclusive Environment for Transgender Developers Workshop on Gender Equality in Software Engineering, 2019, pp. 9-12 With special thanks to Denae Ford (NCSU) !50 First steps
  51. 51. Control of Identity Disclosure: The desire to be seen as presented Denae Ford, Reed Milewicz, Alexander Serebrenik. How Remote Work Can Foster a More Inclusive Environment for Transgender Developers Workshop on Gender Equality in Software Engineering, 2019, pp. 9-12 “Stack Overflow has constrained expressions of identity. It’s up to you what content you want to fill in. GitHub for a while it was required you expose your email address to the rest of the world.” Petruzalek: The obvious drawback of not being passable is that you become an instant target. So passability is not only an identity goal, its also a mean of self-preservation !51
  52. 52. Economically Stable Work: Distance technical merits from identity Denae Ford, Reed Milewicz, Alexander Serebrenik. How Remote Work Can Foster a More Inclusive Environment for Transgender Developers Workshop on Gender Equality in Software Engineering, 2019, pp. 9-12 !52 A community with 58,481 members hunting for bounties and earning rewards. 30 percent of respondents to the 2015 U.S. Transgender Survey reported being mistreated in the workplace, denied a promotion, or fired because of their gender expression or gender identity. Transgender Americans experience higher levels of unemployment (15% vs 5%), poverty (29% vs 12%) and homelessness (12% vs 0.2%) than their non-transgender peers. http://www.engagetu.com/2018/04/12/economics-and-the-transgender-community/
  53. 53. Economically Stable Work: Distance technical merits from identity Denae Ford, Reed Milewicz, Alexander Serebrenik. How Remote Work Can Foster a More Inclusive Environment for Transgender Developers Workshop on Gender Equality in Software Engineering, 2019, pp. 9-12 You cannot tell from my technical profiles that I’m transgender. I don’t make a big deal that in professional context. It’s just not relevant Ross: “Technology has totally leveled the playing field for someone like me. I can get on the internet and watch tutorials. I have the drive to spend five hours a day to teach myself a skill.” !53
  54. 54. Control of Identity Disclosure: The desire to be seen as presented Economically Stable Work: Distance technical merits from identity Autonomy to Disengage or Reengage Denae Ford, Reed Milewicz, Alexander Serebrenik. How Remote Work Can Foster a More Inclusive Environment for Transgender Developers Workshop on Gender Equality in Software Engineering, 2019, pp. 9-12 !54 And it turns out that these advantages stem from the fact that software developers can work remotely. They can learn remotely, they can work remotely on such platforms as BountySource. In fact, such numbers as 60% of remote workers have been mentioned by software development companies.
  55. 55. Denae Ford, Reed Milewicz, Alexander Serebrenik. How Remote Work Can Foster a More Inclusive Environment for Transgender Developers Workshop on Gender Equality in Software Engineering, 2019, pp. 9-12 We believe that remote work offers a mechanism of control for identity disclosure and empowerment of software developers from any marginalized communities. !55 And it turns out that these advantages stem from the fact that software developers can work remotely. In fact, we believe that remote work offers a mechanism of control for identity disclosure and empowerment of software developers from any marginalized communities.
  56. 56. With special thanks to Margaret Burnett (Oregon State University) !56 And here I would like to compare the discussion of remote work with the wonderful example of Margaret Burnett that shows how technological solutions supporting one community can help many different ones. This is a picture from Amsterdam.
  57. 57. !57 With special thanks to Margaret Burnett (Oregon State University)
  58. 58. !58 Summary: to achieve gender equality, diversity and inclusion goals we need to understand the experiences of people of different genders.
  59. 59. !59 Understanding their experiences requires identification of those genders; identification, manual or automatic, of an individual’s gender is a problematic and sensitive subject. All existing solutions have their limitations.
  60. 60. @aserebrenik !60 This being said, understanding the experiences of people of different genders is essential and, as the last study conjectures, it can benefit not only one marginalised community but many of them.
