
Measuring Gender Inequalities of German Professions on Wikipedia


Presentation for Master Thesis defense, University of Koblenz-Landau, GESIS – Leibniz Institute for the Social Sciences

Published in: Data & Analytics


  1. SLA 2014/15. Measuring gender inequalities of German professions on Wikipedia. Olga Zagovora, Institute for Web Science and Technologies, University of Koblenz-Landau, Germany. Supervisors: Prof. Dr. Claudia Wagner, Dr. Fabian Flöck
  2. Gender stereotypes. #RedrawTheBalance, www.inspiringthefuture.org. Watch video from: https://www.youtube.com/watch?v=kJP1zPOfq_0
  3. Profession article. Example. Images: http://de.wikipedia.org
  5. Method. Main dimensions: (1) Redirection analysis: classification of professions according to existing Wikipedia articles, based on the gender of the profession title (examples: “Hebamme” or “Entbindungspfleger”; “Kaufmann”, “Kauffrau”, or “Kaufleute”). (2) Image analysis, people in images: identification of the gender of people in images; comparison of the distributions of image categories, based on persons’ gender. (3) Textual analysis, people mentioned in the text: mining of person names from articles; comparison of the distributions of persons’ gender.
  6. Datasets. List of professions (based on the profession list from the “Bundesagentur für Arbeit”), n=4457, for example: "Lehrer": "Lehrerin"; "Krankenpfleger": "Krankenschwester"; "Entbindungspfleger": "Hebamme"; "PR-Fachkraft", "Fotomodell", "Aufsichtsperson". Wikipedia: articles about professions; images from profession articles; people mentioned in profession articles.
  7. Redirection analysis (dimension 1). Terminology: pages exist.
  9. Redirection analysis (dimension 1). Terminology: pages exist. Neutral case (no bias).
  10. Redirection analysis (dimension 1). Terminology: pages exist; no page. Neutral case (no bias).
  13. Redirection analysis (dimension 1). Terminology: pages exist; no page. Neutral case (no bias); male bias.
  14. Redirection analysis (dimension 1). Terminology: pages exist; no page. Neutral case (no bias); male bias; female bias.
  15. Redirection analysis (dimension 1). Terminology: pages exist; no page; redirects. Neutral case (no bias); male bias; female bias.
  16. Redirection analysis. Results: most articles have a male title; most redirects go from the female to the male title; 885 articles. [Figure: two stacked bar charts, 0% to 100%. Chart 1: shares of Wiki pages, redirects, and no page for male titles (n=4274), female titles (n=4274), and neutral titles (n=183). Chart 2: shares of redirects to the opposite gender and other redirects for male titles (n=503), female titles (n=310), and neutral titles (n=7).]
  17. Redirection analysis. Results: most articles have a male title; most redirects go from the female to the male title; 885 articles. [Figure: two stacked bar charts, 0% to 100%. Chart 1: shares of Wiki pages, redirects, and no page for male titles (n=4274), female titles (n=4274), and neutral titles (n=183). Chart 2: shares of redirects to the opposite gender and other redirects for male titles (n=503), female titles (n=310), and neutral titles (n=7).] Redirection bias groups: male, 812 professions; female, 6 professions; neutral, 55 professions.
  18. Is it only a Wikipedia-specific phenomenon? Data: Google hits for profession names.
  19. Is it only a Wikipedia-specific phenomenon? Data: Google hits for profession names. The German-speaking web is male biased: there are more sources for male than for female profession names.
  20. Does Wikipedia reflect the general bias on the Web? Data: Google hits for profession names. $\mathrm{Normalized\_difference}_i = \frac{\mathrm{Hits}_{male,i} - \mathrm{Hits}_{female,i}}{\mathrm{Hits}_{male,i} + \mathrm{Hits}_{female,i}}$. Logistic regression models. Dependent variable: Model 1, binary state of having male bias; Model 2, binary state of having female bias. Coefficients, Model 1: Normalized Google difference 2.44***, (intercept) 2.41***. Coefficients, Model 2: Normalized Google difference -5.93**, (intercept) -5.55**.
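
The normalized difference on slide 20 can be sketched in a few lines. This is a minimal illustration, assuming the Google hit counts for the male and female form of a profession name are available as plain integers; the function name and the sample counts are hypothetical and not taken from the thesis.

```python
def normalized_difference(hits_male: int, hits_female: int) -> float:
    """Return (male - female) / (male + female), a value in [-1, 1].

    Positive values mean more hits for the male form,
    negative values mean more hits for the female form.
    """
    total = hits_male + hits_female
    if total == 0:
        # The ratio is undefined when neither form has any hits.
        raise ValueError("no hits for either form")
    return (hits_male - hits_female) / total


# Hypothetical counts, for illustration only.
print(normalized_difference(900, 100))  # 0.8
```

A value per profession computed this way would then serve as the single predictor in the two logistic regression models listed on the slide.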
  21. Could we explain the Wikipedia phenomena with labor market statistics? Data: German labor market statistics. Rank-sum tests¹ between bias groups (z): male & neutral 0.83; male & female -3.32**; neutral & female -3.35**. Logistic regression model, dependent variable: binary state of having female bias. Coefficients: percentage of women 0.36**, (intercept) -35.5**. ¹ With two-stage p-value correction of Benjamini-Hochberg.
  22. Images analysis (dimension 2). Data: images from profession articles. CrowdFlower task.
  23. Images analysis. Results: 906 images from 885 Wikipedia articles; 3 judges per photo, with the response of the majority used (reliability of agreement κ = 0.75). Images are grouped according to the responses: “male”, “only male”, “mixed, but predominantly male” -> men in image; “female”, “only female”, “mixed, but predominantly female” -> women in image; “mixed, equal amount of men and women”; “gender is not recognizable”; “no person”.
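
The majority vote and grouping on slide 23 can be sketched as follows. This is a minimal illustration under stated assumptions: each image comes with a list of three response strings, the majority is taken before grouping, and images without a majority are returned as `None`. The names `majority_label`, `group_of`, and `GROUPS` are hypothetical; how the thesis handled three-way disagreement is not stated on the slide.

```python
from collections import Counter
from typing import Optional

# Mapping from crowd responses to the two merged groups named on the slide.
# Responses not listed here keep their own label as the group.
GROUPS = {
    "male": "men in image",
    "only male": "men in image",
    "mixed, but predominantly male": "men in image",
    "female": "women in image",
    "only female": "women in image",
    "mixed, but predominantly female": "women in image",
}


def majority_label(judgments: list[str]) -> Optional[str]:
    """Return the response given by more than half of the judges, else None."""
    if not judgments:
        return None
    label, count = Counter(judgments).most_common(1)[0]
    return label if count > len(judgments) / 2 else None


def group_of(label: str) -> str:
    """Map a response to its group; ungrouped responses map to themselves."""
    return GROUPS.get(label, label)


# Hypothetical judgments for one image.
winner = majority_label(["only male", "only male", "no person"])
print(group_of(winner))  # men in image
```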
  24. Do Wikipedia images reflect labor market statistics?
  25. Relation to labor market statistics: image results vs. German labor market statistics. Feature pairs and interpretations: (a) number of images depicting women vs. number of women in the labor market: the more images depicting women are in the article, the more women are working in the profession; (b) number of images depicting men vs. number of men in the labor market: correlation 0.088, interpretation: -; (c) percentage of images depicting men vs. percentage of men in the labor market and vs. percentage of women in the labor market: the higher the percentage of images depicting men is in the article, the higher the percentage of men and the lower the percentage of women is in the labor market; (d) percentage of images depicting women vs. percentage of women in the labor market: the higher the percentage of images depicting women is in the article, the higher the percentage of women is in the labor market. Remaining values of the correlation column, in slide order: 0.34***, -0.3***, 0.15*, 0.3***.
  26. Textual analysis (dimension 3). Data: people mentioned in profession articles. Gender identification according to the first name (accuracy=0.97); 5085 persons (4272 men and 813 women) from 885 articles; 411 articles with at least one person. Distribution of ratios of male names in an article: mean 0.83; median 0.98; 25% quantile 0.8; 75% quantile 1.0. Average number of persons per article: 10.4 male, 1.9 female. [Figure: distribution of ratios of male names, with the female-bias and male-bias ends marked.]
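
The per-article ratio of male names on slide 26 can be sketched as below. This is a minimal illustration, assuming each article is already reduced to a list of gender labels ("m" or "f") for the people mentioned in it; the data layout and the function names are hypothetical. Articles without any mentioned person are skipped, in line with the slide's "articles with at least one person".

```python
from statistics import mean, median
from typing import Optional


def male_ratio(genders: list[str]) -> Optional[float]:
    """Share of mentioned people labeled "m"; None if nobody is mentioned."""
    if not genders:
        return None
    return genders.count("m") / len(genders)


def summarize(articles: dict[str, list[str]]) -> dict[str, float]:
    """Mean and median male ratio over articles with at least one person."""
    ratios = [r for r in map(male_ratio, articles.values()) if r is not None]
    return {"mean": mean(ratios), "median": median(ratios)}


# Hypothetical articles, for illustration only.
articles = {
    "A": ["m", "m", "f", "m"],
    "B": ["m", "m"],
    "C": [],  # no mentioned people, excluded from the summary
}
print(summarize(articles))
```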
  27. Is there an effect of the gender of the article title on the ratio of mentioned men? Do articles with a male title have a higher ratio of mentioned men than articles with a neutral title? Than articles with a female title? Do articles with a neutral title have a higher ratio of mentioned men than articles with a female title? Rank-sum tests¹. H0: the two sets of ratios of mentioned men are drawn from the same distribution. H_alt: values in one set are more likely to be larger than the values in the other set. Result: male & female z=2.46*; median for “male title” 1.0; median for “female title” 0.65. ¹ With two-stage p-value correction of Benjamini-Hochberg.
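
The idea behind correcting p-values across several pairwise tests can be sketched with the plain Benjamini-Hochberg step-up procedure. Note that this is the standard one-stage version, not the two-stage variant named in the footnote, and the p-values below are hypothetical, not results from the thesis.

```python
def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # stay monotone in the rank of the raw p-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for three pairwise comparisons.
print(benjamini_hochberg([0.01, 0.04, 0.03]))
```

A comparison would then be reported as significant when its adjusted p-value falls below the chosen level.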
  28. Relation to labor market statistics: mentioned people in an article vs. German labor market statistics. Correlations: percentage of mentioned men vs. percentage of women in the labor market, -0.27***: the higher the percentage of mentioned men is in the article, the lower the percentage of women is in the profession; number of mentioned men vs. number of people in the labor market, -0.2**: the more men are mentioned in the article, the fewer people are employed in the profession; number of mentioned men vs. number of men in the labor market, -0.15***: the more men are mentioned in the article, the fewer men are employed in the profession; number of mentioned men vs. number of women in the labor market, -0.23***: the more men are mentioned in the article, the fewer women are employed in the profession. No correlation between: number of mentioned women & number of women in labor market; number of mentioned women & number of men in labor market.
  29. Is there an effect of history on the number of mentioned men? dbpedia -> birthDate; people are divided into those born before and after 1960. Tables: before 1960, after 1960. Correlations (cor) with # mentioned men, first table: amount of people in labor market -0.19**; amount of men in labor market -0.15*; amount of women in labor market -0.20**. Second table: amount of people in labor market -0.12*; amount of men in labor market -0.12; amount of women in labor market -0.11. The negative correlation between the ratio of mentioned men and the percentage of women in the labor market remains.
  30. Summary. Male bias over all dimensions: redirections, images, mentioned people. High female bias for some professions, for example “Model” (mentioned people) and “Hebamme” (images).
  31. Discussion & Outlook. Why does the male bias exist on Wikipedia? Male editors; implicit stereotypes of each individual; male bias in other media (including search engines such as Google). What can be done to reduce it? Attracting more female editors; development of Wikipedia equality rules; warning editors before a revision is accepted; profession equality lessons for kids. Future directions: cross-language analysis of gender inequalities for different Wikipedia editions; timestamp analysis of revisions; a software tool for Wikipedia editors.
  32. Questions? zagovora@uni-koblenz.de
  33. Images: Heinrich Heine; Fotografen beim Fußball © Ralf Roletschek / CC-BY-SA-3.0 / http://creativecommons.org/licenses/by-sa/3.0; Fotojournalisten bei der Fußball-Europameisterschaft 2008 © Arne Müseler / arne-mueseler.de / CC-BY-SA-3.0 / https://creativecommons.org/licenses/by-sa/3.0/de/deed.de; Lothar Loewe bei einem Vortrag im Juli 2009 © Bücherhexe / CC-BY-SA-3.0 / http://creativecommons.org/licenses/by-sa/3.0; 1. Jugendolympiade 2012 Innsbruck © Ralf Roletschek / CC-BY-SA-3.0 at / http://creativecommons.org/licenses/by-sa/3.0/at/deed.en; Reporter Heinz abel (PHOENIX) im Gespräch mit Peter Fahrenholz © André Zahn / CC-BY-SA-2.0 de / http://creativecommons.org/licenses/by-sa/2.0/de/deed.en; Bob Woodward, assistant managing editor © Jim Wallace (Smithsonian Institution) / CC-BY-2.0 / http://creativecommons.org/licenses/by/2.0; Journalisten bei der Fußball-Europameisterschaft 2008 © Arne Müseler / arne-mueseler.de / CC-BY-SA-3.0 / https://creativecommons.org/licenses/by-sa/3.0/de/deed.de; Oriana Fallaci in Tehran 1979; Oprah Winfrey at the Hotel Bel Air in Los Angeles © Alan Light / CC-BY-2.0 / http://creativecommons.org/licenses/by/2.0
  34. License: Measuring Gender Inequalities of German Professions on Wikipedia by Olga Zagovora is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. Based on a work at https://arxiv.org/abs/1702.00829.
