It's a Man's Wikipedia?

Claudia Wagner, Post Doc at GESIS & Univ. of Koblenz
Berlin, March 2015
Who are your life heroes?
How did you learn about them?
The heroes we share are the
heroes we have
Our Study
• Compare for men and women:
– Coverage
– Lexical Presentation
– Structural Position
– Visibility
Claudia Wagner, David Garcia, Mohsen Jadidi and Markus Strohmaier, It's a
Man's Wikipedia? Assessing Gender Inequality in an Online Encyclopedia,
The International AAAI Conference on Web and Social Media (ICWSM2015)
Coverage
Coverage in 2011
• Britannica versus Wikipedia Coverage
– Reference Lists: e.g. The Atlantic’s 100 most
influential figures in American history
– Wikipedia misses 13% of women and 5% of men
– Britannica misses 49% of women and 33% of men
– Wikipedia’s coverage is more exhaustive
– Women are 2.6 times (13/5) as likely as men to be
omitted from Wikipedia and 1.48 times (49/33) as
likely to be omitted from Britannica
Reagle, Joseph; Rhue, Lauren (2011). "Gender Bias in Wikipedia and Britannica". International
Journal of Communication 5: 1138–1158.
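The ratios on this slide divide one omission rate by the other, which gives a rate ratio; an odds ratio in the strict sense would come out somewhat larger. The following is a minimal Python sketch using only the percentages quoted above. The function names are illustrative and are not taken from Reagle and Rhue.

```python
def rate_ratio(p_women, p_men):
    """Ratio of omission rates (what the slide computes as 13/5 and 49/33)."""
    return p_women / p_men


def odds_ratio(p_women, p_men):
    """Ratio of omission odds, where odds = p / (1 - p)."""
    odds_women = p_women / (1.0 - p_women)
    odds_men = p_men / (1.0 - p_men)
    return odds_women / odds_men


# Omission rates quoted on the slide: (women, men).
sources = {"Wikipedia": (0.13, 0.05), "Britannica": (0.49, 0.33)}

for name, (p_women, p_men) in sources.items():
    print(name,
          "rate ratio:", round(rate_ratio(p_women, p_men), 2),
          "odds ratio:", round(odds_ratio(p_women, p_men), 2))
```

Both measures point in the same direction here; they differ mainly in how strongly they weight the high omission rates in Britannica.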
Our Study: Data
• 11% women in Freebase
• 3% women in HA (Human Accomplishment: people who made
contributions to arts and science prior to 1950)
• 13% women in Pantheon
Coverage
Visibility
Visibility
Visibility
Structure
Asymmetry
L(from=M, to=W) = -0.26
L(from=W, to=M) = -0.14
Asymmetry
Asymmetry
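The asymmetry values compare the conditional probability that an article about one gender links to an article about the other gender with the unconditional probability of linking to that gender. The sketch below assumes a log-likelihood ratio of the form log(P(to | from) / P(to)) with natural logarithms; the exact definition and log base used in the study are not stated on the slides, and the link counts are made up for illustration.

```python
import math


def log_likelihood_ratio(links, src, dst):
    """log( P(target is dst | source is src) / P(target is dst) ).

    `links` maps (source_gender, target_gender) to a link count.
    """
    total = sum(links.values())
    from_src = sum(n for (s, _), n in links.items() if s == src)
    to_dst = sum(n for (_, d), n in links.items() if d == dst)
    p_conditional = links[(src, dst)] / from_src
    p_unconditional = to_dst / total
    return math.log(p_conditional / p_unconditional)


# Hypothetical link counts between biographies (not data from the study).
links = {("M", "M"): 80, ("M", "W"): 5, ("W", "M"): 9, ("W", "W"): 6}

l_mw = log_likelihood_ratio(links, "M", "W")
l_wm = log_likelihood_ratio(links, "W", "M")
print(f"L(from=M, to=W) = {l_mw:.2f}")
print(f"L(from=W, to=M) = {l_wm:.2f}")
```

A negative value means the source gender links to the target gender less often than articles do on average; the more negative of the two values marks the stronger under-linking.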
Assortativity
L(from=M, to=M) = 0.28
L(from=W, to=W) = 0.15
Assortativity
Assortativity
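The two assortativity values can be reproduced from the toy fractions given in the speaker notes. The sketch below assumes the per-gender coefficient has the form (e_gg - a_g * b_g) / (1 - sum over genders of a_h * b_h), where e_gg is the share of links that stay within gender g, a_g the share of links starting at g, and b_g the share ending at g. These symbol names are labels chosen here, not notation from the paper.

```python
def group_assortativity(e_same, a_out, b_in, expected_same_total):
    """Per-gender assortativity from link fractions.

    e_same: fraction of all links that go from the gender to itself
    a_out: fraction of all links that start at the gender
    b_in: fraction of all links that end at the gender
    expected_same_total: sum of a_out * b_in over both genders
    """
    return (e_same - a_out * b_in) / (1.0 - expected_same_total)


# Toy fractions from the speaker notes (men: 5/10, 5/10, 7/10; women: 2/10, 4/10, 3/10).
expected = 0.5 * 0.7 + 0.4 * 0.3

l_mm = group_assortativity(0.5, 0.5, 0.7, expected)
l_ww = group_assortativity(0.2, 0.4, 0.3, expected)
print(f"L(from=M, to=M) = {l_mm:.2f}")  # prints 0.28
print(f"L(from=W, to=W) = {l_ww:.2f}")  # prints 0.15
```

A positive value indicates that articles about a gender link to the same gender more often than expected under random mixing.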
Importance
So what?!?!
Algorithms often use structural properties
to determine importance (e.g. PageRank)
– Researchers need to understand social
consequences of algorithms
– 28 Feb 2015: “Google wants to rank
websites based on facts not links”,
New Scientist
http://www.newscientist.com/article/mg22530102.600-google-wants-to-rank-websites-based-on-facts-not-links.html
PageRank
Eom YH, Aragón P, Laniado D, Kaltenbrunner A, Vigna S, et al. (2015) Interactions
of Cultures and Top People of Wikipedia from Ranking of 24 Language Editions.
PLoS ONE 10(3): e0114825. doi:10.1371/journal.pone.0114825
http://127.0.0.1:8081/plosone/article?id=info:doi/10.1371/journal.pone.0114825
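To make the link between structure and ranking concrete, here is a minimal PageRank sketch using plain power iteration, followed by a count of women among the top-ranked biographies. The graph, the gender labels, and the helper names are hypothetical; this is not the ranking pipeline of Eom et al. or of the study.

```python
def pagerank(out_links, damping=0.85, iterations=100):
    """Power-iteration PageRank on a dict: node -> list of linked nodes."""
    nodes = list(out_links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, targets in out_links.items():
            # A node without out-links spreads its rank over all nodes.
            receivers = targets if targets else nodes
            share = damping * rank[node] / len(receivers)
            for target in receivers:
                new_rank[target] += share
        rank = new_rank
    return rank


def women_in_top_k(rank, gender, k):
    """Count nodes labelled 'W' among the k highest-ranked nodes."""
    top = sorted(rank, key=rank.get, reverse=True)[:k]
    return sum(1 for node in top if gender[node] == "W")


# Hypothetical biography link graph and gender labels.
out_links = {"A": ["B"], "B": ["A"], "C": ["A"], "D": ["A", "C"]}
gender = {"A": "M", "B": "W", "C": "M", "D": "W"}

rank = pagerank(out_links)
print(women_in_top_k(rank, gender, k=2))
```

Because rank flows along in-links, an article that receives few links from highly ranked articles stays low in the ranking regardless of its content.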
Text
Finkbeiner Test
http://en.wikipedia.org/wiki/Finkbeiner_test
Discriminative Words (DE)
Women
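• Autorin (author, female form)
• Ehemann (husband)
• Künstlerin (artist, female form)
• Gatte (husband, spouse)
• Schriftstellerin (writer, female form)
• Herzogin (duchess)
• Weiblich (female)
• Tänzerin (dancer, female form)
• Schauspielerin (actress)
• Mrs
• Großmutter (grandmother)
• Tante (aunt)
• Miss
• Heirat (marriage)
• Freundin (female friend, girlfriend)
• Prinzessin (princess)
• Gemahlin (wife, consort)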
Men
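• Befördert (promoted)
• Repräsentantenhaus (House of Representatives)
• Directory
• Amtszeit (term of office)
• Republican
• Division
• Senat (senate)
• Gouverneur (governor)
• Congress
• Biographical
• Mannschaft (team)
• Rechtsanwalt (lawyer)
• Senator
• Expedition
• Demokrat (Democrat)
• Professor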
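According to the speaker notes, lists like these were obtained by training a Naive Bayes classifier and letting it pick the words that best separate the two classes. The following is a minimal sketch of that idea in plain Python, ranking words by the smoothed log ratio of their per-class probabilities. The toy documents, the whitespace tokenizer, and the smoothing value are assumptions for illustration, not the study's setup.

```python
import math
from collections import Counter


def discriminative_words(docs_a, docs_b, top_n=3, alpha=1.0):
    """Rank words by log P(word | class A) - log P(word | class B).

    Uses multinomial Naive Bayes word likelihoods with add-alpha smoothing.
    Returns (most typical words for A, most typical words for B).
    """
    counts_a = Counter(w for doc in docs_a for w in doc.lower().split())
    counts_b = Counter(w for doc in docs_b for w in doc.lower().split())
    vocab = set(counts_a) | set(counts_b)
    total_a = sum(counts_a.values()) + alpha * len(vocab)
    total_b = sum(counts_b.values()) + alpha * len(vocab)
    score = {
        w: math.log((counts_a[w] + alpha) / total_a)
        - math.log((counts_b[w] + alpha) / total_b)
        for w in vocab
    }
    # Ties are broken alphabetically so the output is deterministic.
    top_a = sorted(vocab, key=lambda w: (-score[w], w))[:top_n]
    top_b = sorted(vocab, key=lambda w: (score[w], w))[:top_n]
    return top_a, top_b


# Made-up biography snippets, one list per class.
women_docs = ["author married husband", "actress married husband daughter"]
men_docs = ["senator elected office", "senator professor expedition"]

women_words, men_words = discriminative_words(women_docs, men_docs)
print("women:", women_words)
print("men:", men_words)
```

On real biographies one would typically remove names and very rare words first, since a word that appears in only one class can otherwise dominate the ranking.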
Text
Discriminative Words (EN)
Discriminative Words (ES)
Text
“Biographies of women on Wikipedia
disproportionately focus on marriage
and divorce compared to those of
men.”
David Bamman, Noah Smith. "Unsupervised Discovery of Biographical
Structure from Text", Transactions of the Association for Computational
Linguistics, 2, 2014 (pp. 363–376), p. 369.
Summary
• Good News:
– Visibility and coverage of women look good
• Bad News:
– Structural inequality: what are the
consequences?
– How women are portrayed needs to be
improved
http://en.m.wikipedia.org/wiki/User:GGTF/Writing_about_women
Article-Writing Interaction Graph
Evolution
WikiWho and WikiVis
wikiwho
Fabian Flöck
WikiWho Plugin
Fabian Flöck
WhoVis
Fabian Flöck
It's a Man's Wikipedia?
Future Questions…
• What causes the bias?
– Wikipedia bias versus general media bias?
– Male versus female editors?
• Bias over time
– Does the community improve?
Thank You
claudia.wagner@gesis.org fabian.flöck@gesis.org
More information on WikiWho and WikiVis
http://f-squared.org/wikiwho/


Editor's Notes

  1. In June of this year we will publish a study whose title begins with a question... Why is this question important?
  2. A short thought experiment: who are or were your heroes, that is, people you admire?
  3. Wikipedia is becoming increasingly important as a source of knowledge. It is therefore important to keep reminding ourselves that the people we consider important enough to include in Wikipedia are the people many others will read and learn about, and who therefore have the potential to become personal heroes or at least to gain higher visibility.
  4. For this reason we considered it important to ask whether men and women are covered differently in Wikipedia. We compared the coverage, the textual presentation, the structural position, and the visibility of women and men.
  5. The probability that an important woman or an important man is missing from Britannica is similar, whereas on Wikipedia women have a higher chance of being missing. Of the some 1,500 authors who contributed to the 11th Britannica, 35 were women (about 2%), and no woman was listed among the 49 editorial advisors. On Wikipedia, around 10% of editors are women.
  6. Because we were also interested in coverage, that is, how many important women and men are described on Wikipedia, we first had to look for reference lists: external sources that list important women and men. We chose the following three sources: Freebase, which is of course not independent of Wikipedia but is maintained by a different community and draws on several data sources; Pantheon, an MIT project that lists important people semi-manually; and Human Accomplishment, a book by the political scientist Charles Murray, who manually listed the most important people in history who made important contributions to the sciences or the arts BEFORE 1950.
  7. One can see that the heroes from Murray's list are covered best in all Wikipedia editions. The low share of women is of course due to history. Nevertheless, the few women who managed to make important contributions to science and art despite unequal conditions are covered very well in all languages!
  8. A study from 2010 says: “Nine men to every one woman on a portal that represents the greatest easily accessible store of knowledge is outrageously disproportionate and unacceptable” (RMJ, 2010).
  9. How many men and women are featured on the main page of the English Wikipedia? Shown here is the proportion of women versus that of men. Although slight differences are visible, they are not significant.
  10. Is there an asymmetry in the cross-gender link network? That is, do men link more to women than women link to men, or the other way round? We compare the conditional probability that gender 1 links to gender 2 with the unconditional probability that someone links to gender 2.
  11. There are fewer links from men to women than from women to men. Put more precisely, men link to women less than articles do on average. Women also link to men less than articles do on average. However, the effect is stronger for men linking to women, i.e. they fall further below the average.
  12. L(F,M) - L(M,F): both log-likelihood ratios are negative, and L(M,F) is smaller than L(F,M). EN: -0.5 - (-0.7) = 0.2
  13. Assortativity describes whether similar nodes in a network tend to be connected to each other or whether mixing emerges. A positive coefficient indicates assortativity, a negative coefficient indicates mixing. L(from=M, to=M) = ((5/10) - (5/10)*(7/10)) / (1 - ((5/10)*(7/10) + (4/10)*(3/10))) = 0.15/0.53 = 0.28. L(from=W, to=W) = ((2/10) - (4/10)*(3/10)) / (1 - ((5/10)*(7/10) + (4/10)*(3/10))) = 0.08/0.53 = 0.15.
  14. Assortativity can be observed for both genders, but it is considerably more pronounced for women.
  15. Average between L(F,F) and L(M,M). Null models: (a) randomized gender model: shuffle the genders of nodes; (b) randomized link end model: rewire links to random articles, maintaining out-degrees but fully randomizing in-degrees; (c) randomized link origin model: maintain link ends but rewire their origin to an article sampled at random, which maintains in-degrees but randomizes out-degrees.
  16. What are the theoretical implications of the link structure? Looking simply at the in-degree distribution, i.e. how many incoming links articles about women have versus articles about men, one sees that there are articles about men with a very large number of in-links. Articles about women do not have that.
  17. A core is broadly defined as a maximum-size subgraph of a graph that is coherent and dense. Find a subgraph where all nodes have enough out-links and in-links to the rest of it. Clearly, it is not enough for a node to have a large in-degree and/or out-degree in order to be a member of such a core. What counts, on top of this, is that the node forms part of a community in which each member satisfies the same in-degree and/or out-degree requirements with respect to all the other community members.
  18. Number of women among the top 100 PageRank results that refer to biographies.
  19. The last dimension we looked at was the textual description of women and men. Are there differences here that go beyond what we might still expect? The Finkbeiner test lists aspects that are usually mentioned in biographies of women but not in those of men, e.g. family status, gender... TF-IDF of words: words such as "woman" can receive high scores because they occur only in the minority class, but words such as "wedding", "divorce", etc. do not. We train a Naive Bayes classifier and let the classifier choose the words that are most effective at distinguishing the two classes.
  20. German Wikipedia for men and women.
  21. More than 1/3 of the most discriminative words for articles about women belong to one of the 3 categories. For men, by contrast, only 0-3% of the most discriminative words fall into these categories. The male gender as null gender? One does not need to mention that an article is about a man because the context already defines that. Especially in the English and Russian Wikipedia, the top 25 words for the class women predominantly fall into one of the 3 categories.
  22. Example of the most discriminative words in the English Wikipedia.
  23. In our work so far we have not answered the question of WHY. It remains unclear whether the biases we measure are merely a historical artifact or are caused by the editor community. One can approach this question, however, with tools that make the editing process transparent. A colleague is working on developing such tools. Concretely, the aim is to make the entire history of the collaborative editing process transparent and to show, in the current revision, where the information comes from: which words originally came from whom, and who changed or deleted whose text.