Gender Gap in Collaborative Platforms:
Language and emotions in Wikipedia Discussions
David Laniado, Daniela Iosub, Carlos Castillo,
Mayo Fuster Morell and Andreas Kaltenbrunner
david.laniado@eurecat.org
Universitat Pompeu Fabra, January 17, 2017
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 1 / 58
Outline
1 Introduction
Wikipedia content biases
Wikipedia gender gap
Wikipedia discussion spaces
Goal and research questions
2 Framework of analysis
Data acquisition and pre-processing
User gender labelling
Language and sentiment analysis
3 Results
Emotions and status
Emotions and gender
Networked emotions
4 Conclusions
Wikipedia is a teenager
Happy birthday Wikipedia!
English Wikipedia is now 16 years old
Catalan Wikipedia will be 16 in March (the second oldest one)
The largest human knowledge repository
Fifth most visited web site
Among top results for search queries about almost any topic
Largest collaborative project
Shapes and reflects public opinion... with some bias
Wikipedia’s social experiment
"The problem with Wikipedia is that it only works in practice. In theory,
it can never work."
(A popular joke among Wikipedians)
A crazy idea: anyone can edit
All relevant points of view should be represented
Policies of Notability and Neutral Point of view
Quality assured by editors’ negotiations over content
The more people with different points of view contributing, the
better the quality
→ Biases in the editor community may cause biases in the content
Bias in the content?
Top global biographies by birth country (Young-Ho Eom et al, 2015)
top central biographies from each of the 24 major Wikipedias
the 100 most central (PageRank) in each version’s hyperlink network
→ striking geographic bias
http://www.quantware.ups-tlse.fr/QWLIB/topwikipeople/geofigs/pagerank24x100.html
Top Women Biographies
Rank | NA | Female figure (by PageRank) | CC | Century | LC
1 | 24 | Elizabeth II | UK | 20 | EN
2 | 17 | Mary (mother of Jesus) | IL | -1 | HE
3 | 12 | Queen Victoria | UK | 19 | EN
4 | 6 | Elizabeth I of England | UK | 16 | EN
5 | 2 | Maria Theresa | AT | 18 | DE
6 | 1 | Benazir Bhutto | PK | 20 | HI
7 | 1 | Catherine the Great | PL | 18 | PL
8 | 1 | Anne Frank | DE | 20 | DE
9 | 1 | Indira Gandhi | IN | 20 | HI
10 | 1 | Margrethe II of Denmark | DK | 20 | DA
Top 10 global female historical figures by PageRank for the 24 major
Wikipedia editions (Young-Ho Eom et al, 2015)
NA → number of language editions in which a biography appears in the
top 100 rank
CC → birth country code
LC → language code corresponding to the birth country
Top biographies by gender
Number of women among the top global biographies by birth century
Number of women among the 100 most central biographies for each
language edition
Wikipedia editor gender gap
Estimated participation of women
2011 editor survey: 9%
2013 editor survey: 13%
corrected with propensity score estimation: 16% (Mako Hill and
Shaw, 2013)
while in most online social networks women are more active!
Why do women not edit Wikipedia?
1 A lack of user-friendliness in the editing interface
2 Not having enough free time
3 A lack of self-confidence
4 Aversion to conflict and an unwillingness to participate in lengthy
edit wars
5 Belief that their contributions are too likely to be reverted or
deleted
6 Some find its overall atmosphere misogynistic
7 Wikipedia culture is sexual in ways they find off-putting
8 Being addressed as male is off-putting to women whose primary
language has grammatical gender
9 Fewer opportunities than other sites for social relationships and a
welcoming tone
(Sue Gardner, 2011)
Emotional factors and discussions
Importance of emotional factors
Discussion spaces are fundamental to the collaborative process
Discussion triggers emotions and breeds particular emotional environments
Wikipedia’s most visible side
Article talk pages
Discussions in article talk pages
Studying emotions and language in talk pages
Interactions in Wikipedia
Implicit → editing
Explicit → communication
Article talk pages → discussions about how to improve articles
User talk pages → a kind of public inbox
Goal: Shed light on the emotional dimension of the interactions
extensive analysis of emotions in explicit communication
sentiment analysis of comments in article talk and personal talk
pages
Research questions
1 How are the emotional and communication styles of editors
affected by their status?
2 How are the emotional and communication styles of editors
affected by their gender?
3 How are the emotional expressions affected by interacting with
others in comment threads (emotional congruence)?
4 How are the emotional styles of editors related to those of the
editors they interact more frequently with (emotional
homophily)?
Publications
Results published in:
Laniado, D., Castillo, C., Kaltenbrunner, A., and Fuster Morell, M. F. (2012)
Emotions and dialogue in a peer-production community: the case of Wikipedia.
8th International Symposium on Wikis and Open Collaboration, WikiSym’12
Iosub, D., Laniado, D., Castillo, C., Fuster Morell, M. F., and Kaltenbrunner, A. (2014)
Emotions under Discussion: Gender, Status and Communication in Online Collaboration.
PLoS ONE, 9(8)
Dataset: conversations
Extracting conversations among editors from the English Wikipedia
Measure | Count | Share
Articles | 3 210 039 |
Articles with talk page (ATP) | 871 485 | 27.1%
Editors who comment on articles | 350 958 |
Editors with ≥ 100 comments on ATP | 12 231 | 3.5%
Total comments in ATP | 11 041 246 |
Comments containing ANEW words | 7 414 411 | 67.2%
Comments by editors with ≥ 100 comments on ATP | 5 480 544 | 49.6%
Comments by these editors and with ANEW words | 3 649 297 | 33.3%
Table: Data extracted from a complete dump of the English Wikipedia, dated
March 12th, 2010
User gender labelling
≈ 12 000 users wrote ≥ 100 comments on article talk pages
Gender identified through Wikipedia API for ≈ 2 000 of them
A sample of 1 385 users for manual labelling through
crowdsourcing (Crowdflower)
Gender | Non-admins | Admins | Total
Men | 1 087 | 1 526 | 2 613
Women | 68 | 97 | 165
Unknown | 6 850 | 2 603 | 9 453
Total | 8 005 | 4 226 | 12 231
Table: Users with ≥ 100 comments by gender and administrator status.
Gender could be identified only for ≈ 50% of users:
real name or username (50% of those identified)
implicitly stated gender (27% of women, 20% of men)
pronoun (15% of women, 10% of men)
other indicators: userboxes, pictures, links to personal blogs...
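The slide only says that gender was obtained "through Wikipedia API"; the exact call is not given. As a minimal sketch, assuming the standard MediaWiki `list=users` query with `usprop=gender` was the kind of request used, the lookup could look as follows. The usernames and the sample response are made up for illustration, and no network request is performed.

```python
import json
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"

def build_gender_query(usernames):
    """Return a MediaWiki API URL asking for the self-declared gender of each user."""
    params = {
        "action": "query",
        "list": "users",
        "ususers": "|".join(usernames),  # several users can be queried at once
        "usprop": "gender",
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)

def parse_gender_response(payload):
    """Map username -> 'male' / 'female' / 'unknown' from an API JSON response."""
    data = json.loads(payload)
    return {u["name"]: u.get("gender", "unknown") for u in data["query"]["users"]}

# Hand-written response in the shape the API returns (hypothetical usernames).
sample = (
    '{"query": {"users": ['
    '{"name": "ExampleUserA", "gender": "female"}, '
    '{"name": "ExampleUserB", "gender": "unknown"}]}}'
)
genders = parse_gender_response(sample)
```

Users who never set the preference come back as "unknown", which is consistent with the need for an additional manual labelling step.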
Measuring the Emotional Content of Discussions
Lexicon-based methods
relying on three different instruments:
Affective norms for English words (ANEW)
Linguistic Inquiry and Word Count (LIWC)
SentiStrength
Measuring the Emotional Content of Discussions
Method 1: Affective norms for English words (ANEW)
Rates a list of 1060 frequent words on a 9 point scale in three
dimensions:
Valence
Arousal
Dominance
assign emotion scores to each word from the lexicon
Bradley and Lang. (1999).
Affective norms for English words (ANEW) Technical report C-1.
The Center for Research in Psychophysiology, University of Florida, FL.
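A minimal sketch of lexicon-based scoring in the ANEW style: each comment is scored as the mean valence of the lexicon words it contains. The ratings below are hypothetical placeholders, not real ANEW values, and averaging over matched words is an assumption about the aggregation step, which the slides do not spell out.

```python
import re
from statistics import mean

# Hypothetical valence ratings on a 9-point scale (NOT the real ANEW values).
TOY_VALENCE = {"happy": 8.0, "good": 7.5, "war": 2.0, "abuse": 1.5}

def comment_valence(text, lexicon=TOY_VALENCE):
    """Mean valence of the lexicon words in a comment, or None if there are none."""
    words = re.findall(r"[a-z']+", text.lower())
    scores = [lexicon[w] for w in words if w in lexicon]
    return mean(scores) if scores else None

example = comment_valence("I'm happy, sounds good")
```

Comments without any lexicon word get no score in this sketch, which matches the idea of restricting the analysis to comments containing ANEW words.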
Measuring the Emotional Content of Discussions
Method 2: Linguistic Inquiry and Word Count (LIWC)
Two scores for basic emotion (compared with ANEW valence)
positive emotion
negative emotion
Discrete measures of emotions (anger, anxiety, sadness, affect)
Other classes of words to characterize language (e.g. personal
pronouns, tentative words, fillers...)
→ Count the proportion of words belonging to each class
Pennebaker J, Chung C, Ireland M, Gonzales A, Booth R (2010).
The development and psychometric properties of LIWC2007. Austin, TX.
Measuring the Emotional Content of Discussions
Method 2: Linguistic Inquiry and Word Count (LIWC)
Category | Dictionary size | Examples
Anger | 91 | hate, kill, annoyed
Anxiety | 84 | worried, fearful, nervous
Sadness | 101 | crying, grief, sad
Tentative | 155 | maybe, perhaps, guess
Certainty | 83 | always, never
Fillers | 9 | blah, you know
Past | 155 | went, ran, had
Present | 169 | is, does, hear
Future | 48 | will, gonna
Social words | 455 | mate, talk, child
Table: Description of LIWC measures (as per http://www.liwc.net).
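The counting step of Method 2 can be sketched as below: for each class, the share of a comment's words that belong to that class, expressed as a percentage. The dictionaries are tiny stand-ins built from the example words in the table; the real LIWC dictionaries are proprietary, far larger, and also match word stems, so this is only an illustration of the idea.

```python
import re

# Stand-in dictionaries using the example words from the table above.
TOY_CATEGORIES = {
    "anger": {"hate", "kill", "annoyed"},
    "tentative": {"maybe", "perhaps", "guess"},
    "certainty": {"always", "never"},
}

def category_proportions(text, categories=TOY_CATEGORIES):
    """Percentage of words in `text` that fall into each category."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {name: 0.0 for name in categories}
    return {
        name: 100.0 * sum(w in vocab for w in words) / len(words)
        for name, vocab in categories.items()
    }

profile = category_proportions("Maybe I never guess")
```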
Measuring Relationship-Orientation with LIWC
Definition
Communication that promotes social affiliation and emotional connection:
preoccupation with others (use of personal pronouns, e.g., I, you)
preoccupation with the larger social domain (e.g., references to friends and family)
expression of positive emotion
Examples
We are glad to have you. If I can help at all let me know :)
A-giau has smiled at you. Smiles promote WikiLove and hopefully this one has made your day better...Happy editing
Congrats! Thank you for your dedication.
Measuring the Emotional Content of Discussions
Method 3: SentiStrength
Based on LIWC and developed for short web texts
Accounts for modes of textual expression specific to the online
environment, e.g. emoticons and abbreviations
Provides a positive and a negative score for emotional valence
Emotion score is the strongest positive and negative emotion
expressed in a comment
Final scores are averages over comments in a given category
Thelwall M, Buckley K, Paltoglou G, Cai D, Kappas A (2010)
Sentiment strength detection in short informal text.
Journal of the American Society for Information Science and Technology 61: 2544 – 2558.
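The two aggregation rules on this slide (strongest positive and strongest negative emotion per comment, then averages over the comments of a category) can be sketched as follows. This is not the SentiStrength tool itself: the term strengths are hypothetical, and the sketch ignores emoticons, boosters and negation. It only assumes the usual convention that a comment with no emotional term scores 1 (positive) and -1 (negative).

```python
import re
from statistics import mean

# Hypothetical term strengths (NOT the real SentiStrength lexicon).
TOY_STRENGTH = {"nice": 2, "happy": 3, "love": 4, "hate": -4, "abuse": -3, "poor": -2}

def comment_scores(text, lexicon=TOY_STRENGTH):
    """Return (strongest positive, strongest negative) score for one comment."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = [lexicon[w] for w in words if w in lexicon]
    positive = max([s for s in hits if s > 0], default=1)
    negative = min([s for s in hits if s < 0], default=-1)
    return positive, negative

def category_average(comments):
    """Average the per-comment positive and negative scores over a set of comments."""
    scores = [comment_scores(c) for c in comments]
    return mean(p for p, _ in scores), mean(n for _, n in scores)

avg_pos, avg_neg = category_average(["nice and happy but I hate it", "plain text"])
```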
Example: results with three different emotional lexica
Table: Example messages with their valence (ANEW) and positive (+) and negative (-) scores (LIWC, SentiStrength)
Message | ANEW valence | LIWC + | LIWC - | SentiStrength + | SentiStrength -
"Sounds like a good challenge - to be proven or disproven. I’m happy if it can be shown to go further using closed cubic polynomial solutions. The nice thing about these are that they are pretty easy to test numerically . . ." (in “Exact trigonometric constants”) | 7.4 | 12.5 | 0 | 3 | -2
"Seems you have not yet seen female lover after having sex who do not wish to have sex with the same lover any more :) Once you’ve seen it, you understand very well what war of Venus means compared to war of Mars." (in “House (astrology)”) | 5.5 | 6.8 | 4.5 | 4 | -3
"What about the whirlie hazing, the alcohol abuse, the emotional poverty, the suicide in 1995/6, the biotech plans which were stopped by pitzer protests . . ." (in “Harvey Mudd College”) | 1.6 | 4 | 8 | 1 | -4
Sentiment analysis
Statistical tests
Compute average values with the three lexica for each user
Compare distribution of values for two groups of users (e.g.:
admins vs regulars, women vs men)
Most variables are not normally distributed
⇓
Mann-Whitney U-test
Compare distributions of rankings
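To make the rank-based comparison concrete, here is a minimal pure-Python sketch of the Mann-Whitney U test with the normal approximation. It assumes the test statistics reported in the result tables are z-scores of this kind, which the slides do not state explicitly; no tie or continuity correction is applied, and the per-user averages at the bottom are hypothetical.

```python
from math import sqrt, erfc

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(group_a, group_b):
    """Return (U for group_a, z, two-sided p) under the normal approximation."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u_a - mean_u) / sd_u
    p = erfc(abs(z) / sqrt(2))  # two-sided p-value from the standard normal
    return u_a, z, p

# Hypothetical per-user averages for two groups of editors.
admins = [2.1, 2.4, 2.6, 2.9]
regulars = [1.8, 2.0, 2.2, 2.5]
u, z, p = mann_whitney_u(admins, regulars)
```

For real analyses a library implementation with tie correction is preferable; the sketch only shows why the test needs nothing but the ranking of the pooled values.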
Emotions and Status
Table: Emotions and Status: Administrators promote a generally neutral tone
on article talk pages. Regular editors express more negative emotion, and
are more emotional.
Lexicon | Measure (Article Talk) | Regular | Admin | Mann-Whitney U-test | p-value
LIWC | Positive | 2.369 | 2.409 | -4.308 | p < 0.001
LIWC | Negative | 1.368 | 1.120 | -18.578 | p < 0.001
LIWC | Affect | 3.784 | 3.661 | -8.466 | p < 0.001
LIWC | Anxiety | 0.180 | 0.166 | -5.834 | p < 0.001
LIWC | Anger | 0.554 | 0.446 | -19.217 | p < 0.001
LIWC | Sadness | 0.175 | 0.166 | -4.450 | p < 0.001
SentiStrength | Positive | 1.805 | 1.774 | -14.603 | p < 0.001
SentiStrength | Negative | -2.005 | -1.912 | -23.046 | p < 0.001
In the original slide, statistically significant p-values are in bold and the larger absolute value is underlined.
Emotions and Status
Admins:
more positive emotion (ANEW and LIWC)
generally, emotionally reserved compared to regular users (LIWC)
Regular users:
more emotional
more affect, and more anxiety, anger and sadness (LIWC)
stronger positive and negative words than admins (SentiStrength)
Personal talk pages
In personal talk pages, admins are more emotional compared to the article talk pages
more positive emotion compared to regular editors, but also more anxiety and sadness
Dialogue and Status
Table: Dialogue and Status: Administrators are more impersonal in article talk
pages. Regular editors are more concerned with others.
Group | Measure (Article Talk) | Regular | Admin | Mann-Whitney U-test | p-value
Relationship-orientation | Personal pronouns | 5.135 | 4.815 | -13.561 | p < 0.001
Relationship-orientation | Use of “I” | 2.456 | 2.429 | -1.733 | p=0.083
Relationship-orientation | Use of “You” | 1.043 | 0.892 | -12.573 | p < 0.001
Relationship-orientation | Use of “Shehe” | 0.609 | 0.526 | -8.657 | p < 0.001
Relationship-orientation | Social words | 6.320 | 5.810 | -19.013 | p < 0.001
Certainty | Certainty | 1.426 | 1.317 | -16.824 | p < 0.001
Certainty | Tentativeness | 3.199 | 3.169 | -2.210 | p < 0.001
Certainty | Filler words | 0.168 | 0.155 | -6.687 | p < 0.001
Temporal orientation | Past | 2.376 | 2.305 | -5.696 | p < 0.001
Temporal orientation | Present | 8.011 | 7.841 | -8.060 | p < 0.001
Temporal orientation | Future | 1.114 | 1.166 | -9.887 | p < 0.001
In the original slide, statistically significant p-values are in bold and the larger absolute value is underlined.
Dialogue and Status
Admins
more neutral and impersonal tone
less relationship-oriented
more concerned with the future
tend to "rule with reason"
Regular users
more relationship-oriented
more personal pronouns and more social words
more concerned with the past
more insecure, but not in personal spaces
more certainty, tentative and filler words in article talk pages
Emotions and gender
ANEW words used more by women and by men
Word size reflects the difference in frequency
Emotions and gender
Women use words associated with more positive emotions
Result consistent and significant across the three lexica
ANEW: Difference is not significant when normalising by article
→ difference might be due to topic selection: women choose to
participate in topics which have more positive discussions
No significant difference in expression of negative emotions
Topics, emotions and gender
Figure: Mean valence (ANEW) for discussions of articles in different topic categories, vs the proportion of comments written by men
[Scatter plot not reproduced. x-axis: proportion of male comments (0.82 to 0.96); y-axis: mean valence (4.7 to 5.6); comments with N≥1 ANEW words; corr=−0.64 (p=0.002)]
Topic categories plotted: Computing, Arts, Philosophy, Language, Health, Mathematics, Belief, Sports, Agriculture, Environment, Techn. & app. sci., Law, Society, Business, Education, Culture, People, Science, Politics, Geography and places, History and events
Dialogue and gender
Table: Dialogue and Gender: Women use a more relationship-oriented
speech style.
Group | Measure (Article Talk) | Men | Women | Mann-Whitney U-test | p-value
Relationship-orientation | Personal pronouns | 4.964 | 5.420 | -4.375 | p < 0.001
Relationship-orientation | Use of “I” | 2.488 | 2.764 | -3.945 | p < 0.001
Relationship-orientation | Use of “You” | 0.936 | 0.957 | -0.926 | p=0.355
Relationship-orientation | Use of “Shehe” pronouns | 0.541 | 0.713 | -4.657 | p < 0.001
Relationship-orientation | Social words | 5.960 | 6.353 | -3.487 | p < 0.001
Certainty | Certainty | 1.346 (1397) | 1.300 (1263) | -2.078 | p = 0.038*
Certainty | Tentativeness | 3.150 | 3.215 | -1.162 | p=0.245
Certainty | Filler words | 0.161 | 0.160 | -0.137 | p=0.891
Temporal orientation | Past | 2.325 | 2.543 | -4.305 | p < 0.001
Temporal orientation | Present | 7.897 | 8.180 | -3.086 | p = 0.002
Temporal orientation | Future | 1.168 | 1.147 | -1.008 | p=0.314
In the original slide, statistically significant p-values are in bold and the larger absolute value is underlined.
Cases where the averages are not informative are marked with an asterisk * and show the Mann-Whitney mean ranks in parentheses next to the averages.
Dialogue and Gender
Women write longer messages
Women are more relationship-oriented
more personal pronouns, in particular “I”, more social words
Women are not more insecure
Fewer certainty words, no significant difference for tentativeness and
filler words
Women admins are more relationship-oriented than men admins
Different leadership style
Qualitative analysis: Relationship orientation
Manual classification of 100 comments
Three main types of comments high in relationship-orientation:
inviting comments that explain the edit in a friendly tone, and call for
further intervention and collaboration
common perspective-building comments that are focused on
understanding others and solving debates in a constructive manner
appreciative comments that contain positive emotions and
celebrate others’ actions
⇒ This suggests that relationship-orientation may be conducive to
successful collaboration
Emotional congruence
Comparison of each comment with the comment it replies to
not based only on our set of users, but on all comments (from all
users)
Emotions: editors tend to reply with:
more positive emotion
less negative emotion
less anger, anxiety and sadness
stronger words, both positive and negative (SentiStrength)
Dialogue: editors tend to reply with:
more relationship-oriented speech
fewer tentative and certainty words
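The comparison of each comment with the comment it replies to can be sketched as a list of reply-minus-parent differences for a given score. The thread structure and the positive-emotion scores below are hypothetical, and how the differences were aggregated or tested in the study is not stated on the slide; taking their mean is only one plausible summary.

```python
from statistics import mean

# Hypothetical thread: comment id -> (parent id or None, positive-emotion score).
comments = {
    1: (None, 2.0),
    2: (1, 3.0),
    3: (1, 2.5),
    4: (2, 3.5),
}

def reply_parent_differences(comments):
    """Score of each reply minus the score of the comment it replies to."""
    diffs = []
    for parent_id, score in comments.values():
        if parent_id is not None:
            diffs.append(score - comments[parent_id][1])
    return diffs

diffs = reply_parent_differences(comments)
mean_shift = mean(diffs)  # > 0 means replies carry more of this emotion than their parents
```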
Emotional homophily
Mixing patterns: do users interact preferentially with similar users?
Disassortativity by activity
users who write more comments tend to reply preferentially to less
active users, and vice versa
Assortativity by gender
Men interact more with other men, and women with other women
Assortativity by emotion and language
Users interact more with others similar in emotional expression and
communication style
also in the network of communication on personal talk pages
Emotional homophily
Example: homophily by expression of anger
edges connect users who have exchanged at least 10 replies
node color represents the level of anger expressed by a user, from low to high
node size → proportional to the number of connections of a user
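A minimal sketch of how such a network and its mixing pattern could be computed: keep the pairs of users with at least 10 exchanged replies, then measure assortativity as the Pearson correlation of a user attribute across the two ends of each edge. The reply counts and anger scores are hypothetical, and this correlation-based coefficient is one standard choice for scalar attributes, not necessarily the exact measure used in the study.

```python
from math import sqrt

def assortativity(edges, attribute):
    """Pearson correlation of an attribute across the two ends of each edge.

    Each undirected edge is counted in both directions, so the result is symmetric.
    Undefined (division by zero) if all attribute values are identical.
    """
    xs, ys = [], []
    for u, v in edges:
        xs += [attribute[u], attribute[v]]
        ys += [attribute[v], attribute[u]]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical reply counts between pairs of users and per-user anger scores.
reply_counts = {("a", "b"): 14, ("c", "d"): 11, ("a", "c"): 3}
anger = {"a": 0.9, "b": 0.8, "c": 0.2, "d": 0.1}

# Keep only pairs that exchanged at least 10 replies, as in the example network.
edges = [pair for pair, n in reply_counts.items() if n >= 10]
r = assortativity(edges, anger)  # positive: users with similar anger levels are linked
```

A positive coefficient indicates homophily (similar users interact), a negative one indicates disassortative mixing, as reported for activity.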
Conclusions
Administrators and experienced users play a pivotal role
they tend to interact especially with less experienced users
they promote a positive but impersonal environment
Men and women have a different communication style
women participate in discussions that have a more positive tone
men interact more with men, and women with women
women use a more emotional and relationship-oriented language
women admins have a relationship-oriented leadership style
⇒ promoting relationship-oriented leadership could lead to a more
positive environment
⇒ giving women more space in the community could result in a more
welcoming environment, for both women and men
Future work
Longitudinal analysis
how do emotional styles of editors change over time and with
increasing experience?
how do emotions in the discussions affect participation?
Qualitative analysis and human annotation
include non-textual emotional aspects such as emoticons, barnstars
and virtual gifts
deal with sarcasm, measure the extent of condescending or
paternalistic language in comments addressed at women editors
Examine other online spaces
Similar conclusions might hold for other online spaces
especially in discussions involving conflict, decision making and
power dynamics
Some references
Bradley, M. M. and Lang, P. J. (1999)
Affective norms for English words (ANEW). Technical report C-1.
The Center for Research in Psychophysiology, University of Florida, FL.
Collier, B. and Bear, J. (2012)
Conflict, criticism, or confidence: an empirical examination of the gender gap in Wikipedia contributions.
In Proc. of CSCW.
Eom, Y.H., Aragón, P., Laniado, D., Kaltenbrunner, A., Vigna, S., and Shepelyansky, D.L. (2015)
Interactions of cultures and top people of Wikipedia from ranking of 24 language editions.
PLoS ONE, 10(3), e0114825.
Iosub, D., Laniado, D., Castillo, C., Fuster Morell, M. F., and Kaltenbrunner, A. (2014)
Emotions under Discussion: Gender, Status and Communication in Online Collaboration.
PLoS ONE, 9(8).
Kucuktunc, O., Cambazoglu, B. B., Weber, I., and Ferhatosmanoglu, H. (2012)
A large-scale sentiment analysis for Yahoo! Answers.
In Proc. of WSDM.
Laniado, D., Tasso, R., Volkovich, Y., and Kaltenbrunner, A. (2011)
When the Wikipedians talk: Network and tree structure of Wikipedia discussion pages.
In Proc. of ICWSM.
Laniado, D., Castillo, C., Kaltenbrunner, A., and Fuster Morell, M. F. (2012)
Emotions and dialogue in a peer-production community: the case of Wikipedia.
8th International Symposium on Wikis and Open Collaboration, WikiSym’12.
Zhu, H., Kraut, R., and Kittur, A. (2012)
Effectiveness of shared leadership in online communities.
In Proc. of CSCW.
Questions?
Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions

  • 1. Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions David Laniado, Daniela Iosub, Carlos Castillo, Mayo Fuster Morell and Andreas Kaltenbrunner david.laniado@eurecat.org Universitat Pompeu Fabra, January 17, 2017 David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 1 / 58
  • 2. Outline 1 Introduction Wikipedia content biases Wikipedia gender gap Wikipedia discussion spaces Goal and research questions 2 Framework of analysis Data acquisition and pre-processing User gender labelling Language and sentiment analysis 3 Results Emotions and status Emotions and gender Networked emotions 4 Conclusions David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 2 / 58
  • 3. Outline 1 Introduction Wikipedia content biases Wikipedia gender gap Wikipedia discussion spaces Goal and research questions 2 Framework of analysis Data acquisition and pre-processing User gender labelling Language and sentiment analysis 3 Results Emotions and status Emotions and gender Networked emotions 4 Conclusions David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 3 / 58
  • 4. Wikipedia is a teenager Happy birthday Wikipedia! English Wikipedia is now 16 years old Catalan Wikipedia will be 16 in March (the second oldest one) David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 4 / 58
  • 5. The largest human knowledge repository Fifth most visited web site Among top results for search queries about almost any topic Largest collaborative project Conditions and reflects public opinion... with some bias David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 5 / 58
  • 6. Wikipedia’s social experiment "The problem with Wikipedia is that it only works in practice. In theory, it can never work." (Wikipedian popular joke) A crazy idea: anyone can edit All relevant points of view should be represented Policies of Notability and Neutral Point of view Quality assured by editors’ negotiations over content The more people with different points of view contributing, the better the quality → Biases in the editor community may cause biases in the content David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 6 / 58
  • 7. Outline 1 Introduction Wikipedia content biases Wikipedia gender gap Wikipedia discussion spaces Goal and research questions 2 Framework of analysis Data acquisition and pre-processing User gender labelling Language and sentiment analysis 3 Results Emotions and status Emotions and gender Networked emotions 4 Conclusions David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 7 / 58
  • 8. Bias in the content? Top global biographies by birth country (Young-Ho Eom et al, 2015) top central biographies from each of the 24 major Wikipedias the 100 most central (PageRank) in each version’s hyperlink network → striking geographic bias http://www.quantware.ups-tlse.fr/QWLIB/topwikipeople/geofigs/pagerank24x100.html David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 8 / 58
  • 9. Top Women Biographies Rank NA PageRank female figures CC Century LC 1 24 Elizabeth II UK 20 EN 2 17 Mary (mother of Jesus) IL -1 HE 3 12 Queen Victoria UK 19 EN 4 6 Elizabeth I of England UK 16 EN 5 2 Maria Theresa AT 18 DE 6 1 Benazir Bhutto PK 20 HI 7 1 Catherine the Great PL 18 PL 8 1 Anne Frank DE 20 DE 9 1 Indira Gandhi IN 20 HI 10 1 Margrethe II of Denmark DK 20 DA Top 10 global female historical figures by PageRank for the 24 major Wikipedia editions (Young-Ho Eom et al, 2015) NA → number of language editions in which a biography appears in the top 100 rank CC → birth country code LC → language code corresponding to the birth country David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 9 / 58
  • 10. Top biographies by gender Number of women among the top global biographies by birth century Number of women among the 100 most central biographies for each language edition David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 10 / 58
  • 11. Outline 1 Introduction Wikipedia content biases Wikipedia gender gap Wikipedia discussion spaces Goal and research questions 2 Framework of analysis Data acquisition and pre-processing User gender labelling Language and sentiment analysis 3 Results Emotions and status Emotions and gender Networked emotions 4 Conclusions David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 11 / 58
  • 12. Wikipedia editor gender gap Estimated women participation 2011 editor survey: 9% 2013 editor survey: 13% corrected with propensity score estimation: 16% (Mako Hill and Shaw, 2013) while in most online social networks women are more active! David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 12 / 58
  • 13. Why women do not edit Wikipedia? 1 A lack of user-friendliness in the editing interface 2 Not having enough free time 3 A lack of self-confidence 4 Aversion to conflict and an unwillingness to participate in lengthy edit wars 5 Belief that their contributions are too likely to be reverted or deleted 6 Some find its overall atmosphere misogynistic 7 Wikipedia culture is sexual in ways they find off-putting 8 Being addressed as male is off-putting to women whose primary language has grammatical gender 9 Fewer opportunities than other sites for social relationships and a welcoming tone (Sue Gardener, 2011) David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 13 / 58
  • 14. Outline 1 Introduction Wikipedia content biases Wikipedia gender gap Wikipedia discussion spaces Goal and research questions 2 Framework of analysis Data acquisition and pre-processing User gender labelling Language and sentiment analysis 3 Results Emotions and status Emotions and gender Networked emotions 4 Conclusions David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 14 / 58
  • 15. Emotional factors and discussions Importance of emotional factors Discussion spaces are fundamental to the collaborative process Discussion triggers emotions and breeds particular emotional environments David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 15 / 58
  • 16. Wikipedia’s most visible side David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 16 / 58
  • 17. Article talk pages David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 17 / 58
  • 18. Discussions in article talk pages David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 18 / 58
  • 19. Outline 1 Introduction Wikipedia content biases Wikipedia gender gap Wikipedia discussion spaces Goal and research questions 2 Framework of analysis Data acquisition and pre-processing User gender labelling Language and sentiment analysis 3 Results Emotions and status Emotions and gender Networked emotions 4 Conclusions David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 19 / 58
  • 20. Studying emotions and language in talk pages Interactions in Wikipedia Implicit → editing Explicit → communication Article talk pages → discussions about how to improve articles User talk pages → a kind of public in-boxes Goal: Shed light on the emotional dimension of the interactions extensive analysis of emotions in explicit communication sentiment analysis of comments in article talk and personal talk pages David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 20 / 58
  • 21. Research questions 1 How are the emotional and communication styles of editors affected by their status? 2 How are the emotional and communication styles of editors affected by their gender? 3 How are the emotional expressions affected by interacting with others in comment threads (emotional congruence)? 4 How are the emotional styles of editors related to those of the editors they interact more frequently with (emotional homophily)? David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 21 / 58
  • 22. Publications Results published in: Laniado, D., Castillo, C., Kaltenbrunner, A., and Fuster Morell, M. F. (2012) Emotions and dialogue in a peer-production community: the case of Wikipedia. 8th International Symposium on Wikis and Open Collaboration, WikiSym’12 Iosub, D., Laniado, D., Castillo, C., Fuster Morell, M. F., and Kaltenbrunner, A. (2014) Emotions under Discussion: Gender, Status and Communication in Online Collaboration. Plos One, 9(8) David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 22 / 58
  • 23. Outline 1 Introduction Wikipedia content biases Wikipedia gender gap Wikipedia discussion spaces Goal and research questions 2 Framework of analysis Data acquisition and pre-processing User gender labelling Language and sentiment analysis 3 Results Emotions and status Emotions and gender Networked emotions 4 Conclusions David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 23 / 58
  • 24. Outline 1 Introduction Wikipedia content biases Wikipedia gender gap Wikipedia discussion spaces Goal and research questions 2 Framework of analysis Data acquisition and pre-processing User gender labelling Language and sentiment analysis 3 Results Emotions and status Emotions and gender Networked emotions 4 Conclusions David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 24 / 58
  • 25. Dataset: conversations Extracting conversations among editors from the English Wikipedia Articles 3 210 039 Articles with talk page (ATP) 871 485 (27.1%) Editors who comment articles 350 958 Editors with ≥ 100 comments on ATP 12 231 (3.5%) Total comments in ATP 11 041 246 Comments containing ANEW words 7 414 411 (67.2%) Comments by editors with ≥ 100 comments on ATP 5 480 544 (49.6%) Comments by these editors and with ANEW words 3 649 297 (33.3%) Table: Data extracted from a complete dump of the English Wikipedia, dated March 12th, 2010 David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 25 / 58
  • 26. Outline 1 Introduction Wikipedia content biases Wikipedia gender gap Wikipedia discussion spaces Goal and research questions 2 Framework of analysis Data acquisition and pre-processing User gender labelling Language and sentiment analysis 3 Results Emotions and status Emotions and gender Networked emotions 4 Conclusions David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 26 / 58
  • 27. User gender labelling ≈ 12 000 users wrote ≥ 100 comments in articles talk pages Gender identified through Wikipedia API for ≈ 2 000 of them A sample of 1 385 users for manual labelling through crowdsourcing (Crowdflower) Non-admins Admins Total Men 1 087 1 526 2 613 Women 68 97 165 Unknown 6 850 2 603 9 453 Total 8 005 4 226 12 231 Table: Users with ≥ 100 comments by gender and administrator status. Gender could be identified only for ≈ 50% of users: real name or username (50% of those identified) implicitly stated gender (27% of women, 20% of men) pronoun (15% of women, 10% of men) other indicators: userboxes, pictures, links to personal blogs... David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 27 / 58
  • 28. Outline 1 Introduction Wikipedia content biases Wikipedia gender gap Wikipedia discussion spaces Goal and research questions 2 Framework of analysis Data acquisition and pre-processing User gender labelling Language and sentiment analysis 3 Results Emotions and status Emotions and gender Networked emotions 4 Conclusions David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 28 / 58
  • 29. Measuring the Emotional Content of Discussions Lexicon-based methods relying on three different instruments: Affective norms for English words (ANEW) Linguistic Inquiry and Word Count (LIWC) SentiStrength David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 29 / 58
  • 30. Measuring the Emotional Content of Discussions Method 1: Affective norms for English words (ANEW) Rates a list of 1060 frequent words on a 9 point scale in three dimensions: Valence Arousal Dominance assign emotion scores to each word from the lexicon Bradley and Lang. (1999). Affective norms for English words (ANEW) Technical report C-1. The Center for Research in Psychophysiology, University of Florida, FL. David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 30 / 58
  • 31. Measuring the Emotional Content of Discussions Method 2: Linguistic Inquiry and Word Count (LIWC) Two scores for basic emotion (compared with ANEW valence) positive emotion negative emotion Discrete measures of emotions (anger, anxiety, sadness, affect) Other classes of words to characterize language (i.e. personal pronouns, tentative words, fillers...) → Count the proportion of words belonging to each class Pennebaker J, Chung C, Ireland M, Gonzales A, Booth R (2010). The development and psychometric properties of LIWC2007. Austin, TX. David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 31 / 58
  • 32. Measuring the Emotional Content of Discussions
      Method 2: Linguistic Inquiry and Word Count (LIWC)

      Table: Description of LIWC measures (as per http://www.liwc.net).
        Category       Dictionary size   Examples
        Anger                       91   hate, kill, annoyed
        Anxiety                     84   worried, fearful, nervous
        Sadness                    101   crying, grief, sad
        Tentative                  155   maybe, perhaps, guess
        Certainty                   83   always, never
        Fillers                      9   blah, you know
        Past                       155   went, ran, had
        Present                    169   is, does, hear
        Future                      48   will, gonna
        Social words               455   mate, talk, child
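Counting the proportion of words per class can be illustrated with a toy dictionary built only from the example words in the table above. The real LIWC dictionary is proprietary and far larger; the category sets and the function name here are stand-ins for illustration.

```python
import re

# Toy categories using only the example words listed on the slide.
TOY_CATEGORIES = {
    "anger": {"hate", "kill", "annoyed"},
    "anxiety": {"worried", "fearful", "nervous"},
    "sadness": {"crying", "grief", "sad"},
    "tentative": {"maybe", "perhaps", "guess"},
    "certainty": {"always", "never"},
}


def category_percentages(comment, categories=TOY_CATEGORIES):
    """Return, for each category, the percentage of words in the comment that belong to it."""
    words = re.findall(r"[a-z']+", comment.lower())
    if not words:
        return {name: 0.0 for name in categories}
    return {
        name: 100.0 * sum(w in vocab for w in words) / len(words)
        for name, vocab in categories.items()
    }


print(category_percentages("I hate grief maybe"))
```

Scores are expressed as percentages of all words in the text, so longer comments are not favoured over shorter ones.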
  • 33. Measuring Relationship-Orientation with LIWC
      Definition: communication that promotes social affiliation and emotional connection:
      - preoccupation with others (use of personal pronouns, e.g., I, you)
      - preoccupation with the larger social domain (e.g., references to friends and family)
      - expression of positive emotion
      Examples:
      - "We are glad to have you. If I can help at all let me know :)"
      - "A-giau has smiled at you. Smiles promote WikiLove and hopefully this one has made your day better... Happy editing"
      - "Congrats! Thank you for your dedication."
  • 34. Measuring the Emotional Content of Discussions
      Method 3: SentiStrength
      - Based on LIWC and developed for short web texts
      - Accounts for modes of textual expression specific to the online environment, e.g. emoticons and abbreviations
      - Provides a positive and a negative score for emotional valence
      - Emotion score is the strongest positive and negative emotion expressed in a comment
      - Final scores are averages over comments in a given category
      Thelwall M, Buckley K, Paltoglou G, Cai D, Kappas A (2010). Sentiment strength detection in short informal text. Journal of the American Society for Information Science and Technology 61: 2544–2558.
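The "strongest positive and strongest negative term per comment, then average over comments" logic can be sketched as follows. This is not the SentiStrength tool itself, which applies many additional rules (emoticons, boosters, negation, spelling correction); the term strengths in `TOY_STRENGTHS` are invented, and only the 1 to 5 and -1 to -5 output ranges follow SentiStrength's convention.

```python
import re

# Hypothetical term strengths; the real SentiStrength lexicon differs.
TOY_STRENGTHS = {"love": 3, "nice": 2, "happy": 3, "hate": -4, "annoying": -3, "bad": -2}


def comment_scores(comment, strengths=TOY_STRENGTHS):
    """Return (positive, negative): the strongest positive and negative term strengths.

    A comment without positive terms scores 1; without negative terms, -1.
    """
    words = re.findall(r"[a-z']+", comment.lower())
    values = [strengths[w] for w in words if w in strengths]
    positive = max((v for v in values if v > 0), default=1)
    negative = min((v for v in values if v < 0), default=-1)
    return positive, negative


def average_scores(comments):
    """Average the per-comment positive and negative scores over a group of comments."""
    scores = [comment_scores(c) for c in comments]
    n = len(scores)
    return sum(p for p, _ in scores) / n, sum(m for _, m in scores) / n


group = ["Nice idea, but I hate the bad layout", "See talk page"]
print([comment_scores(c) for c in group])
print(average_scores(group))
```

Keeping the two scores separate lets a single comment register as both strongly positive and strongly negative, which a single valence value would average away.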
  • 35. Example: results with three different emotional lexica
      Table: Example messages with their corresponding valence (ANEW) or positive and negative scores (LIWC, SentiStrength).

      1. "Sounds like a good challenge - to be proven or disproven. I’m happy if it can be shown to go further using closed cubic polynomial solutions. The nice thing about these are that they are pretty easy to test numerically . . ." (in “Exact trigonometric constants”)
         ANEW valence: 7.4 · LIWC: positive 12.5, negative 0 · SentiStrength: positive 3, negative -2

      2. "Seems you have not yet seen female lover after having sex who do not wish to have sex with the same lover any more :) Once you’ve seen it, you understand very well what war of Venus means compared to war of Mars." (in “House (astrology)”)
         ANEW valence: 5.5 · LIWC: positive 6.8, negative 4.5 · SentiStrength: positive 4, negative -3

      3. "What about the whirlie hazing, the alcohol abuse, the emotional poverty, the suicide in 1995/6, the biotech plans which were stopped by pitzer protests . . ." (in “Harvey Mudd College”)
         ANEW valence: 1.6 · LIWC: positive 4, negative 8 · SentiStrength: positive 1, negative -4
  • 36. Sentiment analysis: statistical tests
      - Compute average values with the three lexica for each user
      - Compare the distribution of values for two groups of users (e.g. admins vs regulars, women vs men)
      - Most variables are not normally distributed
        ⇓
        Mann-Whitney U-test: compare distributions of rankings
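The rank-based comparison can be sketched in plain Python as below. This uses the normal approximation without tie correction, which is an assumption made for brevity; the slides do not state which variant or software was used, and the two input groups are invented per-user averages.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U for group_a, z-score, two-sided p-value) via the normal approximation."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)  # no tie correction
    z = (u_a - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided
    return u_a, z, p


# Invented per-user average scores for two groups.
admins = [1.0, 2.0, 3.0]
regulars = [4.0, 5.0, 6.0]
print(mann_whitney_u(admins, regulars))
```

For real analyses a library routine such as SciPy's `scipy.stats.mannwhitneyu` would be the more robust choice, since it handles ties and small samples properly.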
  • 39. Emotions and Status
      Table: Emotions and Status: Administrators promote a generally neutral tone on article talk pages. Regular editors express more negative emotion, and are more emotional.

        (Article Talk)     Regular     Admin   Mann-Whitney U-test   p-value
        LIWC
          Positive           2.369     2.409                -4.308   p < 0.001
          Negative           1.368     1.120               -18.578   p < 0.001
          Affect             3.784     3.661                -8.466   p < 0.001
          Anxiety            0.180     0.166                -5.834   p < 0.001
          Anger              0.554     0.446               -19.217   p < 0.001
          Sadness            0.175     0.166                -4.450   p < 0.001
        SentiStrength
          Positive           1.805     1.774               -14.603   p < 0.001
          Negative          -2.005    -1.912               -23.046   p < 0.001

      When the difference is statistically significant (p-value in bold), the larger absolute value is underlined.
  • 40. Emotions and Status
      Admins:
      - more positive emotion (ANEW and LIWC)
      - generally emotionally reserved compared to regular users (LIWC)
      Regular users: more emotional
      - more affect, and more anxiety, anger and sadness (LIWC)
      - stronger positive and negative words than admins (SentiStrength)
      Personal talk pages:
      - admins are more emotional than in article talk pages
      - admins express more positive emotion than regular editors, but also more anxiety and sadness
  • 41. Dialogue and Status
      Table: Dialogue and Status: Administrators are more impersonal in article talk pages. Regular editors are more concerned with others.

        (Article Talk)           Regular     Admin   Mann-Whitney U-test   p-value
        Relationship-orientation
          Personal pronouns        5.135     4.815               -13.561   p < 0.001
          Use of “I”               2.456     2.429                -1.733   p = 0.083
          Use of “You”             1.043     0.892               -12.573   p < 0.001
          Use of “Shehe”           0.609     0.526                -8.657   p < 0.001
          Social words             6.320     5.810               -19.013   p < 0.001
        Certainty
          Certainty                1.426     1.317               -16.824   p < 0.001
          Tentativeness            3.199     3.169                -2.210   p < 0.001
          Filler words             0.168     0.155                -6.687   p < 0.001
        Temporal orientation
          Past                     2.376     2.305                -5.696   p < 0.001
          Present                  8.011     7.841                -8.060   p < 0.001
          Future                   1.114     1.166                -9.887   p < 0.001

      When the difference is statistically significant (p-value in bold), the larger absolute value is underlined.
  • 42. Dialogue and Status
      Admins:
      - more neutral and impersonal tone
      - less relationship-oriented
      - more concerned with the future
      - tend to "rule with reason"
      Regular users:
      - more relationship-oriented: more personal pronouns and more social words
      - more concerned with the past
      - more insecure, but not in personal spaces: more certainty, tentative and filler words in article talk pages
  • 44. Emotions and gender
      Figure: ANEW words used more by women and by men. Word size reflects the difference in frequency.
  • 45. Emotions and gender
      - Women use words associated with more positive emotions
      - The result is consistent and significant with the three lexicons
      - ANEW: the difference is not significant when normalising by article
        → the difference might be due to topic selection: women choose to participate in topics which have more positive discussions
      - No significant difference in the expression of negative emotions
  • 46. Topics, emotions and gender
      Figure: Mean valence (ANEW) for discussions of articles in different topic categories (y-axis) vs the proportion of comments written by men (x-axis). Comments with N ≥ 1 ANEW words; corr = −0.64 (p = 0.002).
      Topic categories shown: Computing, Arts, Philosophy, Language, Health, Mathematics, Belief, Sports, Agriculture, Environment, Technology & applied sciences, Law, Society, Business, Education, Culture, People, Science, Politics, Geography and places, History and events.
  • 47. Dialogue and gender
      Table: Dialogue and Gender: Women use a more relationship-oriented speech style.

        (Article Talk)                  Men            Women   Mann-Whitney U-test   p-value
        Relationship-orientation
          Personal pronouns           4.964            5.420                -4.375   p < 0.001
          Use of “I”                  2.488            2.764                -3.945   p < 0.001
          Use of “You”                0.936            0.957                -0.926   p = 0.355
          Use of “Shehe” pronouns     0.541            0.713                -4.657   p < 0.001
          Social words                5.960            6.353                -3.487   p < 0.001
        Certainty
          Certainty            1.346 (1397)     1.300 (1263)                -2.078   p = 0.038*
          Tentativeness               3.150            3.215                -1.162   p = 0.245
          Filler words                0.161            0.160                -0.137   p = 0.891
        Temporal orientation
          Past                        2.325            2.543                -4.305   p < 0.001
          Present                     7.897            8.180                -3.086   p = 0.002
          Future                      1.168            1.147                -1.008   p = 0.314

      When the difference is statistically significant (p-value in bold), the larger absolute value is underlined. Cases where the averages are not informative are marked with an asterisk (*) and show the Mann-Whitney U-test mean ranks in parentheses next to the averages.
  • 48. Dialogue and Gender
      - Women write longer messages
      - Women are more relationship-oriented: more personal pronouns, in particular “I”, and more social words
      - Women are not more insecure: fewer certainty words, no significant difference for tentativeness and filler words
      - Women admins are more relationship-oriented than men admins: a different leadership style
  • 49. Qualitative analysis: Relationship orientation
      Manual classification of 100 comments. Three main types of comments high in relationship-orientation:
      - inviting: comments that explain the edit in a friendly tone, and call for further intervention and collaboration
      - common perspective-building: comments that are focused on understanding others and solving debates in a constructive manner
      - appreciative: comments that contain positive emotions and celebrate others’ actions
      ⇒ This suggests that relationship-orientation may be conducive to successful collaboration
  • 51. Emotional congruence
      Comparison of each comment with the comment it replies to
      - not based only on our set of users, but on all comments (from all users)
      Emotions: editors tend to reply with:
      - more positive emotion
      - less negative emotion
      - less anger, anxiety and sadness
      - stronger words, both positive and negative (SentiStrength)
      Dialogue: editors tend to reply with:
      - more relationship-oriented speech
      - fewer tentative and certainty words
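One simple way to compare each comment with the comment it replies to is to average the score difference over all reply pairs. The slides do not describe the exact procedure, so this is only an illustrative sketch; the record layout (`id`, `parent`, and a score field) and the values are hypothetical.

```python
def mean_reply_shift(comments, score_key):
    """Average (reply score - parent score) over all comments that reply to a known comment.

    Returns None when there are no reply pairs.
    """
    by_id = {c["id"]: c for c in comments}
    diffs = [
        c[score_key] - by_id[c["parent"]][score_key]
        for c in comments
        if c.get("parent") in by_id
    ]
    return sum(diffs) / len(diffs) if diffs else None


# Hypothetical thread: comment 2 replies to 1, comment 3 replies to 2.
thread = [
    {"id": 1, "parent": None, "positive": 2.0},
    {"id": 2, "parent": 1, "positive": 3.0},
    {"id": 3, "parent": 2, "positive": 3.5},
]
print(mean_reply_shift(thread, "positive"))
```

A positive result means replies score higher on that measure than the comments they answer, on average.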
  • 52. Emotional homophily
      Mixing patterns: do users interact preferentially with similar users?
      - Disassortativity by activity: users who write more comments tend to reply preferentially to less active users, and vice versa
      - Assortativity by gender: men interact more with other men, and women with other women
      - Assortativity by emotion and language: users interact more with others who are similar in emotional expression and communication style
        - also in the network of communication on personal talk pages
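Mixing by a categorical attribute such as gender is commonly quantified with Newman's assortativity coefficient, sketched below in plain Python. The slides do not name the measure that was used, so treating it as Newman's coefficient is an assumption, and the tiny reply network is invented.

```python
from collections import Counter


def attribute_assortativity(edges, attribute):
    """Newman's assortativity coefficient for a categorical node attribute.

    Each undirected edge is counted in both directions. The result is 1 for
    perfectly assortative mixing and negative for disassortative mixing.
    """
    pair_counts = Counter()
    for u, v in edges:
        pair_counts[(attribute[u], attribute[v])] += 1
        pair_counts[(attribute[v], attribute[u])] += 1
    total = sum(pair_counts.values())
    categories = {c for pair in pair_counts for c in pair}
    # Fraction of edge ends that connect two nodes of the same category.
    same = sum(pair_counts[(c, c)] for c in categories) / total
    # Fraction expected if edge ends were paired at random.
    expected = 0.0
    for c in categories:
        a = sum(n for (src, _), n in pair_counts.items() if src == c) / total
        b = sum(n for (_, dst), n in pair_counts.items() if dst == c) / total
        expected += a * b
    if expected == 1.0:
        return float("nan")  # only one category present: coefficient undefined
    return (same - expected) / (1 - expected)


# Invented reply network: two men and two women.
gender = {"a": "m", "b": "m", "c": "f", "d": "f"}
print(attribute_assortativity([("a", "b"), ("c", "d")], gender))
print(attribute_assortativity([("a", "c"), ("b", "d")], gender))
```

For continuous attributes such as a user's average anger score, a correlation between the values at the two ends of each edge plays the same role.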
  • 53. Emotional homophily
      Example: homophily by expression of anger (network figure)
      - edges connect users who have exchanged at least 10 replies
      - node color represents the level of anger expressed by a user, from low to high
      - node size is proportional to the number of connections of a user
  • 55. Conclusions
      Administrators and experienced users play a pivotal role:
      - they tend to interact especially with less experienced users
      - they promote a positive but impersonal environment
      Men and women have a different communication style:
      - women participate in discussions that have a more positive tone
      - men interact more with men, and women with women
      - women use a more emotional and relationship-oriented language
      - women admins have a relationship-oriented leadership style
      ⇒ promoting relationship-oriented leadership could lead to a more positive environment
      ⇒ giving women more space in the community could result in a more welcoming environment, for both women and men
  • 56. Future work
      Longitudinal analysis:
      - how do emotional styles of editors change over time and with increasing experience?
      - how do emotions in the discussions affect participation?
      Qualitative analysis and human annotation:
      - include non-textual emotional aspects such as emoticons, barnstars and virtual gifts
      - deal with sarcasm, measure the extent of condescending or paternalistic language in comments addressed to women editors
      Examine other online spaces:
      - similar conclusions might hold for other online spaces
      - especially in discussions involving conflict, decision making and power dynamics
  • 57. Some references
      - M. M. Bradley and P. J. Lang. Affective norms for English words (ANEW). Technical report C-1, The Center for Research in Psychophysiology, University of Florida, FL, 1999.
      - B. Collier and J. Bear. Conflict, criticism, or confidence: an empirical examination of the gender gap in Wikipedia contributions. In Proc. of CSCW, 2012.
      - Y. H. Eom, P. Aragón, D. Laniado, A. Kaltenbrunner, S. Vigna, and D. L. Shepelyansky. Interactions of cultures and top people of Wikipedia from ranking of 24 language editions. PLoS ONE 10(3), e0114825, 2015.
      - D. Iosub, D. Laniado, C. Castillo, M. Fuster Morell, and A. Kaltenbrunner. Emotions under discussion: Gender, status and communication in online collaboration. PLoS ONE 9(8), 2014.
      - O. Kucuktunc, B. B. Cambazoglu, I. Weber, and H. Ferhatosmanoglu. A large-scale sentiment analysis for Yahoo! Answers. In Proc. of WSDM, 2012.
      - D. Laniado, R. Tasso, Y. Volkovich, and A. Kaltenbrunner. When the Wikipedians talk: Network and tree structure of Wikipedia discussion pages. In Proc. of ICWSM, 2011.
      - D. Laniado, C. Castillo, A. Kaltenbrunner, and M. Fuster Morell. Emotions and dialogue in a peer-production community: the case of Wikipedia. In 8th International Symposium on Wikis and Open Collaboration (WikiSym’12), 2012.
      - H. Zhu, R. Kraut, and A. Kittur. Effectiveness of shared leadership in online communities. In Proc. of CSCW, 2012.
  • 58. Questions?