Slides presented at UPF:
https://www.upf.edu/web/mdm-dtic/gender-and-wikipedia_2017
https://www.upf.edu/en/web/guest/home/-/asset_publisher/UI8Z8VAxU47P/content/id/7282941/maximized#.WIjEE2dA-Ba
The presentation focuses on two studies that investigate differences in the language used by men and women on Wikipedia talk pages. Automatic message analysis reveals that women participate more in discussions that have a positive tone, and use language that promotes relationships and emotional connection more than men do. We also observe a gender difference in leadership style: while male administrators tend to maintain an impersonal tone compared to other users, female administrators are indistinguishable from other women, and use markedly emotional and relationship-oriented language. The results suggest the importance of communication style in addressing the gender gap in online collaboration platforms, and in fostering more welcoming environments capable of attracting and retaining users.
Abstract: Identity is the way we present ourselves as we want others to perceive us. How people construct their identities has long been an important concern, because identity is an important aspect of lifestyle. Language is closely related to identity, and its role in maintaining identity is evident in many renowned works. The present study investigated the role of language in constructing ethnic identity; interpretation of the data revealed the need for, and importance of, language in maintaining identity.
Keywords: Identity, maintenance, language, ethnic group, researchers.
The Case for Teaching Mandarin Chinese in HISD (Harvin Moore)
The document makes the case for teaching Mandarin Chinese in the Houston Independent School District. It summarizes research showing that starting world language education early (K-5) is common practice in most industrialized countries. Only a small number of US states require world language education, and even fewer require it at the elementary level. Studies also show that learning a second language improves academic performance in other subjects and helps close achievement gaps. Furthermore, foreign language ability is important for national interests like security, economic competitiveness, and an informed citizenry. As China is projected to be a major future trade partner, learning Mandarin could position Houston students and the city well economically.
This document provides Pinfan Zhu's curriculum vitae. It outlines her educational background, including degrees earned from Texas Tech University, Kunming University of Science and Technology, and Guangxi Normal University. It also details her professional experience, including positions held at Texas State University-San Marcos, Texas Tech University, and institutions in China. The CV highlights Zhu's teaching experience, scholarly publications, conference presentations, translations, honors and awards.
A 45-minute presentation on the design process of a CHI Design Competition submission, given to a GE audience. There were 33 people in attendance and 80 viewing via Cisco WebEx.
This document is a curriculum vitae for Jeffrey R. Tharsen. It summarizes his educational and professional background. He received a Ph.D. in East Asian Languages and Civilizations from the University of Chicago, with a dissertation on Chinese phonetic patterns and literary artistry. He currently works as a Digital Humanities Research Specialist and Computing Consultant at the University of Chicago Research Computing Center. Prior experience includes teaching positions at various universities and translation work. He has published papers on topics related to Chinese linguistics and literature and developed several digital tools to support humanities research.
Discovering Culture in Social Media and a Brief Case of Collective Memory (Ruth Garcia Gavilanes)
'Discovering Cultural Trails in Social Media and Collective Memory in Wikipedia', with speaker Dr Ruth García-Gavilanes, Oxford Internet Institute.
As a computational social science researcher, Ruth is interested in understanding online footprints, utilizing/developing computational methods and leveraging big data. In this seminar Ruth will present two case studies in this field: a) a study of how one’s action on Twitter (e.g., deciding when to post messages) is linked to one’s culture (e.g., country’s Pace of Life) and b) a case study of how collective memories can be measured using Wikipedia articles related to aircraft incidents and accidents.
Visualising Wikipedia Controversies: a look inside Contropedia (David Laniado)
This document introduces Contropedia, a tool that visualizes controversies within Wikipedia articles. It analyzes edit histories to identify the most disputed concepts, locations within articles where controversies are concentrated, and the timeline of disputes. Contropedia represents this data through layer views, dashboards, and detailed edit histories to increase transparency and foster understanding of debates. It aims to help researchers, Wikipedians, teachers and citizens better comprehend controversial topics and participation on Wikipedia.
Emotions under Discussion: Gender, Status and Communication in Wikipedia (David Laniado)
I present a large-scale analysis of emotional expression and communication style of editors in Wikipedia discussions. The presentation focuses especially on how emotion and dialogue differ depending on the status, gender, and communication network of the roughly 12,000 editors who have written at least 100 comments on the English Wikipedia's article talk pages. The analysis is based on three different predefined lexicon-based methods for quantifying emotions: ANEW, LIWC and SentiStrength. The results unveil significant differences in the emotional expression and communication style of editors according to their status and gender, and can help to address issues such as the gender gap and the decline in the number of editors.
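As a rough illustration of what a lexicon-based method does, the sketch below scores each comment by averaging the ratings of the words it shares with a word list, then averages per group of editors. The lexicon, its words, and its scores are hypothetical placeholders, not values from ANEW, LIWC or SentiStrength, and the function names are invented for this example; the actual tools use their own word lists and scoring rules.

```python
import re
from statistics import mean

# Hypothetical toy lexicon: word -> valence on a 1-9 scale.
# Words and scores are illustrative only, not taken from any real lexicon.
TOY_VALENCE = {
    "thanks": 8.0,
    "great": 8.0,
    "good": 7.0,
    "wrong": 3.0,
    "vandalism": 2.0,
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def valence_score(comment, lexicon=TOY_VALENCE):
    """Mean valence of the lexicon words found in a comment.

    Returns None when no word of the comment is in the lexicon.
    """
    scores = [lexicon[token] for token in tokenize(comment) if token in lexicon]
    return mean(scores) if scores else None


def mean_valence_by_group(comments, lexicon=TOY_VALENCE):
    """Average comment-level valence per group.

    `comments` is an iterable of (group, text) pairs, where group could be
    a status or gender label. Comments with no lexicon match are skipped.
    """
    per_group = {}
    for group, text in comments:
        score = valence_score(text, lexicon)
        if score is not None:
            per_group.setdefault(group, []).append(score)
    return {group: mean(scores) for group, scores in per_group.items()}
```

Calling `mean_valence_by_group` on a list of `(group, text)` pairs gives one average per group, which is the kind of quantity that can then be compared across status or gender.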
This document discusses two methods for learning a foreign language: Livemocha, an online language learning community, and face-to-face interaction. It proposes researching these methods by having students in a Spanish class interact online with Livemocha or in person with native speakers over one semester. The study aims to understand which method provides more motivation, cultural awareness, vocabulary gains, and is preferred by students by analyzing chat logs, surveys, and pre/post vocabulary tests.
Visualizing social interactions in Wikipedia - WikiCorp 2018 (David Laniado)
This document describes research on visualizing social interactions and controversies on Wikipedia. Key points:
- Researchers developed Contropedia, a tool to analyze controversial elements within Wikipedia articles over time using edit histories. It identifies controversial topics, when they were most disputed, and perspectives from different language versions.
- Controversiality is measured by counting disagreeing edits involving specific topics. The tool represents discussions as trees and networks to analyze complexity.
- Analysis found political interactions on articles are neutral, while personal talk pages show homophily. Women express more positive emotions and are more relationship-oriented in discussions.
- Contropedia aims to increase transparency on knowledge negotiations and perspectives behind published content on Wikipedia.
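The counting idea in the key points above can be sketched as follows. This is a minimal illustration only: the `Edit` record, its `topics` and `disagreeing` fields, and the `controversy_counts` function are hypothetical names, and the sketch assumes that each edit has already been labelled as disagreeing or not, which the summary does not explain how to do.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    """Hypothetical record for one edit in an article's history."""
    topics: frozenset   # topics (e.g. linked concepts) touched by the edit
    disagreeing: bool   # assumed to be labelled beforehand


def controversy_counts(edits):
    """Count, for each topic, how many disagreeing edits involve it."""
    counts = Counter()
    for edit in edits:
        if edit.disagreeing:
            counts.update(edit.topics)
    return counts


history = [
    Edit(frozenset({"topic A", "topic B"}), True),
    Edit(frozenset({"topic A"}), True),
    Edit(frozenset({"topic B"}), False),
]
ranking = controversy_counts(history).most_common()
```

Here `ranking` lists topics from most to least disputed, so in this toy history "topic A" comes first with two disagreeing edits.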
Director Lee Rainie describes how libraries can be actors in building and participating in social networks through their use of social media such as Facebook, Twitter, and blogging and through delivering their time-tested — and trusted — services to their patrons. More: http://pewinternet.org/Presentations/2011/May/San-Francisco-Public-Library.aspx
This document lists 15 articles that have been approved for publication in 2012 issues of several journals published by the National Forum Journals. The articles cover topics such as encouraging girls in science, homeland security programs, male sexual addiction, theories of learning, library accreditation, English language learning programs, bilingual education perceptions, synthetic marijuana, effects of language programs, influences on college student persistence, mobile classrooms in higher education, retention rates at 2-year colleges, developmental mathematics courses, cyber bullying prevention, and missions among the Kafe people in Papua New Guinea. The document also provides brief background information on National Forum Journals, which was founded in 1982 and publishes several peer-reviewed professional journals.
When participating online, individuals draw on the limited cues they have available to create for themselves an imagined audience (Litt, 2012). Such audiences shape users’ social media practices, and thus the expression of identity online (Marwick & boyd, 2011). In this research we posed the following questions: (1) how do scholars conceptualize their audiences when participating on social media, and (2) how does that conceptualization impact their self-expression online? By answering these questions, we aim to provide a more nuanced picture of scholars’ social media practices and experiences. The audiences imagined by the scholars we interviewed appear to be well defined rather than the nebulous constructions often described in previous studies (e.g. Brake, 2012; Vitak, 2012). While scholars indicated that some audiences were unknown, none noted that their audience was unfamiliar. This study also shows that a misalignment exists between the audiences that scholars imagine encountering online and the audiences that higher education institutions imagine their scholars encountering online.
Mona Diab: Computational Modeling of Sociopragmatic Language Use in Arabic an... (Sina Institute)
The document discusses computational modeling of sociopragmatic language use in Arabic and English social media. It outlines several interesting sociolinguistic phenomena that can be observed in social media texts, such as the existence of multiple viewpoints, influencers, and disputed topics. The approach involves identifying relevant language uses in the text and correlating them with linguistic constructions to gain an understanding of social constructs and relations.
The document provides an overview of Web 2.0 and social media. It discusses the evolution from Web 1.0 to Web 2.0, how users have shifted from passive consumers to active participants and creators of content. It also analyzes different types of social media users and their level of participation, from creators and critics to collectors, joiners and spectators. Additionally, it questions whether Microsoft can be considered Web 2.0 based on some of their strategies and services.
This document provides an agenda for teaching nonfiction in a digital space. It discusses defining information and digital literacy. It also covers reading and writing on the web and lesson planning. Several sections analyze data on internet usage among youth, families, and low-income groups to demonstrate that students are engaged with digital media and online content. It also outlines conventions and elements of online text for students to analyze. Finally, it provides examples of digital tools and websites to incorporate new media literacy into lessons.
Open and Participatory Environments in Language Learning (Barbara Dieu)
This document summarizes a presentation given by Barbara Dieu and Patricia Glogowski at the 49th TESOL Conference in New York on open participatory media environments for language learning. It discusses their teaching contexts, how they have incorporated social tools and platforms into their curriculum, their perceptions of open and participatory online environments, and challenges of using these tools for language learning.
The document outlines the schedule and goals for a Project Look Sharp summer institute on media literacy education. The schedule includes curriculum integration sessions in the morning and media production workshops in the afternoon. The goals are to introduce participants to media literacy theory and practice and help them develop media literacy lesson plans to use in their own educational contexts.
This document provides an overview of evaluating pro and con arguments for whether social networking sites are good for society. It discusses evaluating evidence and arguments, considering key questions to guide the review of the issue, and analyzing frames to assess the strongest case for each position. Sample pro arguments are provided, including that social media spreads information faster, helps students and businesswomen, facilitates relationships and political change, and empowers individuals.
This document summarizes a presentation on using online tools to improve reading comprehension. It discusses how tools like electronic books, podcasts, VoiceThread, Wordia, and social networking can support reading instruction before, during and after reading. Specific comprehension strategies are outlined that each tool supports, such as developing vocabulary, building fluency, encouraging predictions, and allowing students to summarize and synthesize what they've read. The presentation concludes by announcing an upcoming seminar on visual literacy.
Going Social: What You Need to Know to Launch a Social Media Strategy (Jim Rattray)
1) To launch a successful social media strategy, you need to understand your audience, choose appropriate channels, and establish your voice and message.
2) It is also important to consider privacy, confidentiality, engaging your audience, and measuring your results.
3) The future of social media includes greater patient engagement through tools like electronic medical records and patient portals, as well as changes driven by mobile access.
Presentation by Svetlana Dembovskaya, Loyola University Chicago, and Liudmila Klimanova, University of Iowa, at the Language Symposium 2012, hosted at the University of Illinois at Chicago (UIC).
As integration of Internet-based social networking sites (SNSs) becomes increasingly popular in foreign language classrooms, the use of SNSs is particularly critical in the context of teaching less commonly taught languages, where students' exposure to the target language and its users is usually limited or even minimal. A foreign language educator, however, should be cautioned against the seemingly culturally unbiased nature of social networking environments. Recent studies show that, in online community spaces, cultural values and norms are established using methods similar to those of offline communities (see, for example, Hanna & de Nooy, 2003, 2009; Pasfield-Neofitou, 2011). We designed a project spanning two semesters that brought a rich and authentic target language social networking community, VKontakte, into Russian beginning and intermediate college-level classes. At the same time, we provided continuous structured guidance and regular opportunities for American students to reflect individually and in groups on their emerging insights into culturally determined uniqueness of the VKontakte online community. The students created their own profile pages, worked with students in partner universities in Russia and the Ukraine to complete weekly communicative tasks in Russian, and participated in online discussion forums. Analysis of students' weekly reflections and interactions with keypals appears to show that, over the course of the project, students developed more sensitivity to culturally salient features of the Russia-based social-networking community. Yet, the instructor's guidance was instrumental in developing culturally appropriate interpretation of Russian online culture. In conclusion, we will discuss the rewards and challenges of integrating social networking projects into foreign language classroom instruction.
21st Century Skills: What do Adult Educators Need to Know? (Marian Thacher)
This document discusses how 21st century skills have changed and what adult learners need to know to thrive. It focuses on how technology is changing reading, communication, and education. Key points discussed include how digital textbooks and eBooks are becoming more common, how social media like Facebook and Twitter can be used for learning, and how smartphones are increasingly how people access the internet. Skills like creativity, collaboration, and digital literacy are emphasized as important for both employment and further education.
Wikipedia Cultural Diversity Dataset - ICWSM 2019 (David Laniado)
In this paper we present the Wikipedia Cultural Diversity dataset. For each existing Wikipedia language edition, the dataset contains a classification of the articles that represent its associated cultural context, i.e. all concepts and entities related to the language and to the territories where it is spoken. We describe the methodology we employed to classify articles, and the rich set of features that we defined to feed the classifier, and that are released as part of the dataset. We present several purposes for which we envision the use of this dataset, including detecting, measuring and countering content gaps in the Wikipedia project, and encouraging cross-cultural research in the field of digital humanities.
Emotions under Discussion: Gender, Status and Communication in WikipediaDavid Laniado
I present a large-scale analysis of emotional expression and communication style of editors in Wikipedia discussions. The presentation focuses especially on how emotion and dialogue differ depending on the status, gender, and the communication network of the about 12000 editors who have written at least 100 comments on the English Wikipedia's article talk pages. The analysis is based on three different predefined lexicon-based methods for quantifying emotions: ANEW, LIWC and SentiStrength. The results unveil significant differences in the emotional expression and communication style of editors according to their status and gender, and can help to address issues such as gender gap and editors' decline.
Abstract: Identity means to display ourselves how we want others to perceive us. How people construct their identities has been an important concern. Because, identity is an important mode of lifestyle. Language has been in close relationship with identity. Role of language in maintenance of identity has been obvious in many renowned works. The Present study investigated the role of language in constructing ethnic identity and data interpretation revealed the need and importance of language for maintenance of identity.
This document discusses two methods for learning a foreign language: Livemocha, an online language learning community, and face-to-face interaction. It proposes researching these methods by having students in a Spanish class interact online with Livemocha or in person with native speakers over one semester. The study aims to understand which method provides more motivation, cultural awareness, vocabulary gains, and is preferred by students by analyzing chat logs, surveys, and pre/post vocabulary tests.
Visualizing social interactions in Wikipedia - WikiCorp 2018David Laniado
This document describes research on visualizing social interactions and controversies on Wikipedia. Key points:
- Researchers developed Contropedia, a tool to analyze controversial elements within Wikipedia articles over time using edit histories. It identifies controversial topics, when they were most disputed, and perspectives from different language versions.
- Controversiality is measured by counting disagreeing edits involving specific topics. The tool represents discussions as trees and networks to analyze complexity.
- Analysis found political interactions on articles are neutral, while personal talk pages show homophily. Women express more positive emotions and are more relationship-oriented in discussions.
- Contropedia aims to increase transparency on knowledge negotiations and perspectives behind published content on Wikipedia.
Director Lee Rainie describes how libraries can be actors in building and participating in social networks through their use of social media such as Facebook, Twitter, and blogging and through delivering their time-tested — and trusted — services to their patrons. More: http://pewinternet.org/Presentations/2011/May/San-Francisco-Public-Library.aspx
This document lists 15 articles that have been approved for publication in 2012 issues of several journals published by the National Forum Journals. The articles cover topics such as encouraging girls in science, homeland security programs, male sexual addiction, theories of learning, library accreditation, English language learning programs, bilingual education perceptions, synthetic marijuana, effects of language programs, influences on college student persistence, mobile classrooms in higher education, retention rates at 2-year colleges, developmental mathematics courses, cyber bullying prevention, and missions among the Kafe people in Papua New Guinea. The document also provides brief background information on National Forum Journals, which was founded in 1982 and publishes several peer-reviewed professional journals.
When participating online, individuals draw on the limited cues they have available to create for themselves an imagined audience (Litt, 2012). Such audiences shape users’ social media practices, and thus the expression of identity online (Marwick & boyd, 2011). In this research we posed the following questions: (1) how do scholars conceptualize their audiences when participating on social media, and (2) how does that conceptualization impact their self-expression online? By answering these questions, we aim to provide a more nuanced picture of scholars’ social media practices and experiences. The audiences imagined by the scholars we interviewed appear to be well defined rather than the nebulous constructions often described in previous studies (e.g. Brake, 2012; Vitak, 2012). While scholar indicated that some audiences were unknown, none noted that their audience was unfamiliar. This study also shows that a misalignment exists between the audiences that scholars imagine encountering online and the audiences that higher education institutions imagine their scholars encountering online.
Mona Diab: Computational Modeling of Sociopragmatic Language Use in Arabic an...Sina Institute
The document discusses computational modeling of sociopragmatic language use in Arabic and English social media. It outlines several interesting sociolinguistic phenomena that can be observed in social media texts, such as the existence of multiple viewpoints, influencers, and disputed topics. The approach involves identifying relevant language uses in the text and correlating them with linguistic constructions to gain an understanding of social constructs and relations.
The document provides an overview of Web 2.0 and social media. It discusses the evolution from Web 1.0 to Web 2.0, how users have shifted from passive consumers to active participants and creators of content. It also analyzes different types of social media users and their level of participation, from creators and critics to collectors, joiners and spectators. Additionally, it questions whether Microsoft can be considered Web 2.0 based on some of their strategies and services.
This document provides an agenda for teaching nonfiction in a digital space. It discusses defining information and digital literacy. It also covers reading and writing on the web and lesson planning. Several sections analyze data on internet usage among youth, families, and low-income groups to demonstrate that students are engaged with digital media and online content. It also outlines conventions and elements of online text for students to analyze. Finally, it provides examples of digital tools and websites to incorporate new media literacy into lessons.
Open and Participatory Environments in Language LearningBarbara Dieu
This document summarizes a presentation given by Barbara Dieu and Patricia Glogowski at the 49th TESOL Conference in New York on open participatory media environments for language learning. It discusses their teaching contexts, how they have incorporated social tools and platforms into their curriculum, their perceptions of open and participatory online environments, and challenges of using these tools for language learning.
The document outlines the schedule and goals for a Project Look Sharp summer institute on media literacy education. The schedule includes curriculum integration sessions in the morning and media production workshops in the afternoon. The goals are to introduce participants to media literacy theory and practice and help them develop media literacy lesson plans to use in their own educational contexts.
This document provides an overview of evaluating pro and con arguments for whether social networking sites are good for society. It discusses evaluating evidence and arguments, considering key questions to guide the review of the issue, and analyzing frames to assess the strongest case for each position. Sample pro arguments are provided, including that social media spreads information faster, helps students and businesswomen, facilitates relationships and political change, and empowers individuals.
This document summarizes a presentation on using online tools to improve reading comprehension. It discusses how tools like electronic books, podcasts, VoiceThread, Wordia, and social networking can support reading instruction before, during and after reading. Specific comprehension strategies are outlined that each tool supports, such as developing vocabulary, building fluency, encouraging predictions, and allowing students to summarize and synthesize what they've read. The presentation concludes by announcing an upcoming seminar on visual literacy.
Going Social: What You Need to Know to Launch a Social Media StrategyJim Rattray
1) To launch a successful social media strategy, you need to understand your audience, choose appropriate channels, and establish your voice and message.
2) It is also important to consider privacy, confidentiality, engaging your audience, and measuring your results.
3) The future of social media includes greater patient engagement through tools like electronic medical records and patient portals, as well as changes driven by mobile access.
Presentation by Svetlana Dembovskaya, Loyola University Chicago, and Liudmila Klimanova, University of Iowa, at the Language Symposium 2012, hosted at the University of Illinois at Chicago (UIC).
As integration of Internet-based social networking sites (SNSs) becomes increasingly popular in foreign language classrooms, the use of SNSs is particularly critical in the context of teaching less commonly taught languages, where students' exposure to the target language and its users is usually limited or even minimal. A foreign language educator, however, should be cautioned against the seemingly culturally unbiased nature of social networking environments. Recent studies show that, in online community spaces, cultural values and norms are established using methods similar to those of offline communities (see, for example, Hanna & de Nooy, 2003, 2009; Pasfield-Neofitou, 2011). We designed a project spanning two semesters that brought a rich and authentic target language social networking community, VKontakte, into Russian beginning and intermediate college-level classes. At the same time, we provided continuous structured guidance and regular opportunities for American students to reflect individually and in groups on their emerging insights into culturally determined uniqueness of the VKontakte online community. The students created their own profile pages, worked with students in partner universities in Russia and the Ukraine to complete weekly communicative tasks in Russian, and participated in online discussion forums. Analysis of students' weekly reflections and interactions with keypals appears to show that, over the course of the project, students developed more sensitivity to culturally salient features of the Russia-based social-networking community. Yet, the instructor's guidance was instrumental in developing culturally appropriate interpretation of Russian online culture. In conclusion, we will discuss the rewards and challenges of integrating social networking projects into foreign language classroom instruction.
21st Century Skills: What do Adult Educators Need to Know? | Marian Thacher
This document discusses how 21st century skills have changed and what adult learners need to know to thrive. It focuses on how technology is changing reading, communication, and education. Key points discussed include how digital textbooks and eBooks are becoming more common, how social media like Facebook and Twitter can be used for learning, and how smartphones are increasingly how people access the internet. Skills like creativity, collaboration, and digital literacy are emphasized as important for both employment and further education.
Similar to Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions (20)
Wikipedia Cultural Diversity Dataset - ICWSM 2019 | David Laniado
In this paper we present the Wikipedia Cultural Diversity dataset. For each existing Wikipedia language edition, the dataset contains a classification of the articles that represent its associated cultural context, i.e. all concepts and entities related to the language and to the territories where it is spoken. We describe the methodology we employed to classify articles, and the rich set of features that we defined to feed the classifier, and that are released as part of the dataset. We present several purposes for which we envision the use of this dataset, including detecting, measuring and countering content gaps in the Wikipedia project, and encouraging cross-cultural research in the field of digital humanities.
Contropedia: Critical learning through Wikipedia's edit history | David Laniado
Presentation at the Euroclio Annual Conference "Mediterranean Dialogues" in Marseille, France, April 24, 2018
Wikipedia is not only the largest and most popular encyclopedia, it is also one of the largest collaborative platforms that involves a worldwide community spread over more than 200 different language editions. Its articles are not static pieces of knowledge, but can be edited (almost) anytime by anyone.
The value of Wikipedia content is guaranteed less by absence of errors than by their constant "improvability". Wikipedia’s core principle, "neutral point of view" (NPOV), allows editors with different viewpoints to correct each other by rewriting an article so that all significant viewpoints are represented with due weight. The quality of Wikipedia, in other words, is made possible by the struggle over its content.
Such conflicts over content often also reflect societal debates on the corresponding topics, although they are difficult to inspect through Wikipedia's interface. Contropedia provides a visual interface that makes this information easily accessible and allows users to identify the elements that aroused the most dispute and activity, as well as the topical development of an article. As the tool is language-agnostic, it can be applied to any language edition, and allows for cross-cultural comparisons of viewpoints and societal debates. A demo of the tool is available at: http://contropedia.net/demo/
Contropedia can help history teachers to foster critical thinking by exposing knowledge as a collective construction, as the fruit of confrontation among different points of view that may vary across cultures and over time, rather than as something absolute and immutable.
Gender patterns on a large social network (SocInfo 2014) | David Laniado
This document analyzes gender patterns in a large online social network with over 10 million users. It finds that both men and women exhibit homophily or a preference for same-gender connections, though this tendency is stronger for women, especially in the early stages of joining the network. Both genders' friend networks and interactions tend to form more single-gender triangles than would be expected by chance. However, users with many friends show a tendency toward heterophily or connecting with other genders. The findings suggest women perceive the presence of other women as important for entering a new online social space, which could explain challenges in addressing the gender gap.
Dinámicas de Discusión en Red: Conflicto, Deliberación, Consenso y Roles | David Laniado
Presentation at the UOC on future lines of research for studying the 15M movement through networked conversations. http://datanalysis15m.wordpress.com/
Emotions and dialogue in a peer-production community: the case of Wikipedia | David Laniado
Slides presented at WikiSym 2012.
This paper presents a large-scale analysis of emotions in conversations among Wikipedia editors. Our focus is on the emotions expressed by editors in talk pages, measured by using the Affective Norms for English Words (ANEW).
We find evidence that to a large extent women tend to participate in discussions with a more positive tone, and that administrators are more positive than non-administrators. Surprisingly, female non-administrators tend to behave like administrators in many aspects.
We observe that replies are on average more positive than the comments they reply to, preventing many discussions from spiralling down into conflict. We also find evidence of emotional homophily: editors having similar emotional styles are more likely to interact with each other.
Our findings offer novel insights into the emotional dimension of interactions in peer-production communities, and contribute to debates on issues such as the flattening of editor growth and the gender gap.
When the Wikipedians talk: network and tree structure of Wikipedia discussion... | David Laniado
Talk pages play a fundamental role in Wikipedia as the place for discussion and communication. In this work we use the comments on these pages to extract and study three networks, corresponding to different kinds of interactions. We find evidence of a specific assortativity profile which differentiates article discussions from personal conversations. An analysis of the tree structure of the article talk pages makes it possible to capture patterns of interaction, and reveals structural differences among the discussions about articles from different semantic areas.
Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions
1. Gender Gap in Collaborative Platforms:
Language and emotions in Wikipedia Discussions
David Laniado, Daniela Iosub, Carlos Castillo,
Mayo Fuster Morell and Andreas Kaltenbrunner
david.laniado@eurecat.org
Universitat Pompeu Fabra, January 17, 2017
David Laniado @sdivad Gender Gap in Collaborative Platforms: Language and emotions in Wikipedia Discussions 1 / 58
2. Outline
1 Introduction
Wikipedia content biases
Wikipedia gender gap
Wikipedia discussion spaces
Goal and research questions
2 Framework of analysis
Data acquisition and pre-processing
User gender labelling
Language and sentiment analysis
3 Results
Emotions and status
Emotions and gender
Networked emotions
4 Conclusions
4. Wikipedia is a teenager
Happy birthday Wikipedia!
English Wikipedia is now 16 years old
Catalan Wikipedia will be 16 in March (the second oldest one)
5. The largest human knowledge repository
Fifth most visited web site
Among top results for search queries about almost any topic
Largest collaborative project
Conditions and reflects public opinion... with some bias
6. Wikipedia’s social experiment
"The problem with Wikipedia is that it only works in practice. In theory, it can never work."
(popular joke among Wikipedians)
A crazy idea: anyone can edit
All relevant points of view should be represented
Policies of Notability and Neutral Point of View
Quality assured by editors’ negotiations over content
The more people with different points of view contribute, the better the quality
→ Biases in the editor community may cause biases in the content
8. Bias in the content?
Top global biographies by birth country (Young-Ho Eom et al, 2015)
top central biographies from each of the 24 major Wikipedias
the 100 most central (PageRank) in each version’s hyperlink network
→ striking geographic bias
http://www.quantware.ups-tlse.fr/QWLIB/topwikipeople/geofigs/pagerank24x100.html
9. Top Women Biographies
| Rank | NA | Female figure (ranked by PageRank) | CC | Century | LC |
|---|---|---|---|---|---|
| 1 | 24 | Elizabeth II | UK | 20 | EN |
| 2 | 17 | Mary (mother of Jesus) | IL | -1 | HE |
| 3 | 12 | Queen Victoria | UK | 19 | EN |
| 4 | 6 | Elizabeth I of England | UK | 16 | EN |
| 5 | 2 | Maria Theresa | AT | 18 | DE |
| 6 | 1 | Benazir Bhutto | PK | 20 | HI |
| 7 | 1 | Catherine the Great | PL | 18 | PL |
| 8 | 1 | Anne Frank | DE | 20 | DE |
| 9 | 1 | Indira Gandhi | IN | 20 | HI |
| 10 | 1 | Margrethe II of Denmark | DK | 20 | DA |
Top 10 global female historical figures by PageRank for the 24 major Wikipedia editions (Young-Ho Eom et al., 2015)
NA → number of language editions in which a biography appears in the top 100 rank
CC → birth country code
LC → language code corresponding to the birth country
10. Top biographies by gender
Number of women among the top global biographies by birth century
Number of women among the 100 most central biographies for each language edition
12. Wikipedia editor gender gap
Estimated participation of women
2011 editor survey: 9%
2013 editor survey: 13%
corrected with propensity score estimation: 16% (Mako Hill and Shaw, 2013)
while in most online social networks women are more active!
13. Why do women not edit Wikipedia?
1 A lack of user-friendliness in the editing interface
2 Not having enough free time
3 A lack of self-confidence
4 Aversion to conflict and an unwillingness to participate in lengthy edit wars
5 Belief that their contributions are too likely to be reverted or deleted
6 Some find its overall atmosphere misogynistic
7 Wikipedia culture is sexual in ways they find off-putting
8 Being addressed as male is off-putting to women whose primary language has grammatical gender
9 Fewer opportunities than other sites for social relationships and a welcoming tone
(Sue Gardner, 2011)
15. Emotional factors and discussions
Importance of emotional factors
Discussion spaces are fundamental to the collaborative process
Discussion triggers emotions and breeds particular emotional environments
16. Wikipedia’s most visible side
17. Article talk pages
18. Discussions in article talk pages
20. Studying emotions and language in talk pages
Interactions in Wikipedia
Implicit → editing
Explicit → communication
Article talk pages → discussions about how to improve articles
User talk pages → a kind of public inbox
Goal: Shed light on the emotional dimension of the interactions
extensive analysis of emotions in explicit communication
sentiment analysis of comments in article talk and personal talk pages
21. Research questions
1 How are the emotional and communication styles of editors affected by their status?
2 How are the emotional and communication styles of editors affected by their gender?
3 How are emotional expressions affected by interacting with others in comment threads (emotional congruence)?
4 How are the emotional styles of editors related to those of the editors they interact with most frequently (emotional homophily)?
22. Publications
Results published in:
Laniado, D., Castillo, C., Kaltenbrunner, A., and Fuster Morell, M. (2012)
Emotions and dialogue in a peer-production community: the case of Wikipedia.
8th International Symposium on Wikis and Open Collaboration, WikiSym’12
Iosub, D., Laniado, D., Castillo, C., Fuster Morell, M., and Kaltenbrunner, A. (2014)
Emotions under Discussion: Gender, Status and Communication in Online Collaboration.
PLoS ONE, 9(8)
25. Dataset: conversations
Extracting conversations among editors from the English Wikipedia
| Quantity | Count | Share |
|---|---|---|
| Articles | 3 210 039 | |
| Articles with talk page (ATP) | 871 485 | 27.1% |
| Editors who comment on articles | 350 958 | |
| Editors with ≥ 100 comments on ATP | 12 231 | 3.5% |
| Total comments in ATP | 11 041 246 | |
| Comments containing ANEW words | 7 414 411 | 67.2% |
| Comments by editors with ≥ 100 comments on ATP | 5 480 544 | 49.6% |
| Comments by these editors and with ANEW words | 3 649 297 | 33.3% |
Table: Data extracted from a complete dump of the English Wikipedia, dated March 12th, 2010
27. User gender labelling
≈ 12 000 users wrote ≥ 100 comments in article talk pages
Gender identified through the Wikipedia API for ≈ 2 000 of them
A sample of 1 385 users for manual labelling through crowdsourcing (CrowdFlower)
| | Non-admins | Admins | Total |
|---|---|---|---|
| Men | 1 087 | 1 526 | 2 613 |
| Women | 68 | 97 | 165 |
| Unknown | 6 850 | 2 603 | 9 453 |
| Total | 8 005 | 4 226 | 12 231 |
Table: Users with ≥ 100 comments by gender and administrator status.
Gender could be identified only for ≈ 50% of users:
real name or username (50% of those identified)
implicitly stated gender (27% of women, 20% of men)
pronoun (15% of women, 10% of men)
other indicators: userboxes, pictures, links to personal blogs...
29. Measuring the Emotional Content of Discussions
Lexicon-based methods
relying on three different instruments:
Affective norms for English words (ANEW)
Linguistic Inquiry and Word Count (LIWC)
SentiStrength
30. Measuring the Emotional Content of Discussions
Method 1: Affective Norms for English Words (ANEW)
Rates a list of 1060 frequent words on a 9-point scale in three dimensions:
Valence
Arousal
Dominance
Assigns emotion scores to each word from the lexicon
Bradley and Lang (1999).
Affective norms for English words (ANEW). Technical report C-1.
The Center for Research in Psychophysiology, University of Florida, FL.
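As a rough illustration of lexicon-based valence scoring, the sketch below averages the valence of the lexicon words found in a comment, and then averages over a user's comments. The averaging rule, the function names and the tiny lexicon are assumptions made for this example; the numbers are placeholders, not the published ANEW norms.

```python
import re

# Hypothetical toy lexicon: word -> valence on a 1-9 scale.
# Illustrative placeholder values only, NOT the published ANEW norms.
TOY_VALENCE = {"happy": 8.0, "good": 7.5, "war": 2.0, "abuse": 1.5}


def comment_valence(text, lexicon=TOY_VALENCE):
    """Mean valence of the lexicon words in `text`, or None if it contains none."""
    words = re.findall(r"[a-z']+", text.lower())
    scores = [lexicon[w] for w in words if w in lexicon]
    if not scores:
        return None  # the comment cannot be scored with this lexicon
    return sum(scores) / len(scores)


def user_valence(comments, lexicon=TOY_VALENCE):
    """Average the per-comment valence over the comments that could be scored."""
    scored = [v for v in (comment_valence(c, lexicon) for c in comments)
              if v is not None]
    return sum(scored) / len(scored) if scored else None
```

Comments without any lexicon word return None and are skipped, which is why only the comments containing ANEW words can contribute to this kind of analysis.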
31. Measuring the Emotional Content of Discussions
Method 2: Linguistic Inquiry and Word Count (LIWC)
Two scores for basic emotion (compared with ANEW valence)
positive emotion
negative emotion
Discrete measures of emotions (anger, anxiety, sadness, affect)
Other classes of words to characterize language (e.g. personal pronouns, tentative words, fillers...)
→ Count the proportion of words belonging to each class
Pennebaker J, Chung C, Ireland M, Gonzales A, Booth R (2010).
The development and psychometric properties of LIWC2007. Austin, TX.
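The counting step can be sketched as follows: for each word class, compute the share of a text's words that belong to it, expressed as a percentage. The mini-dictionary is hypothetical and uses only a handful of example words; the real LIWC dictionaries are much larger and are not reproduced here, and the tokenisation is a simplification.

```python
import re

# Hypothetical mini-dictionary for illustration; not the real LIWC word lists.
TOY_CATEGORIES = {
    "anger": {"hate", "kill", "annoyed"},
    "tentative": {"maybe", "perhaps", "guess"},
    "certainty": {"always", "never"},
}


def category_percentages(text, categories=TOY_CATEGORIES):
    """Percentage of the words in `text` that fall into each word class."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {name: 0.0 for name in categories}
    return {
        name: 100.0 * sum(w in vocab for w in words) / len(words)
        for name, vocab in categories.items()
    }
```

For example, `category_percentages("Maybe I hate it, perhaps never")` counts six words, two of which are tentative, so the tentative score is one third of the text.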
32. Measuring the Emotional Content of Discussions
Method 2: Linguistic Inquiry and Word Count (LIWC)
| Category | Dictionary size | Examples |
|---|---|---|
| Anger | 91 | hate, kill, annoyed |
| Anxiety | 84 | worried, fearful, nervous |
| Sadness | 101 | crying, grief, sad |
| Tentative | 155 | maybe, perhaps, guess |
| Certainty | 83 | always, never |
| Fillers | 9 | blah, you know |
| Past | 155 | went, ran, had |
| Present | 169 | is, does, hear |
| Future | 48 | will, gonna |
| Social words | 455 | mate, talk, child |
Table: Description of LIWC measures (as per http://www.liwc.net).
33. Measuring Relationship-Orientation with LIWC
Definition
Communication that promotes social affiliation and emotional connection:
preoccupation with others (use of personal pronouns, e.g., I, you)
preoccupation with the larger social domain (e.g., references to friends and family)
expression of positive emotion
Examples
We are glad to have you. If I can help at all let me know :)
A-giau has smiled at you. Smiles promote WikiLove and hopefully this one has made your day better...Happy editing
Congrats! Thank you for your dedication.
34. Measuring the Emotional Content of Discussions
Method 3: SentiStrength
Based on LIWC and developed for short web texts
Accounts for modes of textual expression specific to the online environment, e.g. emoticons and abbreviations
Provides a positive and a negative score for emotional valence
Emotion score is the strongest positive and negative emotion expressed in a comment
Final scores are averages over comments in a given category
Thelwall M, Buckley K, Paltoglou G, Cai D, Kappas A (2010)
Sentiment strength detection in short informal text.
Journal of the American Society for Information Science and Technology 61: 2544 – 2558.
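The scoring rule described above (strongest positive and strongest negative term per comment, then averages over a group of comments) can be sketched as below. This is not the SentiStrength tool itself: the term strengths are hypothetical, and emoticons, abbreviations, negation and boosters are ignored. The sketch assumes the usual SentiStrength ranges, with positive scores from 1 (none) to 5 and negative scores from -1 (none) to -5.

```python
import re

# Hypothetical term strengths, for illustration only.
TOY_STRENGTHS = {"nice": 2, "happy": 3, "love": 4, "bad": -2, "hate": -4, "abuse": -4}


def comment_scores(text, strengths=TOY_STRENGTHS):
    """Return (positive, negative): the strongest term of each polarity."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = [strengths[w] for w in words if w in strengths]
    positive = max((s for s in hits if s > 0), default=1)    # 1 = no positive emotion
    negative = min((s for s in hits if s < 0), default=-1)   # -1 = no negative emotion
    return positive, negative


def category_average(comments, strengths=TOY_STRENGTHS):
    """Average positive and negative scores over a group of comments."""
    if not comments:
        raise ValueError("need at least one comment")
    scores = [comment_scores(c, strengths) for c in comments]
    n = len(scores)
    return sum(p for p, _ in scores) / n, sum(q for _, q in scores) / n
```

Because each comment keeps both a positive and a negative score, a single comment can register as strongly positive and strongly negative at the same time.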
35. Example: results with three different emotional lexica
Table: Example messages with their corresponding valence (ANEW) or positive and negative scores (LIWC, SentiStrength)
| Message (article talk page) | ANEW valence | LIWC + | LIWC - | SentiStrength + | SentiStrength - |
|---|---|---|---|---|---|
| Sounds like a good challenge - to be proven or disproven. I’m happy if it can be shown to go further using closed cubic polynomial solutions. The nice thing about these are that they are pretty easy to test numerically . . . (in “Exact trigonometric constants”) | 7.4 | 12.5 | 0 | 3 | -2 |
| Seems you have not yet seen female lover after having sex who do not wish to have sex with the same lover any more :) Once you’ve seen it, you understand very well what war of Venus means compared to war of Mars. (in “House (astrology)”) | 5.5 | 6.8 | 4.5 | 4 | -3 |
| What about the whirlie hazing, the alcohol abuse, the emotional poverty, the suicide in 1995/6, the biotech plans which were stopped by pitzer protests . . . (in “Harvey Mudd College”) | 1.6 | 4 | 8 | 1 | -4 |
36. Sentiment analysis
Statistical tests
Compute average values with the three lexica for each user
Compare distribution of values for two groups of users (e.g. admins vs regulars, women vs men)
Most variables are not normally distributed
⇓
Mann-Whitney U-test
Compare distributions of rankings
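A minimal pure-Python sketch of a rank-based two-sample comparison follows. It computes the Mann-Whitney U statistic from average ranks, standardises it with a tie-corrected normal approximation, and returns a two-sided p-value. The function name and the choice to omit a continuity correction are assumptions for this example; for small samples the normal approximation is only rough, and an established statistics package would normally be preferred.

```python
import math


def mann_whitney(x, y):
    """Return (U, z, p) for samples x and y using the normal approximation.

    U counts, over all pairs, how often a value of x exceeds a value of y
    (ties count one half). z is U standardised with a tie-corrected variance,
    and p is the two-sided p-value under the normal distribution.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    rank_sum_x = 0.0
    tie_term = 0
    i = 0
    while i < n:
        # Find the run of equal values starting at position i.
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1          # ranks are 1-based
        group_size = j - i + 1
        tie_term += group_size ** 3 - group_size
        rank_sum_x += avg_rank * sum(1 for k in range(i, j + 1) if pooled[k][1] == 0)
        i = j + 1

    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:                           # all values identical
        return u, 0.0, 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, z, p
```

Called with per-user averages for two groups, for instance `mann_whitney(admin_scores, regular_scores)` with hypothetical lists of numbers, the sign of z indicates which group tends to rank higher, and its magnitude drives the p-value.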
39. Emotions and Status
Table: Emotions and Status: Administrators promote a generally neutral tone on article talk pages. Regular editors express more negative emotion, and are more emotional.
| Lexicon | Measure (Article Talk) | Regular | Admin | Mann-Whitney U-test | p-value |
|---|---|---|---|---|---|
| LIWC | Positive | 2.369 | 2.409 | -4.308 | p < 0.001 |
| LIWC | Negative | 1.368 | 1.120 | -18.578 | p < 0.001 |
| LIWC | Affect | 3.784 | 3.661 | -8.466 | p < 0.001 |
| LIWC | Anxiety | 0.180 | 0.166 | -5.834 | p < 0.001 |
| LIWC | Anger | 0.554 | 0.446 | -19.217 | p < 0.001 |
| LIWC | Sadness | 0.175 | 0.166 | -4.450 | p < 0.001 |
| SentiStrength | Positive | 1.805 | 1.774 | -14.603 | p < 0.001 |
| SentiStrength | Negative | -2.005 | -1.912 | -23.046 | p < 0.001 |
When difference is statistically significant (p-value in bold) the larger absolute value is underlined
40. Emotions and Status
Admins:
more positive emotion (ANEW and LIWC)
generally, emotionally reserved compared to regular users (LIWC)
Regular users:
more emotional
more affect, and more anxiety, anger and sadness (LIWC)
stronger positive and negative words than admins (SentiStrength)
Personal talk pages
In personal talk pages, admins are more emotional compared to the article talk pages
more positive emotion compared to regular editors, but also more anxiety and sadness
41. Dialogue and Status
Table: Dialogue and Status: Administrators are more impersonal in article talk pages. Regular editors are more concerned with others.
| Group | Measure (Article Talk) | Regular | Admin | Mann-Whitney U-test | p-value |
|---|---|---|---|---|---|
| Relationship-orientation | Personal pronouns | 5.135 | 4.815 | -13.561 | p < 0.001 |
| Relationship-orientation | Use of “I” | 2.456 | 2.429 | -1.733 | p=0.083 |
| Relationship-orientation | Use of “You” | 1.043 | 0.892 | -12.573 | p < 0.001 |
| Relationship-orientation | Use of “Shehe” | 0.609 | 0.526 | -8.657 | p < 0.001 |
| Relationship-orientation | Social words | 6.320 | 5.810 | -19.013 | p < 0.001 |
| Certainty | Certainty | 1.426 | 1.317 | -16.824 | p < 0.001 |
| Certainty | Tentativeness | 3.199 | 3.169 | -2.210 | p < 0.001 |
| Certainty | Filler words | 0.168 | 0.155 | -6.687 | p < 0.001 |
| Temporal Orientation | Past | 2.376 | 2.305 | -5.696 | p < 0.001 |
| Temporal Orientation | Present | 8.011 | 7.841 | -8.060 | p < 0.001 |
| Temporal Orientation | Future | 1.114 | 1.166 | -9.887 | p < 0.001 |
When difference is statistically significant (p-value in bold) the larger absolute value is underlined
42. Dialogue and Status
Admins
more neutral and impersonal tone
less relationship-oriented
more concerned with the future
tend to "rule with reason"
Regular users
more relationship-oriented
more personal pronouns and more social words
more concerned with the past
more insecure, but not in personal spaces
more certainty, tentativeness and filler words in article talk pages
44. Emotions and gender
ANEW words used more by women and by men
Word size reflects the difference in frequency
45. Emotions and gender
Women use words associated with more positive emotions
Result consistent and significant across the three lexicons
ANEW: difference is not significant when normalising by article
→ difference might be due to topic selection: women choose to participate in topics which have more positive discussions
No significant difference in expression of negative emotions
46. Topics, emotions and gender
[Scatter plot, one point per topic category. x-axis: proportion of male comments (0.82 to 0.96); y-axis: mean valence (4.7 to 5.6). N≥1 ANEW words; corr=−0.64 (p=0.002).]
Categories shown: Computing, Arts, Philosophy, Language, Health, Mathematics, Belief, Sports, Agriculture, Environment, Techn. & app. sci., Law, Society, Business, Education, Culture, People, Science, Politics, Geography and places, History and events
Figure: Mean valence (ANEW) for discussions of articles in different topic
categories, vs the proportion of comments written by men
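The reported correlation can be illustrated with a short sketch. It assumes a Pearson correlation, which the slide does not specify, and the per-category values below are hypothetical placeholders, not the data behind the figure.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values, one entry per topic category.
male_share = [0.84, 0.88, 0.92, 0.95]   # proportion of comments by men
mean_valence = [5.5, 5.3, 5.0, 4.9]     # mean ANEW valence of discussions
print(pearson(male_share, mean_valence))
```

A negative coefficient, as on the slide, means that categories with a larger share of comments by men tend to have discussions with lower mean valence.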
47. Dialogue and gender
Table: Dialogue and Gender: Women use a more relationship-oriented
speech style.
(Article Talk)               Men             Women           Mann-Whitney U-test    p-value
Relationship-orientation
  Personal pronouns          4.964           5.420                -4.375           p < 0.001
  Use of “I”                 2.488           2.764                -3.945           p < 0.001
  Use of “You”               0.936           0.957                -0.926           p = 0.355
  Use of “Shehe” pronouns    0.541           0.713                -4.657           p < 0.001
  Social words               5.960           6.353                -3.487           p < 0.001
Certainty
  Certainty                  1.346 (1397)    1.300 (1263)         -2.078           p = 0.038*
  Tentativeness              3.150           3.215                -1.162           p = 0.245
  Filler words               0.161           0.160                -0.137           p = 0.891
Temporal Orientation
  Past                       2.325           2.543                -4.305           p < 0.001
  Present                    7.897           8.180                -3.086           p = 0.002
  Future                     1.168           1.147                -1.008           p = 0.314
When difference is statistically significant (p-value in bold) the larger absolute value is
underlined. Cases where the averages are not informative are marked with an asterisk * and
show the Mann-Whitney U-test mean ranks in parentheses next to the averages.
48. Dialogue and Gender
Women write longer messages
Women are more relationship-oriented
more personal pronouns, in particular “I”, more social words
Women are not more insecure
Fewer certainty words, no significant difference for tentativeness and
filler words
Women admins are more relationship-oriented than men admins
Different leadership style
49. Qualitative analysis: Relationship orientation
Manual classification of 100 comments
Three main types of comments high in relationship-orientation:
inviting comments that explain the edit in a friendly tone, and call for
further intervention and collaboration
common perspective-building comments that are focused on
understanding others and solving debates in a constructive manner
appreciative comments that contain positive emotions and
celebrate others’ actions
⇒ This suggests that relationship-orientation may be conducive to
successful collaboration
51. Emotional congruence
Comparison of each comment with the comment it replies to
not based only on our set of users, but on all comments (from all
users)
Emotions: editors tend to reply with:
more positive emotion
less negative emotion
less anger, anxiety and sadness
stronger words, both positive and negative (SentiStrength)
Dialogue: editors tend to reply with:
more relationship-oriented speech
fewer tentative and certainty words
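One way to sketch the comparison of each comment with the comment it replies to is a mean paired difference over reply pairs. This is only an assumed formulation, not necessarily the measure used in the study; the field names (`id`, `parent`, `positive`) and the scores are hypothetical.

```python
def mean_reply_shift(comments, key):
    """Mean of (reply score - parent score) over all comments with a known parent."""
    by_id = {c["id"]: c for c in comments}
    diffs = [
        c[key] - by_id[c["parent"]][key]
        for c in comments
        if c["parent"] is not None and c["parent"] in by_id
    ]
    return sum(diffs) / len(diffs) if diffs else 0.0


# Hypothetical thread: comment 2 replies to 1, comment 3 replies to 2.
thread = [
    {"id": 1, "parent": None, "positive": 2.0},
    {"id": 2, "parent": 1, "positive": 3.0},
    {"id": 3, "parent": 2, "positive": 3.5},
]
print(mean_reply_shift(thread, "positive"))
```

A positive result for a given score would correspond to replies carrying more of that emotion than the comments they answer.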
52. Emotional homophily
Mixing patterns: do users interact preferentially with similar users?
Disassortativity by activity
users who write more comments tend to reply preferentially to less
active users, and vice versa
Assortativity by gender
Men interact more with other men, and women with other women
Assortativity by emotion and language
Users interact more with others similar in emotional expression and
communication style
also in the network of communication on personal talk pages
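Mixing patterns of this kind are commonly quantified with an assortativity coefficient. The sketch below computes Newman's coefficient for a categorical attribute such as gender; it treats the reply network as undirected, which is an assumption, and the user names and edges are hypothetical.

```python
from collections import Counter


def attribute_assortativity(edges, attr):
    """Newman's assortativity coefficient for a categorical node attribute.

    edges: iterable of (u, v) pairs, treated as undirected.
    attr:  dict mapping each node to its category.
    """
    pair_counts = Counter()
    for u, v in edges:
        # Count each undirected edge in both directions (symmetric mixing matrix).
        pair_counts[(attr[u], attr[v])] += 1
        pair_counts[(attr[v], attr[u])] += 1
    total = sum(pair_counts.values())
    categories = {c for pair in pair_counts for c in pair}
    # Fraction of edge ends that join nodes of the same category.
    same = sum(pair_counts[(c, c)] for c in categories) / total
    # Fraction of edge ends attached to each category.
    share = {
        c: sum(n for (src, _), n in pair_counts.items() if src == c) / total
        for c in categories
    }
    expected = sum(s * s for s in share.values())
    if expected == 1:
        return float("nan")  # only one category: coefficient undefined
    return (same - expected) / (1 - expected)


# Hypothetical network: two men and two women.
gender = {"m1": "man", "m2": "man", "w1": "woman", "w2": "woman"}
print(attribute_assortativity([("m1", "m2"), ("w1", "w2"), ("m1", "w1")], gender))
```

Values above 0 indicate assortative mixing (users reply preferentially to users of the same category), values below 0 indicate disassortative mixing.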
53. Emotional homophily
Example: homophily by expression of anger
edges connect users who
have exchanged at least 10
replies
node color represents the
level of anger expressed by a
user, from low to high
node size → proportional to
the number of connections of
a user
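The construction of such a network can be sketched as follows. The sketch assumes that "at least 10 replies" counts replies in either direction between two users, which the slide does not specify; the function name and the reply log are hypothetical.

```python
from collections import Counter


def build_network(replies, min_replies=10):
    """Return (edges, degree) for user pairs with at least min_replies replies.

    replies: iterable of (author, replied_to) pairs, one per reply.
    """
    # Count replies per unordered pair of distinct users.
    pair_counts = Counter(frozenset((a, b)) for a, b in replies if a != b)
    edges = [tuple(sorted(pair)) for pair, n in pair_counts.items() if n >= min_replies]
    # Degree = number of connections, used for node size in the figure.
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return edges, degree


# Hypothetical reply log: a and b exchange 10 replies, a and c only 3.
log = [("a", "b")] * 6 + [("b", "a")] * 4 + [("a", "c")] * 3
edges, degree = build_network(log)
print(edges, dict(degree))
```

Node colour would then be taken from a per-user score such as the mean anger expressed in that user's comments.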
55. Conclusions
Administrators and experienced users play a pivotal role
they tend to interact especially with less experienced users
they promote a positive but impersonal environment
Men and women have a different communication style
women participate in discussions that have a more positive tone
men interact more with men, and women with women
women use a more emotional and relationship-oriented language
women admins have a relationship-oriented leadership style
⇒ promoting relationship-oriented leadership could lead to a more
positive environment
⇒ giving women more space in the community could result in a more
welcoming environment, for both women and men
56. Future work
Longitudinal analysis
how do emotional styles of editors change over time and with
increasing experience?
how do emotions in the discussions affect participation?
Qualitative analysis and human annotation
include non-textual emotional aspects such as emoticons, barnstars
and virtual gifts
deal with sarcasm, measure the extent of condescending or
paternalistic language in comments addressed to women editors
Examine other online spaces
Similar conclusions might hold for other online spaces
especially in discussions involving conflict, decision making and
power dynamics
57. Some references
M. M. Bradley and P. J. Lang.
Affective norms for English words (ANEW) Technical report C-1.
The Center for Research in Psychophysiology, University of Florida, FL, 2012.
B. Collier and J. Bear.
Conflict, criticism, or confidence: an empirical examination of the gender gap in Wikipedia contributions.
In Proc. of CSCW, 2012.
Eom, Y.H., Aragón, P., Laniado, D., Kaltenbrunner, A., Vigna, S., Shepelyansky, D.L.
Interactions of cultures and top people of Wikipedia from ranking of 24 language editions.
PLoS ONE 10(3), e0114825 (2015).
Iosub, D., Laniado, D., Castillo, C., Fuster Morell, M. F., and Kaltenbrunner, A. (2014)
Emotions under Discussion: Gender, Status and Communication in Online Collaboration.
PLoS ONE, 9(8)
O. Kucuktunc, B. B. Cambazoglu, I. Weber, and H. Ferhatosmanoglu.
A large-scale sentiment analysis for Yahoo! answers.
In Proc. of WSDM, 2012.
D. Laniado, R. Tasso, Y. Volkovich, and A. Kaltenbrunner.
When the Wikipedians talk: Network and tree structure of Wikipedia discussion pages.
In Proc. of ICWSM, 2011.
Laniado, D., Castillo, C., Kaltenbrunner, A., and Fuster Morell, M. F. (2012)
Emotions and dialogue in a peer-production community: the case of Wikipedia.
8th International Symposium on Wikis and Open Collaboration, WikiSym’12
H. Zhu, R. Kraut, A. Kittur
Effectiveness of shared leadership in online communities.
In Proc. of CSCW, 2012.