Building a Dataset to Measure
Toxicity and Social Bias within Language:
A Low-Resource Perspective
Won Ik Cho (SNU ECE)
June 22, 2022 @FAccT, Seoul, Korea
Introduction
• CHO, Won Ik (조원익)
 B.S. in EE/Mathematics (SNU, ’10~’14)
 Ph.D. student (SNU ECE, ’14~)
• Academic interests
 Built Korean NLP datasets on various
spoken language understanding areas
 Currently interested in computational
approaches of:
• Dialogue analysis
• AI for social good
1
Contents
• Introduction
• Hate speech in real and cyber spaces
 What is hate speech and why does it matter?
 Study on hate speech detection
• In English – Dataset and analysis
• Notable approaches in other languages
• Low-resource perspective: Creating a hate speech corpus from
scratch
 Analysis on existing language resources
 Hate speech as bias detection and toxicity measurement
 Building a guideline for data annotation
 Worker pilot, crowdsourcing, and agreement
• Challenges of hate speech corpus construction
• Conclusion
2
Contents
Caution! This presentation may contain content that can be offensive to
certain groups of people, such as gender bias, racism, or other
unethical content, including multimodal materials
3
Contents
• Handled in this tutorial
 How to build up a hate speech detection dataset in a specific setting
(language, text domain, etc.)
 How to check the validity of the created hate speech corpus
• Less handled in this tutorial
 Comprehensive definition of hate speech and social bias in the literature
 Reliability of specific ethical guidelines for hate speech corpus construction
4
Hate speech in real and cyber spaces
• What is hate speech and why does it matter?
 Difficulty of defining hate speech
• Political and legal term, and not just a theoretical term
• Has no unified/universal definition accepted by all
• Definition differs across language, culture, domain, discipline, etc.
 Definition given by United Nations
• “Any kind of communication in speech, writing or behaviour, that attacks or
uses pejorative or discriminatory language with reference to a person or a
group on the basis of who they are, in other words, based on their religion,
ethnicity, nationality, race, colour, descent, gender or other identity factor.”
– Not a legal definition
– Broader than the notion of “incitement to discrimination, hostility or violence”
prohibited under international human rights law
5
https://www.un.org/en/hate-speech/understanding-hate-speech/what-is-hate-speech?
Hate speech in real and cyber spaces
• What is hate speech and why does it matter?
 Hate speech in cyber spaces
• Definition is deductive, but its detection is inductive
• Hate speech appears online as various expressions, including:
– Offensive language
– Pejorative expressions
– Discriminative words
– Profanity terms
– Insulting ... etc.
• Whether to include specific terms or expressions in the category of 'hate
speech' is a tricky issue
– What if a pejorative expression or profanity term does not target any group or
individual?
– What if (sexual) harassment is considered offensive to readers but not by the
target figure?
6
Hate speech in real and cyber spaces
• Discussion on hate speech detection
 Studies for English
• Waseem and Hovy (2016)
– Annotates tweets using around ten criteria that make a post offensive
7
A tweet is offensive if it
1. uses a sexist or racial slur.
2. attacks a minority.
3. seeks to silence a minority.
4. criticizes a minority (without a well founded argument).
5. promotes, but does not directly use, hate speech or violent crime.
6. criticizes a minority and uses a straw man argument.
7. blatantly misrepresents truth or seeks to distort views on a minority with unfounded claims.
8. shows support of problematic hash tags. E.g. “#BanIslam”, “#whoriental”, “#whitegenocide”
9. negatively stereotypes a minority.
10. defends xenophobia or sexism.
11. contains a screen name that is offensive, as per the previous criteria, the tweet is
ambiguous (at best), and the tweet is on a topic that satisfies any of the above criteria
Waseem and Hovy, Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter, 2016.
Hate speech in real and cyber spaces
• Discussion on hate speech detection
 Studies for English
• Davidson et al. (2017)
– Mentions the discrepancy between the theoretical definition and real-world
expressions of hate speech
– Puts 'offensive' expressions in between 'hate' and 'non-hate', to incorporate
expressions that are in the grey area
– Incorporates profanity terms prevalent in social media, which do not
necessarily target minorities but induce offensiveness
8
Davidson et al., Automated Hate Speech Detection and the Problem of Offensive Language, 2017.
Hate speech in real and cyber spaces
• Discussion on hate speech detection
 Notable approaches in other languages
• Sanguinetti et al. (2018)
– Investigates hate speech in Italian tweets about immigrants
– Beyond hate speech, tags whether the post is offensive, aggressive, or intense,
contains irony or sarcasm, or shows a stereotype
– 'Stereotype' as a factor that can be a clue to discrimination
9
Sanguinetti et al., An Italian Twitter Corpus of Hate Speech against Immigrants, 2018.
• hate speech: no – yes
• aggressiveness: no – weak – strong
• offensiveness: no – weak – strong
• irony: no – yes
• stereotype: no – yes
• intensity: 0 – 1 – 2 – 3 – 4
Hate speech in real and cyber spaces
• Discussion on hate speech detection
 Notable approaches in other languages
• Assimakopoulos et al. (2020)
– Motivated by the critical analysis of posts made in reaction to news reports on the
Mediterranean migration crisis and LGBTIQ+ matters in Malta
– Annotates Maltese web texts
– Investigates the attitude (positive/negative) of the text, asks for the target if
negative, and also asks how the negativity is conveyed
10
1. Does the post communicate a positive, negative or neutral attitude? [Positive / Negative / Neutral]
2. If negative, who does this attitude target? [Individual / Group]
• (a) If it targets an individual, does it do so because of the individual’s affiliation to a group? [Yes / No]
If yes, name the group.
• (b) If it targets a group, name the group.
3. How is the attitude expressed in relation to the target group? Select all that apply.
[ Derogatory term / Generalisation / Insult / Sarcasm (including jokes and trolling) / Stereotyping /
Suggestion / Threat ]
4. If the post involves a suggestion, is it a suggestion that calls for violence against the target group? [Yes / No]
Assimakopoulos et al., Annotating for Hate Speech: The MaNeCo Corpus and Some Input from Critical Discourse Analysis, 2020.
Hate speech in real and cyber spaces
• Discussion on hate speech detection
 Notable approaches in other languages
• Moon et al. (2020)
– Annotation on Korean celebrity news comments
– Investigates the existence of social bias and the degree of toxicity
» Social bias – Gender-related bias and other biases
» Toxicity – Hate/Offensive/None (following Davidson et al. 2017)
11
Moon et al., BEEP! Korean Corpus of Online News Comments for Toxic Speech Detection, 2020.
Detecting social bias
• Is there a gender-related bias, either explicit or implicit, in the text?
• Are there any other kinds of bias in the text?
• A comment that does not incorporate any bias
Measuring toxicity
• Does the comment display strong hate or insult towards the article's
target or related figures, the writers of the article or comments, etc.?
• Although a comment is not as hateful or insulting as the
above, does it make the target or the reader feel offended?
• A comment that does not incorporate any hatred or insult
Low-resource perspective
• Creating a hate speech corpus from scratch
 ASSUMPTION: There is no manually created hate speech detection corpus
so far for the Korean language (was true before July 2020...)
• Generally, clear motivation is required for hate speech corpus construction
– Why?
» Takes resources (time and money)
» Potential mental harm
» Potential attack towards the researchers
– Nonetheless, it is required in some circumstances
» Detecting offensive language in services
» Severe harm has been displayed publicly
12
Low-resource perspective
• Creating a hate speech corpus from scratch
 Step 1: Is there anything available?
Analysis on existing language resources
• Language resources related to hate speech detection include various other
similar datasets (though slightly different in definition and goal)
– Dictionary of profanity terms (e.g., hatebase.org)
– Sarcasm detection dataset
– Sentiment analysis dataset
– Offensive language detection dataset
• Why should we search existing resources?
– To lessen the consumption of time and money
– To make the problem easier by building upon existing datasets
– To confirm what we should aim for by creating a new dataset
13
Low-resource perspective
• Creating a hate speech corpus from scratch
 Step 1: Is there anything available?
Analysis on existing language resources
• Dictionary of profanity terms
– e.g., https://github.com/doublems/korean-bad-words
• Sarcasm detection dataset
– e.g., https://github.com/SpellOnYou/korean-sarcasm
• Sentiment analysis dataset
– e.g., https://github.com/e9t/nsmc
 The datasets may not completely overlap with a hate speech corpus, but at
least they can be a good source for annotation
• Here, one should think of:
– Text style
– Text domain
– Appearing types of toxicity and bias
14
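As a quick illustration of mining the resources listed above, the sketch below scans an NSMC-style TSV (`id<TAB>document<TAB>label` rows) for entries containing terms from a profanity lexicon, to pre-select candidates for toxicity annotation. The word list and helper names here are hypothetical, not part of any released dataset.

```python
import csv
import io

# Hypothetical stand-in for a real profanity lexicon
# (e.g., entries drawn from a list like korean-bad-words).
PROFANITY = {"badword1", "badword2"}

def candidates(tsv_text, lexicon):
    """Return documents that contain any lexicon term,
    as candidate instances for toxicity annotation."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row["document"] for row in reader
            if any(term in row["document"] for term in lexicon)]

# Tiny in-memory sample in the NSMC column layout
sample = (
    "id\tdocument\tlabel\n"
    "1\tthis review has badword1 in it\t0\n"
    "2\ta perfectly clean review\t1\n"
)
```

Substring matching is of course crude (it misses obfuscated spellings and flags innocent substrings), which is exactly why such lists are only a starting point for annotation, not a detector.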
Low-resource perspective
• Creating a hate speech corpus from scratch
 Step 1: Is there anything available?
Analysis on existing language resources
• Text style
– Written/spoken/web text?
• Text domain
– News/wiki/tweets/chat/comments?
• Appearing types of toxicity and bias
– Gender-related?
– Politics/religion?
– Region/nationality/ethnicity?
• Appearing amount of toxicity and bias
15
Low-resource perspective
• Creating a hate speech corpus from scratch
 Step 1: Is there anything available?
Analysis on existing language resources
• Data collection example (BEEP!)
– Comments from the most popular Korean entertainment news platform
» Jan. 2018 ~ Feb. 2020
» 10,403,368 comments from 23,700 articles
» 1,580 articles acquired by stratified sampling
» Top 20 comments per article, ranked by Wilson score on the downvotes
– Filtered duplicates and kept comments with more than a single token and fewer
than 100 characters
– 10K comments were selected
• The data sampling process strongly affects the final distribution of the dataset!
16
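The Wilson score used for the comment ranking above can be sketched as the lower confidence bound on a vote proportion; this is a generic textbook implementation, not necessarily the exact formula or parameters the platform used.

```python
import math

def wilson_lower_bound(pos, n, z=1.96):
    """Lower bound of the Wilson score interval for a proportion of
    pos successes out of n trials (z = 1.96 ~ 95% confidence).
    Ranking by this bound favors items that have both a high ratio
    and enough votes to trust that ratio."""
    if n == 0:
        return 0.0
    phat = pos / n
    denom = 1 + z * z / n
    centre = phat + z * z / (2 * n)
    margin = z * math.sqrt((phat * (1 - phat) + z * z / (4 * n)) / n)
    return (centre - margin) / denom
```

With the same observed ratio, more votes yield a higher lower bound, so heavily downvoted comments with many votes rank above comments with a few lucky downvotes.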
Low-resource perspective
• Creating a hate speech corpus from scratch
 Step 2: What should we define first?
Hate speech as bias detection and toxicity measurement
• Local definition of hate speech discussed by Korean sociolinguistics society
– Definition of hate speech
» Expressions that discriminate/hate or incite discrimination/hatred/violence
towards some individual or group of people because they have characteristics
as a social minority
– Types of hate speech
» Discriminative bullying
» Discrimination
» Public insult/threatening
» Inciting hatred
17
Hong et al., Study on the State and Regulation of Hate Speech, 2016.
Low-resource perspective
• Creating a hate speech corpus from scratch
 Step 2: What should we define first?
Hate speech as bias detection and toxicity measurement
• Set up criteria
– Analyze 'discriminate/hate or incite discrimination/hatred/violence' as a combination
of 'social bias' and 'toxicity'
– Further discussion required on social minority
» 'Gender, age, profession, religion, nationality, skin color, political stance' and all
other factors that comprise one's identity
» Criteria for social minority vs. who will be acknowledged as a social minority
18
Low-resource perspective
• Creating a hate speech corpus from scratch
 Step 2: What should we define first?
Hate speech as bias detection and toxicity measurement
• Set up criteria for bias detection
– 'People with a specific characteristic may behave in some way'
– Differs from a (value) judgment
» Gender-related bias
» Other biases
» None
19
Cho and Moon, How Does the Hate Speech Corpus Concern Sociolinguistic Discussions? A Case Study on Korean Online News Comments, 2021.
Low-resource perspective
• Creating a hate speech corpus from scratch
 Step 2: What should we define first?
Hate speech as bias detection and toxicity measurement
• Set up criteria for toxicity measurement
– Hate
» Hostility towards a specific group or individual
» Can be represented by some profanity terms, but such terms do not necessarily imply hate
– Insult
» Expressions that can harm the prestige of individuals or groups
» Various profanity terms are included
– Offensive expressions
» Do not count as hate or insult, but may make readers feel offended
» Includes sarcasm, irony, bad guessing, unethical expressions
20
Low-resource perspective
• Creating a hate speech corpus from scratch
 Step 2: What should we define first?
Hate speech as bias detection and toxicity measurement
• Set up criteria for toxicity measurement
» Severe hate or insult
» Not hateful but offensive or sarcastic
» None
21
Cho and Moon, How Does the Hate Speech Corpus Concern Sociolinguistic Discussions? A Case Study on Korean Online News Comments, 2021.
Low-resource perspective
• Creating a hate speech corpus from scratch
 Step 3: What is required for the annotation?
Building a guideline for data annotation
• Stakeholders
– Researchers
– Moderators (crowdsourcing platform)
– Workers
• How is the guideline used?
– Setting up research direction (for researchers)
– Task understanding (for moderators)
– Data annotation (for workers)
22
Low-resource perspective
• Creating a hate speech corpus from scratch
 Step 3: What is required for the annotation?
Building a guideline for data annotation
• The guideline is not built all at once!
– Usual process
» Making up draft guideline based on source corpus
» Pilot study of researchers & guideline update (𝑁 times iteration)
» Moderators’ and researchers’ alignment on the guideline
» Worker recruitment & pilot tagging
» Guideline update with worker feedback (cautions & exceptions)
» Final guideline (for main annotation)
23
Low-resource perspective
• Creating a hate speech corpus from scratch
 Step 3: What is required for the annotation?
Building a guideline for data annotation
• Draft guideline
– Built upon a small portion of the source corpus (about hundreds of instances)
– Researchers' intuition is highly involved
– Concept-based description
» e.g., for 'bias':
'People with a specific characteristic may behave in some way'
(instead of listing all stereotyped expressions)
• Pilot study
– Researchers tag a slightly larger portion of the source corpus (~1K instances)
– Fitting researchers' intuition to the proposed concepts
» e.g., "Does this expression contain bias or toxicity?"
(discussion is important, but don't fight!)
– Update descriptions or add examples
– Labeling, re-labeling, re-re-labeling...
24
Low-resource perspective
• Creating a hate speech corpus from scratch
 Step 3: What is required for the annotation?
Building a guideline for data annotation
• Pilot study
– Labeling, re-labeling, re-re-labeling... + Agreement?
– Inter-annotator agreement (IAA)
» Calculating the reliability of annotation
» Cohen’s Kappa for two annotators
» Fleiss’ Kappa for more than two annotators
– Sufficiently high agreement? (> 0.6?)
» Let’s go annotating in the wild!
25
Pustejovsky and Stubbs, Natural Language Annotation, 2012.
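A minimal implementation of Fleiss' kappa for the multi-annotator case above can look like the following; libraries such as statsmodels also provide one, but the computation is short enough to write directly.

```python
from collections import Counter

def fleiss_kappa(ratings):
    """Fleiss' kappa for nominal labels.
    ratings: list of items, each a list of the labels assigned by its
    annotators (every item must have the same number of annotators)."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    categories = sorted({label for item in ratings for label in item})
    # per-item counts of each category
    counts = [[Counter(item)[c] for c in categories] for item in ratings]
    # observed agreement: pairwise annotator agreement averaged over items
    p_obs = sum(
        (sum(n * n for n in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_items
    # chance agreement from marginal category proportions
    p_j = [sum(row[j] for row in counts) / (n_items * n_raters)
           for j in range(len(categories))]
    p_chance = sum(p * p for p in p_j)
    return (p_obs - p_chance) / (1 - p_chance)
```

Perfect agreement yields kappa = 1, chance-level agreement yields 0, and values below 0 indicate systematic disagreement, hence the rough "go annotate" threshold of 0.6.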
Low-resource perspective
• Creating a hate speech corpus from scratch
 Step 4: How is the annotation process conducted and evaluated?
Worker pilot, crowdsourcing, and agreement
• Finding a crowdsourcing platform
– Moderator
» Usually an expert in data creation and management
» Comprehends the task and gives feedback from the workers' point of view
» Helps communication between researchers and workers
» Instructs, and sometimes hurries, workers to meet the timeline
» Manages financial or legal issues
» Lets researchers concentrate on the task itself
– Without moderator?
» Researchers are the moderator!
(Unless there are some automated functions in the platform)
– With moderator?
» The closest partner of researchers
26
Low-resource perspective
• Creating a hate speech corpus from scratch
 Step 4: How is the annotation process conducted and evaluated?
Worker pilot, crowdsourcing, and agreement
• Finding a crowdsourcing platform
– Existence and experience of the moderator
» Experience of similar dataset construction
» Comprehension of the task & proper feedback
» Sufficient worker pool
» Trust between the moderator and workers
– Reasonable cost estimation
» Appropriateness of price per tagging or reviewing
» Appropriateness of worker compensation
» Fit with the budget
27
Low-resource perspective
• Creating a hate speech corpus from scratch
 Step 4: How is the annotation process conducted and evaluated?
Worker pilot, crowdsourcing, and agreement
• Finding a crowdsourcing platform
– Usefulness of the platform UI
» Progress status (In progress, Submitted, Waiting for reviewing... etc.)
» Statistics: The number of workers and reviewers, Average work/review duration...
» Demographics, Worker history by individuals & in total...
28
Low-resource perspective
• Creating a hate speech corpus from scratch
 Step 4: How is the annotation process conducted and evaluated?
Worker pilot, crowdsourcing, and agreement
• Pilot tagging (by workers)
– Goal of worker pilot
» Guideline update in workers’ view (especially on cautions & exceptions)
» Worker selection
– Procedure
» Advertisement or recruitment
» Worker tagging
» Researchers’ (or moderators’) review & rejection
» Workers’ revise & resubmit
29
Low-resource perspective
• Creating a hate speech corpus from scratch
 Step 4: How is the annotation process conducted and evaluated?
Worker pilot, crowdsourcing, and agreement
• Details on worker selection process?
– Human checking
» Ethical standard not too far from the guideline?
» Is feedback effective for the rejected samples?
– Automatic checking
» Enough taggings done?
» Too frequent cases of skipping the annotation?
30
UI screenshots provided by Deep Natural AI.
Low-resource perspective
• Creating a hate speech corpus from scratch
 Step 4: How is the annotation process conducted and evaluated?
Worker pilot, crowdsourcing, and agreement
• Crowdsourcing: A simplified version is required for crowd annotation!
– Multi-class, multi-attribute tagging
» 3 classes for bias
» 3 classes for toxicity
– Given a comment (without context), the annotator should tag each attribute
– Detailed guideline (with examples, cautions, and exceptions) is provided separately
31
1. What kind of bias does the comment contain?
- Gender bias, Other biases, or None
2. Which is the adequate category for the comment in terms of toxicity?
- Hate, Offensive, or None
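The two-question scheme above maps naturally onto a small record with two closed label sets. A sketch of validating one crowd annotation (field names and label spellings here are illustrative, not the released schema):

```python
# Allowed answers for the two questions shown above
BIAS_LABELS = {"gender", "others", "none"}
TOXICITY_LABELS = {"hate", "offensive", "none"}

def validate(record):
    """Check that one crowd annotation answers both questions
    with an allowed label."""
    return (record.get("bias") in BIAS_LABELS
            and record.get("toxicity") in TOXICITY_LABELS)

annotation = {"comment": "...", "bias": "gender", "toxicity": "offensive"}
```

Such a check belongs in the automatic screening step: submissions with missing or out-of-vocabulary answers can be rejected before any human review.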
Low-resource perspective
• Creating a hate speech corpus from scratch
 Step 4: How is the annotation process conducted and evaluated?
Worker pilot, crowdsourcing, and agreement
• Main annotation
– Based on the final version of the guideline
» 3~5 annotators (per sample) for usual classification tasks
– Tagging done by selected workers
» Worker selection and education
» Short quiz (if workers are not selected)
– Annotation toolkit
» Assign samples randomly to workers, with multiple annotators per sample
» Interface developed or provided by the platform (usually takes budget)
» Open-source interfaces (e.g., Label Studio)
– Data check for further guarantee of quality
» If sufficiently many annotators per sample?
» If not...?
32
Low-resource perspective
• Creating a hate speech corpus from scratch
 Step 4: How is the annotation process conducted and evaluated?
Worker pilot, crowdsourcing, and agreement
• Data selection after main annotation (8,000 samples)
– Data reviewing strategy may differ by subtask
– Researchers decide the final label after adjudication
– Common for bias and toxicity
» Cases where all three annotators differ
– Only for toxicity
» Since the problem regards a continuum of degree,
cases with only hate (o) and none (x) need to be investigated again
– Failure to decide (no majority vote) – discarded
33
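The adjudication rule above (keep majority-vote labels, flag undecidable cases for review or discarding) can be sketched as:

```python
from collections import Counter

def adjudicate(labels):
    """Return the majority label, or None when there is no majority
    (e.g., all three annotators differ), signalling that the sample
    needs manual adjudication or should be discarded."""
    ranked = Counter(labels).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]
```

With three annotators this returns a label whenever at least two agree; the None cases are exactly the "all three differ" samples that the researchers re-examined.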
Low-resource perspective
• Creating a hate speech corpus from scratch
 Step 4: How is the annotation process conducted and evaluated?
Worker pilot, crowdsourcing, and agreement
• Final decision
– Test: 974
» Data tagged while constructing the guideline
(Mostly adjusted to the intention of the guideline)
– Validation: 471
» Data which went through tag/review/reject
and accept in the pilot phase,
done with a large number of annotators
(Roughly aligned with the guideline)
– Train: 7,896
» Data which were crowd-sourced with the
selected workers, not fully reviewed but
adjudicated only for some special cases
• Agreement
– 0.492 for bias detection, 0.496 for toxicity measurement
34
Moon et al., BEEP! Korean Corpus of Online News Comments for Toxic Speech Detection, 2020.
Low-resource perspective
• Creating a hate speech corpus from scratch
 Step 4: How is the annotation process conducted and evaluated?
Beyond creation - Model training and deployment
• Model training
– Traditionally
» High performance – relatively easy?
» Low performance – relatively challenging?
– But in PLM-based training these days...
» Pretraining corpora
» Model size
» Model architecture
– Model deployment
» Performance & size
» User feedback
35
Yang, Transformer-based Korean Pretrained Language Models: A Survey on Three Years of Progress, 2021.
Challenges
• Challenges of hate speech corpus construction
 Context-dependency
• News comment – articles
• Tweets – threads
• Web community comments – posts
 Multi-modal or noisy inputs
• Image and audio
– Kiela et al. (2020): Hateful Memes Challenge
• Perturbed texts
– Cho and Kim (2021): Leetspeak, Yaminjeongeum
36
Kiela et al., The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes, 2020.
Cho and Kim, Google-trickers, Yaminjeongeum, and Leetspeak: An Empirical Taxonomy for Intentionally Noisy User-Generated Text, 2021.
Challenges
• Challenges of hate speech corpus construction
 Categorical or binary output has limitations
• Limitation of categorizing the degree of intensity
– Hate/offensive/none categorization is sub-optimal
– Poletto et al. (2019): scale-based annotation with an Unbalanced Rating Scale
» Used to determine the label (or used as a target score?)
37
Poletto et al., Annotating Hate Speech: Three Schemes at Comparison, 2019.
Challenges
• Challenges of hate speech corpus construction
 Annotation requires multiple labels
• Aspect of discrimination may differ by attribute
– Gender, race, nationality, ageism ...
• Tagging 'all the target attributes' that appear?
– Kang et al. (2022)
» Detailed guideline with terms and concepts defined for each attribute
38
       Women&family  Male  Sexual min.  Race&nat.  Ageism  Regionalism  Religion  Other  Malicious  None
S1          1          0        0           0         1         0           0       0        0        0
S2          0          0        0           0         0         0           0       0        1        0
S3          0          0        0           1         0         0           1       0        0        0
S4          0          0        0           0         0         0           0       0        0        1
Kang et al., Korean Online Hate Speech Dataset for Multilabel Classification - How Can Social Science Improve Dataset on Hate Speech?, 2022.
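The multilabel rows above are multi-hot vectors over the attribute set; a sketch of the encoding (the attribute keys are shorthand for this illustration, not Kang et al.'s identifiers):

```python
# Attribute columns in the order of the table above
ATTRIBUTES = ["women_family", "male", "sexual_minorities", "race_nationality",
              "ageism", "regionalism", "religion", "other", "malicious", "none"]

def to_multi_hot(tagged):
    """Encode the set of tagged attributes as a 0/1 vector."""
    return [1 if attr in tagged else 0 for attr in ATTRIBUTES]
```

Unlike the single-label bias/toxicity scheme, a sample here may switch on several attributes at once (e.g., S3 targets both race & nationality and religion).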
Challenges
• Challenges of hate speech corpus construction
 Privacy and license issues
• Privacy and license can be violated by text crawling
• Hate speech corpus may contain personal information on (public) figures
• Text could have been brought from elsewhere (copy & paste)
 How about creating hate (and non-hate) speech from scratch?
• Yang et al. (2022): recruit workers and enable 'anonymous' text generation!
39
Yang et al., APEACH: Attacking Pejorative Expressions with Analysis on Crowd-Generated Hate Speech Evaluation Datasets, 2022.
Challenges
• Ambiguity is inevitable
 Text may incorporate various ways of interpretation
• Text accompanies omission or replacement to trick the monitoring
• Intention is apparent considering the context
• Temporal diachronicity of hate speech
 Non-hate speech in the past can be interpreted as hate speech these days
 Diachronicity may reduce the utility of prediction systems
• e.g., [a name of celebrity who committed crime] before 20xx / after 20xx
• Boundary of hate speech and freedom of speech
 Grey area that cannot be resolved
• Some readers are offended by false positives
• Some users are offended by false negatives
40
Conclusion
• Hate speech prevalent in real and cyber spaces
 Discussions on hate speech have diverse viewpoints, from academia to
society and industry – and they are reflected in the dataset construction
• No corpus is built perfectly from the beginning
 ... and hate speech corpora are among the most difficult kinds to create
• Considerations in low-resource hate speech corpus construction
 Why? How? How much? How well?
• Still more challenges left
 Context, input noise, output format, indecisiveness ...
• Takeaways
 There is a discrepancy between the theoretical and practical definitions of
hate speech, and their aims may differ
 There is no hate speech detection guideline that satisfies ALL, so let’s find
the boundary that satisfies the most and improve it
41
Reference
• Waseem and Hovy, Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter, 2016.
• Davidson et al., Automated Hate Speech Detection and the Problem of Offensive Language, 2017.
• Sanguinetti et al., An Italian Twitter Corpus of Hate Speech against Immigrants, 2018.
• Assimakopoulos et al., Annotating for Hate Speech: The MaNeCo Corpus and Some Input from Critical Discourse Analysis, 2020.
• Moon et al., BEEP! Korean Corpus of Online News Comments for Toxic Speech Detection, 2020.
• Hong et al., Study on the State and Regulation of Hate Speech, 2016.
• Cho and Moon, How Does the Hate Speech Corpus Concern Sociolinguistic Discussions? A Case Study on Korean Online News
Comments, 2021.
• Pustejovsky and Stubbs, Natural Language Annotation, 2012.
• Yang, Transformer-based Korean Pretrained Language Models: A Survey on Three Years of Progress, 2021.
• Kiela et al., The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes, 2020.
• Cho and Kim, Google-trickers, Yaminjeongeum, and Leetspeak: An Empirical Taxonomy for Intentionally Noisy User-Generated
Text, 2021.
• Poletto et al., Annotating Hate Speech: Three Schemes at Comparison, 2019.
• Kang et al., Korean Online Hate Speech Dataset for Multilabel Classification - How Can Social Science Improve Dataset on Hate
Speech?, 2022.
• Yang et al., APEACH: Attacking Pejorative Expressions with Analysis on Crowd-Generated Hate Speech Evaluation Datasets, 2022.
42
Thank you!
2010 HCLT Hate Speech2010 HCLT Hate Speech
2010 HCLT Hate Speech
WarNik Chow
 
2009 DevC Seongnam - NLP
2009 DevC Seongnam - NLP2009 DevC Seongnam - NLP
2009 DevC Seongnam - NLP
WarNik Chow
 
2008 [lang con2020] act!
2008 [lang con2020] act!2008 [lang con2020] act!
2008 [lang con2020] act!
WarNik Chow
 

More from WarNik Chow (20)

2312 PACLIC
2312 PACLIC2312 PACLIC
2312 PACLIC
 
2311 EAAMO
2311 EAAMO2311 EAAMO
2311 EAAMO
 
2211 HCOMP
2211 HCOMP2211 HCOMP
2211 HCOMP
 
2211 APSIPA
2211 APSIPA2211 APSIPA
2211 APSIPA
 
2211 AACL
2211 AACL2211 AACL
2211 AACL
 
2210 CODI
2210 CODI2210 CODI
2210 CODI
 
2206 Modupop!
2206 Modupop!2206 Modupop!
2206 Modupop!
 
2204 Kakao talk on Hate speech dataset
2204 Kakao talk on Hate speech dataset2204 Kakao talk on Hate speech dataset
2204 Kakao talk on Hate speech dataset
 
2108 [LangCon2021] kosp2e
2108 [LangCon2021] kosp2e2108 [LangCon2021] kosp2e
2108 [LangCon2021] kosp2e
 
2106 PRSLLS
2106 PRSLLS2106 PRSLLS
2106 PRSLLS
 
2106 ACM DIS
2106 ACM DIS2106 ACM DIS
2106 ACM DIS
 
2104 Talk @SSU
2104 Talk @SSU2104 Talk @SSU
2104 Talk @SSU
 
2103 ACM FAccT
2103 ACM FAccT2103 ACM FAccT
2103 ACM FAccT
 
2102 Redone seminar
2102 Redone seminar2102 Redone seminar
2102 Redone seminar
 
2011 NLP-OSS
2011 NLP-OSS2011 NLP-OSS
2011 NLP-OSS
 
2010 INTERSPEECH
2010 INTERSPEECH 2010 INTERSPEECH
2010 INTERSPEECH
 
2010 PACLIC - pay attention to categories
2010 PACLIC - pay attention to categories2010 PACLIC - pay attention to categories
2010 PACLIC - pay attention to categories
 
2010 HCLT Hate Speech
2010 HCLT Hate Speech2010 HCLT Hate Speech
2010 HCLT Hate Speech
 
2009 DevC Seongnam - NLP
2009 DevC Seongnam - NLP2009 DevC Seongnam - NLP
2009 DevC Seongnam - NLP
 
2008 [lang con2020] act!
2008 [lang con2020] act!2008 [lang con2020] act!
2008 [lang con2020] act!
 

Recently uploaded

Social Media Marketing Strategies .
Social Media Marketing Strategies                     .Social Media Marketing Strategies                     .
Social Media Marketing Strategies .
Virtual Real Design
 
Buy Pinterest Followers, Reactions & Repins Go Viral on Pinterest with Socio...
Buy Pinterest Followers, Reactions & Repins  Go Viral on Pinterest with Socio...Buy Pinterest Followers, Reactions & Repins  Go Viral on Pinterest with Socio...
Buy Pinterest Followers, Reactions & Repins Go Viral on Pinterest with Socio...
SocioCosmos
 
Surat Digital Marketing School - course curriculum
Surat Digital Marketing School - course curriculumSurat Digital Marketing School - course curriculum
Surat Digital Marketing School - course curriculum
digitalcourseshop4
 
“To be integrated is to feel secure, to feel connected.” The views and experi...
“To be integrated is to feel secure, to feel connected.” The views and experi...“To be integrated is to feel secure, to feel connected.” The views and experi...
“To be integrated is to feel secure, to feel connected.” The views and experi...
AJHSSR Journal
 
Unlock TikTok Success with Sociocosmos..
Unlock TikTok Success with Sociocosmos..Unlock TikTok Success with Sociocosmos..
Unlock TikTok Success with Sociocosmos..
SocioCosmos
 
Your Path to YouTube Stardom Starts Here
Your Path to YouTube Stardom Starts HereYour Path to YouTube Stardom Starts Here
Your Path to YouTube Stardom Starts Here
SocioCosmos
 
SluggerPunk Final Angel Investor Proposal
SluggerPunk Final Angel Investor ProposalSluggerPunk Final Angel Investor Proposal
SluggerPunk Final Angel Investor Proposal
grogshiregames
 
SluggerPunk Angel Investor Final Proposal
SluggerPunk Angel Investor Final ProposalSluggerPunk Angel Investor Final Proposal
SluggerPunk Angel Investor Final Proposal
grogshiregames
 
7 Tips on Social Media Marketing strategy
7 Tips on Social Media Marketing strategy7 Tips on Social Media Marketing strategy
7 Tips on Social Media Marketing strategy
Digital Marketing Lab
 
Improving Workplace Safety Performance in Malaysian SMEs: The Role of Safety ...
Improving Workplace Safety Performance in Malaysian SMEs: The Role of Safety ...Improving Workplace Safety Performance in Malaysian SMEs: The Role of Safety ...
Improving Workplace Safety Performance in Malaysian SMEs: The Role of Safety ...
AJHSSR Journal
 
Multilingual SEO Services | Multilingual Keyword Research | Filose
Multilingual SEO Services |  Multilingual Keyword Research | FiloseMultilingual SEO Services |  Multilingual Keyword Research | Filose
Multilingual SEO Services | Multilingual Keyword Research | Filose
madisonsmith478075
 
LORRAINE ANDREI_LEQUIGAN_HOW TO USE TRELLO
LORRAINE ANDREI_LEQUIGAN_HOW TO USE TRELLOLORRAINE ANDREI_LEQUIGAN_HOW TO USE TRELLO
LORRAINE ANDREI_LEQUIGAN_HOW TO USE TRELLO
lorraineandreiamcidl
 
Grow Your Reddit Community Fast.........
Grow Your Reddit Community Fast.........Grow Your Reddit Community Fast.........
Grow Your Reddit Community Fast.........
SocioCosmos
 

Recently uploaded (13)

Social Media Marketing Strategies .
Social Media Marketing Strategies                     .Social Media Marketing Strategies                     .
Social Media Marketing Strategies .
 
Buy Pinterest Followers, Reactions & Repins Go Viral on Pinterest with Socio...
Buy Pinterest Followers, Reactions & Repins  Go Viral on Pinterest with Socio...Buy Pinterest Followers, Reactions & Repins  Go Viral on Pinterest with Socio...
Buy Pinterest Followers, Reactions & Repins Go Viral on Pinterest with Socio...
 
Surat Digital Marketing School - course curriculum
Surat Digital Marketing School - course curriculumSurat Digital Marketing School - course curriculum
Surat Digital Marketing School - course curriculum
 
“To be integrated is to feel secure, to feel connected.” The views and experi...
“To be integrated is to feel secure, to feel connected.” The views and experi...“To be integrated is to feel secure, to feel connected.” The views and experi...
“To be integrated is to feel secure, to feel connected.” The views and experi...
 
Unlock TikTok Success with Sociocosmos..
Unlock TikTok Success with Sociocosmos..Unlock TikTok Success with Sociocosmos..
Unlock TikTok Success with Sociocosmos..
 
Your Path to YouTube Stardom Starts Here
Your Path to YouTube Stardom Starts HereYour Path to YouTube Stardom Starts Here
Your Path to YouTube Stardom Starts Here
 
SluggerPunk Final Angel Investor Proposal
SluggerPunk Final Angel Investor ProposalSluggerPunk Final Angel Investor Proposal
SluggerPunk Final Angel Investor Proposal
 
SluggerPunk Angel Investor Final Proposal
SluggerPunk Angel Investor Final ProposalSluggerPunk Angel Investor Final Proposal
SluggerPunk Angel Investor Final Proposal
 
7 Tips on Social Media Marketing strategy
7 Tips on Social Media Marketing strategy7 Tips on Social Media Marketing strategy
7 Tips on Social Media Marketing strategy
 
Improving Workplace Safety Performance in Malaysian SMEs: The Role of Safety ...
Improving Workplace Safety Performance in Malaysian SMEs: The Role of Safety ...Improving Workplace Safety Performance in Malaysian SMEs: The Role of Safety ...
Improving Workplace Safety Performance in Malaysian SMEs: The Role of Safety ...
 
Multilingual SEO Services | Multilingual Keyword Research | Filose
Multilingual SEO Services |  Multilingual Keyword Research | FiloseMultilingual SEO Services |  Multilingual Keyword Research | Filose
Multilingual SEO Services | Multilingual Keyword Research | Filose
 
LORRAINE ANDREI_LEQUIGAN_HOW TO USE TRELLO
LORRAINE ANDREI_LEQUIGAN_HOW TO USE TRELLOLORRAINE ANDREI_LEQUIGAN_HOW TO USE TRELLO
LORRAINE ANDREI_LEQUIGAN_HOW TO USE TRELLO
 
Grow Your Reddit Community Fast.........
Grow Your Reddit Community Fast.........Grow Your Reddit Community Fast.........
Grow Your Reddit Community Fast.........
 

2206 FAccT_inperson

  • 1. Building a Dataset to Measure Toxicity and Social Bias within Language: A Low-Resource Perspective Won Ik Cho (SNU ECE) 2022. 6. 22 @FAccT, Seoul, Korea
  • 2. Introduction • CHO, Won Ik (조원익)  B.S. in EE/Mathematics (SNU, ’10~’14)  Ph.D. student (SNU ECE, ’14~) • Academic interests  Built Korean NLP datasets on various spoken language understanding areas  Currently interested in computational approaches to: • Dialogue analysis • AI for social good 1
  • 3. Contents • Introduction • Hate speech in real and cyber spaces  What is hate speech and why does it matter?  Study on hate speech detection • In English – Dataset and analysis • Notable approaches in other languages • Low-resource perspective: Creating a hate speech corpus from scratch  Analysis on existing language resources  Hate speech as bias detection and toxicity measurement  Building a guideline for data annotation  Worker pilot, crowdsourcing, and agreement • Challenges of hate speech corpus construction • Conclusion 2
  • 4. Contents Caution! This presentation may contain content that can be offensive to certain groups of people, such as gender bias, racism, or other unethical content including multimodal materials 3
  • 5. Contents • Handled in this tutorial  How to build up a hate speech detection dataset in a specific setting (language, text domain, etc.)  How to check the validity of the created hate speech corpus • Less handled in this tutorial  Comprehensive definition of hate speech and social bias in the literature  Reliability of specific ethical guideline for hate speech corpus construction 4
  • 6. Hate speech in real and cyber spaces • What is hate speech and why does it matter?  Difficulty of defining hate speech • Political and legal term, and not just a theoretical term • Has no unified/universal definition accepted to all • Definition differs upon language, culture, domain, discipline, etc.  Definition given by United Nations • “Any kind of communication in speech, writing or behaviour, that attacks or uses pejorative or discriminatory language with reference to a person or a group on the basis of who they are, in other words, based on their religion, ethnicity, nationality, race, colour, descent, gender or other identity factor.” – Not a legal definition – Broader than the notion of “incitement to discrimination, hostility or violence” prohibited under international human rights law 5 https://www.un.org/en/hate-speech/understanding-hate-speech/what-is-hate-speech?
  • 7. Hate speech in real and cyber spaces • What is hate speech and why does it matter?  Hate speech in cyber spaces • The definition is deductive, but its detection is inductive • Hate speech appears online as various expressions, including: – Offensive language – Pejorative expressions – Discriminative words – Profanity terms – Insults ... etc. • Whether to include specific terms or expressions in the category of `hate speech’ is a tricky issue – What if a pejorative expression or profanity term does not target any group or individual? – What if (sexual) harassment is considered offensive to readers but not to the target figure? 6
  • 8. Hate speech in real and cyber spaces
    • Discussion on hate speech detection
       Studies for English
      • Waseem and Hovy (2016)
        – Annotates tweets upon around 10 features that make the post offensive 7
        A tweet is offensive if it
        1. uses a sexist or racial slur.
        2. attacks a minority.
        3. seeks to silence a minority.
        4. criticizes a minority (without a well founded argument).
        5. promotes, but does not directly use, hate speech or violent crime.
        6. criticizes a minority and uses a straw man argument.
        7. blatantly misrepresents truth or seeks to distort views on a minority with unfounded claims.
        8. shows support of problematic hash tags. E.g. “#BanIslam”, “#whoriental”, “#whitegenocide”
        9. negatively stereotypes a minority.
        10. defends xenophobia or sexism.
        11. contains a screen name that is offensive, as per the previous criteria, the tweet is ambiguous (at best), and the tweet is on a topic that satisfies any of the above criteria
        Waseem and Hovy, Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter, 2016.
  • 9. Hate speech in real and cyber spaces • Discussion on hate speech detection  Studies for English • Davidson et al. (2017) – Mentions the discrepancy between the theoretical definition and real-world expressions of hate speech – Puts `offensive’ expressions in between `hate’ and `non-hate’, to incorporate the expressions that fall in the grey area – Incorporates profanity terms prevalent in social media, which do not necessarily target a minority but induce offensiveness 8 Davidson et al., Automated Hate Speech Detection and the Problem of Offensive Language, 2017.
  • 10. Hate speech in real and cyber spaces
    • Discussion on hate speech detection
       Notable approaches in other languages
      • Sanguinetti et al. (2018)
        – Investigates hate speech in posts on Italian immigrants
        – Beyond hate speech, tags whether the post is offensive, aggressive, or intensive, has irony or sarcasm, or shows stereotype
        – `Stereotype’ as a factor that can be a clue to discrimination 9
        • hate speech: no - yes
        • aggressiveness: no - weak - strong
        • offensiveness: no - weak - strong
        • irony: no - yes
        • stereotype: no - yes
        • intensity: 0 - 1 - 2 - 3 - 4
        Sanguinetti et al., An Italian Twitter Corpus of Hate Speech against Immigrants, 2018.
  • 11. Hate speech in real and cyber spaces
    • Discussion on hate speech detection
       Notable approaches in other languages
      • Assimakopoulos et al. (2020)
        – Motivated by the critical analysis of posts made in reaction to news reports on the Mediterranean migration crisis and LGBTIQ+ matters in Malta
        – Annotates Maltese web texts
        – Investigates the attitude (positive/negative) of the text, asks for the target if negative, and also asks how the negativity is conveyed 10
        1. Does the post communicate a positive, negative or neutral attitude? [Positive / Negative / Neutral]
        2. If negative, who does this attitude target? [Individual / Group]
          (a) If it targets an individual, does it do so because of the individual’s affiliation to a group? [Yes / No] If yes, name the group.
          (b) If it targets a group, name the group.
        3. How is the attitude expressed in relation to the target group? Select all that apply. [Derogatory term / Generalisation / Insult / Sarcasm (including jokes and trolling) / Stereotyping / Suggestion / Threat]
        4. If the post involves a suggestion, is it a suggestion that calls for violence against the target group? [Yes / No]
        Assimakopoulos et al., Annotating for Hate Speech: The MaNeCo Corpus and Some Input from Critical Discourse Analysis, 2020.
  • 12. Hate speech in real and cyber spaces
    • Discussion on hate speech detection
       Notable approaches in other languages
      • Moon et al. (2020)
        – Annotation on Korean celebrity news comments
        – Investigates the existence of social bias and the degree of toxicity
          » Social bias – Gender-related bias and other biases
          » Toxicity – Hate/Offensive/None (following Davidson et al. 2017) 11
        Detecting social bias
        • Is there a gender-related bias, either explicit or implicit, in the text?
        • Are there any other kinds of bias in the text?
        • A comment that does not incorporate the bias
        Measuring toxicity
        • Does the comment display strong hate or insult towards the article’s target or related figures, writers of the article or comments, etc.?
        • Although the comment is not as hateful or insulting as the above, does it make the target or the reader feel offended?
        • A comment that does not incorporate any hatred or insult
        Moon et al., BEEP! Korean Corpus of Online News Comments for Toxic Speech Detection, 2020.
  • 13. Low-resource perspective • Creating a hate speech corpus from scratch  ASSUMPTION: There is no manually created hate speech detection corpus so far for the Korean language (was true before July 2020...) • Generally, clear motivation is required for hate speech corpus construction – Why? » Takes resources (time and money) » Potential mental harm » Potential attack towards the researchers – Nonetheless, it is required in some circumstances » Detecting offensive language in services » Severe harm has been displayed publicly 12
  • 14. Low-resource perspective • Creating a hate speech corpus from scratch  Step 1: Is there anything available? Analysis on existing language resources • Language resources for hate speech detection relate to various other similar datasets (though slightly different in definition and goal) – Dictionary of profanity terms (e.g., hatebase.org) – Sarcasm detection dataset – Sentiment analysis dataset – Offensive language detection dataset • Why should we search existing resources? – To lessen the consumption of time and money – To make the problem easier by building upon existing datasets – To confirm what we should aim for when creating a new dataset 13
  • 15. Low-resource perspective • Creating a hate speech corpus from scratch  Step 1: Is there anything available? Analysis on existing language resources • Dictionary of profanity terms – e.g., https://github.com/doublems/korean-bad-words • Sarcasm detection dataset – e.g., https://github.com/SpellOnYou/korean-sarcasm • Sentiment analysis dataset – e.g., https://github.com/e9t/nsmc  The datasets may not completely overlap with a hate speech corpus, but at least they can be a good source of annotation  • Here, one should think of: – Text style – Text domain – Appearing types of toxicity and bias 14
  • 16. Low-resource perspective • Creating a hate speech corpus from scratch  Step 1: Is there anything available? Analysis on existing language resources • Text style – Written/spoken/web text? • Text domain – News/wiki/tweets/chat/comments? • Appearing types of toxicity and bias – Gender-related? – Politics/religion? – Region/nationality/ethnicity? • Appearing amount of toxicity and bias 15
  • 17. Low-resource perspective
    • Creating a hate speech corpus from scratch
       Step 1: Is there anything available? Analysis on existing language resources
      • Data collection example (BEEP!)
        – Comments from the most popular Korean entertainment news platform
          » Jan. 2018 ~ Feb. 2020
          » 10,403,368 comments from 23,700 articles
          » 1,580 articles acquired by stratified sampling
          » Top 20 comments per article, ranked by the Wilson score on the downvotes
        – Filter the duplicates and keep comments with more than a single token and fewer than 100 characters
        – 10K comments were selected
      • The data sampling process matters much in the final distribution of the dataset! 16
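The Wilson-score ranking used for comment selection above can be sketched as follows. This is a minimal illustration with made-up vote counts and a 95% confidence level (z = 1.96); it is not the exact configuration used for BEEP!.

```python
import math

def wilson_lower_bound(pos: int, n: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for a binomial proportion.

    pos: number of votes of interest (here, downvotes), n: total votes.
    z = 1.96 corresponds to a 95% confidence level.
    """
    if n == 0:
        return 0.0
    p = pos / n
    denom = 1 + z * z / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt((p * (1 - p) + z * z / (4 * n)) / n)
    return (centre - margin) / denom

# Hypothetical (comment id, downvotes, total votes) tuples; ranking by the
# Wilson lower bound surfaces reliably downvoted comments first, and is
# robust to comments with only a handful of votes.
comments = [("c1", 80, 100), ("c2", 3, 4), ("c3", 40, 400)]
ranked = sorted(comments, key=lambda c: wilson_lower_bound(c[1], c[2]), reverse=True)
```

Sorting by the interval's lower bound rather than the raw downvote ratio is the usual motivation for the Wilson score here: `c2` has a 75% downvote ratio but only 4 votes, so its score stays below that of `c1`.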
  • 18. Low-resource perspective • Creating a hate speech corpus from scratch  Step 2: What should we define first? Hate speech as bias detection and toxicity measurement • Local definition of hate speech discussed by Korean sociolinguistics society – Definition of hate speech » Expressions that discriminate/hate or incite discrimination/hatred/violence towards some individual or group of people because they have characteristics as a social minority – Types of hate speech » Discriminative bullying » Discrimination » Public insult/threatening » Inciting hatred 17 Hong et al., Study on the State and Regulation of Hate Speech, 2016.
  • 19. Low-resource perspective • Creating a hate speech corpus from scratch  Step 2: What should we define first? Hate speech as bias detection and toxicity measurement • Set up criteria – Analyze ‘Discriminate/hate or incite discrimination/hatred/violence’ as a combination of ‘Social bias’ and ‘Toxicity’ – Further discussion required on social minority » `Gender, age, profession, religion, nationality, skin color, political stance’ and all other factors that comprise one’s identity » Criteria for social minority vs. Who will be acknowledged as a social minority 18
  • 20. Low-resource perspective • Creating a hate speech corpus from scratch  Step 2: What should we define first? Hate speech as bias detection and toxicity measurement • Set up criteria for bias detection – `People with a specific characteristic may behave in some way’ – Differs from the judgment » Gender-related bias » Other biases » None 19 Cho and Moon, How Does the Hate Speech Corpus Concern Sociolinguistic Discussions? A Case Study on Korean Online News Comments, 2021.
  • 21. Low-resource perspective • Creating a hate speech corpus from scratch  Step 2: What should we define first? Hate speech as bias detection and toxicity measurement • Set up criteria for toxicity measurement – Hate » Hostility towards a specific group or individual » Can be represented by some profanity terms, but such terms alone do not imply hate – Insult » Expressions that can harm the prestige of individuals or groups » Various profanity terms are included – Offensive expressions » Do not count as hate or insult, but may make the readers feel offended » Include sarcasm, irony, bad guessing, unethical expressions 20
  • 22. Low-resource perspective • Creating a hate speech corpus from scratch  Step 2: What should we define first? Hate speech as bias detection and toxicity measurement • Set up criteria for toxicity measurement » Severe hate or insult » Not hateful but offensive or sarcastic » None 21 Cho and Moon, How Does the Hate Speech Corpus Concern Sociolinguistic Discussions? A Case Study on Korean Online News Comments, 2021.
  • 23. Low-resource perspective • Creating a hate speech corpus from scratch  Step 3: What is required for the annotation? Building a guideline for data annotation • Stakeholders – Researchers – Moderators (crowdsourcing platform) – Workers • How is the guideline used? – Setting up the research direction (for researchers) – Task understanding (for moderators) – Data annotation (for workers) 22
  • 24. Low-resource perspective
    • Creating a hate speech corpus from scratch
       Step 3: What is required for the annotation? Building a guideline for data annotation
      • The guideline is not built at once!
        – Usual process
          » Making up a draft guideline based on the source corpus
          » Pilot study by researchers & guideline update (𝑁 iterations)
          » Moderators’ and researchers’ alignment on the guideline
          » Worker recruitment & pilot tagging
          » Guideline update with worker feedback (cautions & exceptions)
          » Final guideline (for main annotation) 23
  • 25. Low-resource perspective • Creating a hate speech corpus from scratch  Step 3: What is required for the annotation? Building a guideline for data annotation • Draft guideline – Built based upon a small portion of the source corpus (about hundreds of instances) – Researchers’ intuition is highly involved – Concept-based description » e.g., for `bias’, `People with a specific characteristic may behave in some way’ (instead of listing up all stereotyped expressions) • Pilot study – Researchers’ tagging on a slightly larger portion of the source corpus (~1K instances) – Fitting researchers’ intuition to the proposed concepts » e.g., ``Does this expression contain bias or toxicity?’’ (discussion is important, but don’t fight!) – Update descriptions or add examples – Labeling, re-labeling, re-re-labeling... 24
  • 26. Low-resource perspective • Creating a hate speech corpus from scratch  Step 3: What is required for the annotation? Building a guideline for data annotation • Pilot study – Labeling, re-labeling, re-re-labeling... + Agreement? – Inter-annotator agreement (IAA) » Calculating the reliability of annotation » Cohen’s Kappa for two annotators » Fleiss’ Kappa for more than two annotators – Sufficiently high agreement? (> 0.6?) » Let’s go annotating in the wild! 25 Pustejovsky and Stubbs, Natural Language Annotation, 2012.
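The inter-annotator agreement check above can be sketched as a minimal Fleiss' kappa computation; the per-item label counts below are hypothetical (three annotators per item, as in the pilot setting).

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for N items, each rated by the same number of annotators.

    `ratings` is a list of per-item dicts mapping category -> vote count.
    """
    n_items = len(ratings)
    n_raters = sum(ratings[0].values())
    categories = sorted({cat for item in ratings for cat in item})
    # Observed agreement: per-item proportion of agreeing annotator pairs
    p_i = [
        (sum(c * c for c in item.values()) - n_raters) / (n_raters * (n_raters - 1))
        for item in ratings
    ]
    p_bar = sum(p_i) / n_items
    # Expected agreement from the marginal category proportions
    p_j = [
        sum(item.get(cat, 0) for item in ratings) / (n_items * n_raters)
        for cat in categories
    ]
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)

# Hypothetical toxicity labels (hate/offensive/none) from three annotators:
items = [
    {"hate": 3},
    {"hate": 2, "offensive": 1},
    {"none": 3},
    {"offensive": 1, "none": 2},
]
kappa = fleiss_kappa(items)  # well below the 0.6 rule of thumb on the slide
```

Cohen's kappa would be the analogous statistic for exactly two annotators; Fleiss' kappa generalizes the idea to any fixed number of raters per item.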
  • 27. Low-resource perspective • Creating a hate speech corpus from scratch  Step 4: How is the annotation process conducted and evaluated? Worker pilot, crowdsourcing, and agreement • Finding a crowdsourcing platform – Moderator » Usually an expert in data creation and management » Comprehends the task, gives feedback from the workers’ point of view » Helps communication between researchers and workers » Instructs, and sometimes hurries, workers to meet the timeline » Manages financial or legal issues » Lets researchers concentrate on the task itself – Without a moderator? » Researchers are the moderator! (Unless there are some automated functions in the platform) – With a moderator? » The closest partner of researchers 26
  • 28. Low-resource perspective • Creating a hate speech corpus from scratch  Step 4: How is the annotation process conducted and evaluated? Worker pilot, crowdsourcing, and agreement • Finding a crowdsourcing platform – Existence and experience of the moderator » Experience of similar dataset construction » Comprehension of the task & proper feedback » Sufficient worker pool » Trust between the moderator and workers – Reasonable cost estimation » Appropriateness of price per tagging or reviewing » Appropriateness of worker compensation » Fit with the budget 27
  • 29. Low-resource perspective • Creating a hate speech corpus from scratch  Step 4: How is the annotation process conducted and evaluated? Worker pilot, crowdsourcing, and agreement • Finding a crowdsourcing platform – Usefulness of the platform UI » Progress status (In progress, Submitted, Waiting for reviewing... etc.) » Statistics: The number of workers and reviewers, Average work/review duration... » Demographics, Worker history by individuals & in total... 28
  • 30. Low-resource perspective • Creating a hate speech corpus from scratch  Step 4: How is the annotation process conducted and evaluated? Worker pilot, crowdsourcing, and agreement • Pilot tagging (by workers) – Goal of worker pilot » Guideline update in workers’ view (especially on cautions & exceptions) » Worker selection – Procedure » Advertisement or recruitment » Worker tagging » Researchers’ (or moderators’) review & rejection » Workers’ revise & resubmit 29
  • 31. Low-resource perspective • Creating a hate speech corpus from scratch  Step 4: How is the annotation process conducted and evaluated? Worker pilot, crowdsourcing, and agreement • Details on worker selection process? – Human checking » Ethical standard not too far from the guideline? » Is feedback effective for the rejected samples? – Automatic checking » Enough taggings done? » Too frequent cases of skipping the annotation? 30 UI screenshots provided by Deep Natural AI.
  • 32. Low-resource perspective • Creating a hate speech corpus from scratch  Step 4: How is the annotation process conducted and evaluated? Worker pilot, crowdsourcing, and agreement • Crowdsourcing: A simplified version is required for crowd annotation! – Multi-class, multi-attribute tagging » 3 classes for bias » 3 classes for toxicity – Given a comment (without context), the annotator should tag each attribute – Detailed guideline (with examples, cautions, and exceptions) is provided separately 31 1. What kind of bias does the comment contain? - Gender bias, Other biases, or None 2. Which is the adequate category for the comment in terms of toxicity? - Hate, Offensive, or None
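The simplified two-question scheme above can be sketched as a small record validator; the label strings and the record format are assumptions for illustration, not the actual schema of the annotation platform.

```python
# Simplified crowd annotation scheme: two attributes, three classes each.
# Label identifiers are illustrative, mirroring the slide's categories.
BIAS_LABELS = {"gender", "others", "none"}
TOXICITY_LABELS = {"hate", "offensive", "none"}

def validate_annotation(record: dict) -> bool:
    """Check that a worker's submission tags both attributes with valid labels."""
    return (
        record.get("bias") in BIAS_LABELS
        and record.get("toxicity") in TOXICITY_LABELS
    )

ok = validate_annotation({"bias": "gender", "toxicity": "offensive"})  # True
bad = validate_annotation({"bias": "political", "toxicity": "hate"})   # False
```

A check like this is the kind of automatic screening mentioned in the worker-selection step: malformed or skipped submissions can be rejected before human review.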
  • 33. Low-resource perspective • Creating a hate speech corpus from scratch  Step 4: How is the annotation process conducted and evaluated? Worker pilot, crowdsourcing, and agreement • Main annotation – Based on the final version of the guideline » 3~5 annotators (per sample) for usual classification tasks – Tagging done by selected workers » Worker selection and education » Short quiz (if workers are not selected) – Annotation toolkit » Assign samples randomly to workers, with multiple annotators per sample » Interface developed or provided by the platform (usually takes budget) » Open-source interfaces (e.g., Labelstudio) – Data check for further guarantee of quality » If sufficiently many annotators per sample? » If not...? 32
  • 34. Low-resource perspective • Creating a hate speech corpus from scratch  Step 4: How is the annotation process conducted and evaluated? Worker pilot, crowdsourcing, and agreement • Data selection after main annotation (8,000 samples) – Data reviewing strategy may differ by subtask – Researchers decide the final label after adjudication – Common for bias and toxicity » Cases where all three annotators’ labels differ – Only for toxicity » Since the problem regards a continuum of degree, cases labeled with only hate (o) and none (x) need to be investigated again – Samples with no decision (no majority vote possible) are discarded 33
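The adjudication step above can be sketched as a simple majority vote, with undecidable samples handed back to the researchers; the label strings are illustrative.

```python
from collections import Counter

def adjudicate(labels):
    """Majority vote over one sample's annotations; None if no majority.

    Samples returning None go to researcher adjudication, and are discarded
    if no decision can be reached.
    """
    counts = Counter(labels)
    label, freq = counts.most_common(1)[0]
    return label if freq > len(labels) / 2 else None

annotations = [
    ["hate", "hate", "offensive"],   # majority -> "hate"
    ["hate", "offensive", "none"],   # all three differ -> adjudication needed
    ["hate", "none", "hate"],        # hate vs. none: degree gap, worth re-checking
]
decisions = [adjudicate(a) for a in annotations]
# decisions == ["hate", None, "hate"]
```

Note that the third case resolves to "hate" by majority vote even though the hate-versus-none split is exactly the kind of degree disagreement the slide says should be investigated again before the label is accepted.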
  • 35. Low-resource perspective • Creating a hate speech corpus from scratch  Step 4: How is the annotation process conducted and evaluated? Worker pilot, crowdsourcing, and agreement • Final decision – Test: 974 » Data tagged while constructing the guideline (Mostly adjust to the intention of the guideline) – Validation: 471 » Data which went through tag/review/reject and accept in the pilot phase, done with a large number of annotators (Roughly aligned with the guideline) – Train: 7,896 » Data which were crowd-sourced with the selected workers, not reviewed totally but went through adjudication only for some special cases • Agreement – 0.492 for bias detection, 0.496 for toxicity measurement 34 Moon et al., BEEP! Korean Corpus of Online News Comments for Toxic Speech Detection, 2020.
  • 36. Low-resource perspective • Creating a hate speech corpus from scratch  Step 4: How is the annotation process conducted and evaluated? Beyond creation - Model training and deployment • Model training – Traditionally » High performance – relatively easy? » Low performance – relatively challenging? – But in PLM-based training these days... » Pretraining corpora » Model size » Model architecture – Model deployment » Performance & size » User feedbacks 35 Yang, Transformer-based Korean Pretrained Language Models: A Survey on Three Years of Progress, 2021.
  • 37. Challenges • Challenges of hate speech corpus construction  Context-dependency • News comments – articles • Tweets – threads • Web community comments – posts  Multi-modal or noisy inputs • Image and audio – Kiela et al. (2020) - Hateful memes challenge • Perturbed texts – Cho and Kim (2021) - Leetspeak - Yaminjeongeum 36 Kiela et al., The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes, 2020. Cho and Kim, Google-trickers, Yaminjeongeum, and Leetspeak: An Empirical Taxonomy for Intentionally Noisy User-Generated Text, 2021.
Challenges
• Challenges of hate speech corpus construction
 Categorical or binary output has limitations
• Limitations of categorizing the degree of intensity
– Hate/offensive/none categorization is sub-optimal
– Poletto et al. (2019): scale-based annotation with an Unbalanced Rating Scale
» Used to determine the label (or used as a target score?)
37
Poletto et al., Annotating Hate Speech: Three Schemes at Comparison, 2019.
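One way scale-based annotations can be turned into a label is by averaging annotator scores and thresholding. The scale range (0-4) and cutoff below are hypothetical, not the actual Unbalanced Rating Scale of Poletto et al. (2019); the point is only that the mean can serve either as a label source or as a regression target.

```python
def label_from_scale(scores, threshold=2.0):
    """Average annotator intensity scores (hypothetical 0-4 scale)
    and threshold into a binary label; the mean itself can instead
    be kept as a continuous target score."""
    mean = sum(scores) / len(scores)
    return ("hate" if mean >= threshold else "none", mean)

print(label_from_scale([3, 4, 2]))  # -> ('hate', 3.0)
print(label_from_scale([0, 1, 0])[0])  # -> 'none'
```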
Challenges
• Challenges of hate speech corpus construction
 Annotation requires multiple labels
• Aspects of discrimination may differ by attribute
– Gender, race, nationality, ageism ...
• Tagging 'all the target attributes' that appear?
– Kang et al. (2022)
» Detailed guideline with terms and concepts defined for each attribute

     Women & family  Male  Sexual minorities  Race & nationality  Ageism  Regionalism  Religion  Other  Malicious  None
S1         1          0           0                   0              1         0           0       0        0       0
S2         0          0           0                   0              0         0           0       0        1       0
S3         0          0           0                   1              0         0           1       0        0       0
S4         0          0           0                   0              0         0           0       0        0       1

38
Kang et al., Korean Online Hate Speech Dataset for Multilabel Classification - How Can Social Science Improve Dataset on Hate Speech?, 2022.
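The multilabel scheme on this slide amounts to a fixed-order binary vector per sample. A minimal sketch of that encoding, using attribute names taken from the slide (the snake_case identifiers are my own):

```python
# Fixed attribute order following the slide's table columns.
ATTRIBUTES = ["women_family", "male", "sexual_minorities", "race_nationality",
              "ageism", "regionalism", "religion", "other", "malicious", "none"]

def encode(active):
    """Multilabel target vector: 1 for each attribute a sample targets."""
    return [1 if a in active else 0 for a in ATTRIBUTES]

# S1 from the slide targets both women & family and ageism.
print(encode({"women_family", "ageism"}))  # -> [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
```

Unlike a single categorical label, this representation lets one comment carry several attributes at once, which is exactly the situation the slide's S1 row illustrates.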
Challenges
• Challenges of hate speech corpus construction
 Privacy and license issues
• Privacy and licenses can be violated by text crawling
• A hate speech corpus may contain personal information on (public) figures
• Text could have been brought from elsewhere (copy & paste)
 How about creating hate (and non-hate) speech from scratch?
• Yang et al. (2022): recruit workers and enable 'anonymous' text generation!
39
Yang et al., APEACH: Attacking Pejorative Expressions with Analysis on Crowd-Generated Hate Speech Evaluation Datasets, 2022.
Challenges
• Ambiguity is inevitable
 Text may allow various interpretations
• Text involves omission or replacement to trick monitoring
• Intention is apparent only considering the context
• Temporal diachronicity of hate speech
 Non-hate speech in the past can be interpreted as hate speech today
 Diachronicity may limit the utility of prediction systems
• e.g., [a name of a celebrity who committed a crime] before 20xx / after 20xx
• Boundary between hate speech and freedom of speech
 A grey area that cannot be fully resolved
• Some readers are offended by false positives
• Some users are offended by false negatives
40
Conclusion
• Hate speech is prevalent in real and cyber spaces
 Discussions on hate speech involve diverse viewpoints, from academia to society and industry – and these are reflected in dataset construction
• No corpus is built perfectly from the beginning
 ... and a hate speech corpus is one of the most difficult kinds to create
• Considerations in low-resource hate speech corpus construction
 Why? How? How much? How well?
• Still more challenges left
 Context, input noise, output format, indecisiveness ...
• Takeaways
 There is a discrepancy between the theoretical and practical definitions of hate speech, and their aims may differ
 There is no hate speech detection guideline that satisfies ALL, so let's find the boundary that satisfies the most and improve it
41
Reference
• Waseem and Hovy, Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter, 2016.
• Davidson et al., Automated Hate Speech Detection and the Problem of Offensive Language, 2017.
• Sanguinetti et al., An Italian Twitter Corpus of Hate Speech against Immigrants, 2018.
• Assimakopoulos et al., Annotating for Hate Speech: The MaNeCo Corpus and Some Input from Critical Discourse Analysis, 2020.
• Moon et al., BEEP! Korean Corpus of Online News Comments for Toxic Speech Detection, 2020.
• Hong et al., Study on the State and Regulation of Hate Speech, 2016.
• Cho and Moon, How Does the Hate Speech Corpus Concern Sociolinguistic Discussions? A Case Study on Korean Online News Comments, 2021.
• Pustejovsky and Stubbs, Natural Language Annotation, 2012.
• Yang, Transformer-based Korean Pretrained Language Models: A Survey on Three Years of Progress, 2021.
• Kiela et al., The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes, 2020.
• Cho and Kim, Google-trickers, Yaminjeongeum, and Leetspeak: An Empirical Taxonomy for Intentionally Noisy User-Generated Text, 2021.
• Poletto et al., Annotating Hate Speech: Three Schemes at Comparison, 2019.
• Kang et al., Korean Online Hate Speech Dataset for Multilabel Classification - How Can Social Science Improve Dataset on Hate Speech?, 2022.
• Yang et al., APEACH: Attacking Pejorative Expressions with Analysis on Crowd-Generated Hate Speech Evaluation Datasets, 2022.
42