Building a Dataset to Measure
Toxicity and Social Bias within Language:
A Low-Resource Perspective
Won Ik Cho (SNU ECE)
2022. 6. 22 @FAccT, Seoul, Korea
Introduction
• CHO, Won Ik (조원익)
▪ B.S. in EE/Mathematics (SNU, '10–'14)
▪ Ph.D. student (SNU ECE, '14–)
• Academic interests
▪ Built Korean NLP datasets across various spoken language understanding areas
▪ Currently interested in computational approaches to:
• Dialogue analysis
• AI for social good
Contents
• Introduction
• Hate speech in real and cyber spaces
▪ What is hate speech and why does it matter?
▪ Study on hate speech detection
• In English – dataset and analysis
• Notable approaches in other languages
• Low-resource perspective: creating a hate speech corpus from scratch
▪ Analysis of existing language resources
▪ Hate speech as bias detection and toxicity measurement
▪ Building a guideline for data annotation
▪ Worker pilot, crowdsourcing, and agreement
• Challenges of hate speech corpus construction
• Conclusion
Contents
Caution! This presentation may contain content that can be offensive to certain groups of people, such as gender bias, racism, or other unethical content, including multimodal materials
Contents
• Covered in this tutorial
▪ How to build a hate speech detection dataset in a specific setting (language, text domain, etc.)
▪ How to check the validity of the created hate speech corpus
• Less covered in this tutorial
▪ Comprehensive definitions of hate speech and social bias in the literature
▪ Reliability of specific ethical guidelines for hate speech corpus construction
Hate speech in real and cyber spaces
• What is hate speech and why does it matter?
▪ Difficulty of defining hate speech
• A political and legal term, not just a theoretical one
• Has no unified, universally accepted definition
• The definition differs by language, culture, domain, discipline, etc.
▪ Definition given by the United Nations
• "Any kind of communication in speech, writing or behaviour, that attacks or uses pejorative or discriminatory language with reference to a person or a group on the basis of who they are, in other words, based on their religion, ethnicity, nationality, race, colour, descent, gender or other identity factor."
– Not a legal definition
– Broader than the notion of "incitement to discrimination, hostility or violence" prohibited under international human rights law
https://www.un.org/en/hate-speech/understanding-hate-speech/what-is-hate-speech?
Hate speech in real and cyber spaces
• What is hate speech and why does it matter?
▪ Hate speech in cyber spaces
• The definition is deductive, but its detection is inductive
• Hate speech appears online in various forms, including:
– Offensive language
– Pejorative expressions
– Discriminatory words
– Profanity
– Insults ... etc.
• Whether to include specific terms or expressions in the category of 'hate speech' is a tricky issue
– What if a pejorative expression or profanity does not target any group or individual?
– What if (sexual) harassment is considered offensive by readers but not by the target figure?
Hate speech in real and cyber spaces
• Discussion on hate speech detection
▪ Studies for English
• Waseem and Hovy (2016)
– Annotates tweets according to 11 criteria that make a post offensive

A tweet is offensive if it
1. uses a sexist or racial slur.
2. attacks a minority.
3. seeks to silence a minority.
4. criticizes a minority (without a well founded argument).
5. promotes, but does not directly use, hate speech or violent crime.
6. criticizes a minority and uses a straw man argument.
7. blatantly misrepresents truth or seeks to distort views on a minority with unfounded claims.
8. shows support of problematic hash tags. E.g. "#BanIslam", "#whoriental", "#whitegenocide"
9. negatively stereotypes a minority.
10. defends xenophobia or sexism.
11. contains a screen name that is offensive, as per the previous criteria, the tweet is ambiguous (at best), and the tweet is on a topic that satisfies any of the above criteria

Waseem and Hovy, Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter, 2016.
Hate speech in real and cyber spaces
• Discussion on hate speech detection
▪ Studies for English
• Davidson et al. (2017)
– Notes the discrepancy between the theoretical definition and real-world expressions of hate speech
– Puts 'offensive' expressions between 'hate' and 'non-hate', to cover expressions in the grey area
– Incorporates profanity prevalent in social media, which does not necessarily target minorities but induces offensiveness

Davidson et al., Automated Hate Speech Detection and the Problem of Offensive Language, 2017.
Hate speech in real and cyber spaces
• Discussion on hate speech detection
▪ Notable approaches in other languages
• Sanguinetti et al. (2018)
– Investigates hate speech in posts about immigrants in Italy
– Beyond hate speech, tags whether a post is offensive, aggressive, or intense, contains irony or sarcasm, and shows stereotyping
– 'Stereotype' as a factor that can be a clue to discrimination

• hate speech: no – yes
• aggressiveness: no – weak – strong
• offensiveness: no – weak – strong
• irony: no – yes
• stereotype: no – yes
• intensity: 0 – 1 – 2 – 3 – 4

Sanguinetti et al., An Italian Twitter Corpus of Hate Speech against Immigrants, 2018.
Hate speech in real and cyber spaces
• Discussion on hate speech detection
▪ Notable approaches in other languages
• Assimakopoulos et al. (2020)
– Motivated by critical analysis of posts made in reaction to news reports on the Mediterranean migration crisis and LGBTIQ+ matters in Malta
– Annotates Maltese web texts
– Investigates the attitude (positive/negative) of the text, asks for the target if negative, and also asks how the negativity is conveyed

1. Does the post communicate a positive, negative or neutral attitude? [Positive / Negative / Neutral]
2. If negative, who does this attitude target? [Individual / Group]
• (a) If it targets an individual, does it do so because of the individual's affiliation to a group? [Yes / No] If yes, name the group.
• (b) If it targets a group, name the group.
3. How is the attitude expressed in relation to the target group? Select all that apply.
[Derogatory term / Generalisation / Insult / Sarcasm (including jokes and trolling) / Stereotyping / Suggestion / Threat]
4. If the post involves a suggestion, is it a suggestion that calls for violence against the target group? [Yes / No]

Assimakopoulos et al., Annotating for Hate Speech: The MaNeCo Corpus and Some Input from Critical Discourse Analysis, 2020.
Hate speech in real and cyber spaces
• Discussion on hate speech detection
▪ Notable approaches in other languages
• Moon et al. (2020)
– Annotation of Korean celebrity news comments
– Investigates the existence of social bias and the degree of toxicity
» Social bias – gender-related bias and other biases
» Toxicity – Hate/Offensive/None (following Davidson et al., 2017)

Moon et al., BEEP! Korean Corpus of Online News Comments for Toxic Speech Detection, 2020.

Detecting social bias
• Is there a gender-related bias, either explicit or implicit, in the text?
• Are there any other kinds of bias in the text?
• A comment that does not incorporate any bias

Measuring toxicity
• Does a comment display strong hate or insult towards the article's target or related figures, writers of the article or comments, etc.?
• Although a comment is not as hateful or insulting as the above, does it make the target or the reader feel offended?
• A comment that does not incorporate any hatred or insult
Low-resource perspective
• Creating a hate speech corpus from scratch
▪ ASSUMPTION: There is no manually created hate speech detection corpus for the Korean language (true before July 2020...)
• Generally, clear motivation is required for hate speech corpus construction
– Why?
» It takes resources (time and money)
» Potential mental harm to annotators
» Potential attacks on the researchers
– Nonetheless, it is required in some circumstances
» Detecting offensive language in services
» When severe harm has been displayed publicly
Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 1: Is there anything available? – Analysis of existing language resources
• Language resources on hate speech detection relate to various other similar datasets (though slightly different in definition and goal)
– Dictionaries of profanity terms (e.g., hatebase.org)
– Sarcasm detection datasets
– Sentiment analysis datasets
– Offensive language detection datasets
• Why should we search existing resources?
– To lessen the consumption of time and money
– To make the problem easier by building upon existing datasets
– To confirm what we should aim for by creating a new dataset
Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 1: Is there anything available? – Analysis of existing language resources
• Dictionary of profanity terms
– e.g., https://github.com/doublems/korean-bad-words
• Sarcasm detection dataset
– e.g., https://github.com/SpellOnYou/korean-sarcasm
• Sentiment analysis dataset
– e.g., https://github.com/e9t/nsmc
▪ These datasets may not completely overlap with a hate speech corpus, but at least they can be a good source for annotation :)
• Here, one should think of:
– Text style
– Text domain
– Appearing types of toxicity and bias
Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 1: Is there anything available? – Analysis of existing language resources
• Text style
– Written/spoken/web text?
• Text domain
– News/wiki/tweets/chat/comments?
• Appearing types of toxicity and bias
– Gender-related?
– Politics/religion?
– Region/nationality/ethnicity?
• Appearing amount of toxicity and bias
Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 1: Is there anything available? – Analysis of existing language resources
• Data collection example (BEEP!)
– Comments from the most popular Korean entertainment news platform
» Jan. 2018 – Feb. 2020
» 10,403,368 comments from 23,700 articles
» 1,580 articles acquired by stratified sampling
» Top 20 comments per article, ordered by the Wilson score of the downvotes
– Filter duplicates and keep comments with more than a single token and fewer than 100 characters
– 10K comments were selected
• The data sampling process greatly affects the final distribution of the dataset!
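The Wilson-score ordering above can be sketched as follows. This is the standard lower bound of the Wilson interval at a 95% confidence level; the exact variant used for BEEP! is an assumption here, and the function name is illustrative:

```python
import math

def wilson_lower_bound(downvotes: int, total_votes: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for the downvote proportion.

    Ranking comments by this score surfaces reliably disliked (and thus
    likely toxic) comments first, while discounting tiny vote counts.
    """
    if total_votes == 0:
        return 0.0
    p = downvotes / total_votes
    denom = 1 + z * z / total_votes
    centre = p + z * z / (2 * total_votes)
    margin = z * math.sqrt(p * (1 - p) / total_votes + z * z / (4 * total_votes ** 2))
    return (centre - margin) / denom
```

Note that a comment with 90 downvotes out of 100 votes scores higher than one with 9 out of 10: the same ratio is trusted more with more evidence, which is the point of using the interval bound instead of the raw proportion.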
Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 2: What should we define first? – Hate speech as bias detection and toxicity measurement
• Local definition of hate speech discussed by the Korean sociolinguistics community
– Definition of hate speech
» Expressions that discriminate against/hate, or incite discrimination/hatred/violence towards, some individual or group of people because they have characteristics of a social minority
– Types of hate speech
» Discriminative bullying
» Discrimination
» Public insult/threatening
» Inciting hatred

Hong et al., Study on the State and Regulation of Hate Speech, 2016.
Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 2: What should we define first? – Hate speech as bias detection and toxicity measurement
• Set up criteria
– Analyze 'discriminate/hate or incite discrimination/hatred/violence' as a combination of 'social bias' and 'toxicity'
– Further discussion required on social minorities
» 'Gender, age, profession, religion, nationality, skin color, political stance' and all other factors that comprise one's identity
» Criteria for a social minority vs. who will be acknowledged as a social minority
Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 2: What should we define first? – Hate speech as bias detection and toxicity measurement
• Set up criteria for bias detection
– 'People with a specific characteristic may behave in some way'
– Differs from a value judgment
» Gender-related bias
» Other biases
» None

Cho and Moon, How Does the Hate Speech Corpus Concern Sociolinguistic Discussions? A Case Study on Korean Online News Comments, 2021.
Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 2: What should we define first? – Hate speech as bias detection and toxicity measurement
• Set up criteria for toxicity measurement
– Hate
» Hostility towards a specific group or individual
» Can be represented by profanity terms, but such terms do not necessarily imply hate
– Insult
» Expressions that can harm the prestige of individuals or groups
» Various profanity terms are included
– Offensive expressions
» Do not count as hate or insult, but may offend readers
» Includes sarcasm, irony, bad guessing, and unethical expressions
Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 2: What should we define first? – Hate speech as bias detection and toxicity measurement
• Set up criteria for toxicity measurement
» Severe hate or insult
» Not hateful but offensive or sarcastic
» None

Cho and Moon, How Does the Hate Speech Corpus Concern Sociolinguistic Discussions? A Case Study on Korean Online News Comments, 2021.
Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 3: What is required for the annotation? – Building a guideline for data annotation
• Stakeholders
– Researchers
– Moderators (crowdsourcing platform)
– Workers
• How is the guideline used?
– Setting up the research direction (for researchers)
– Task understanding (for moderators)
– Data annotation (for workers)
Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 3: What is required for the annotation? – Building a guideline for data annotation
• The guideline is not built in one pass!
– Usual process
» Drafting a guideline based on the source corpus
» Pilot study by researchers & guideline update (iterated N times)
» Moderators' and researchers' alignment on the guideline
» Worker recruitment & pilot tagging
» Guideline update with worker feedback (cautions & exceptions)
» Final guideline (for the main annotation)
Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 3: What is required for the annotation? – Building a guideline for data annotation
• Draft guideline
– Built upon a small portion of the source corpus (hundreds of instances)
– Researchers' intuition is heavily involved
– Concept-based description
» e.g., for 'bias', 'People with a specific characteristic may behave in some way' (instead of listing all stereotyped expressions)
• Pilot study
– Researchers tag a slightly larger portion of the source corpus (~1K instances)
– Fitting researchers' intuition to the proposed concepts
» e.g., "Does this expression contain bias or toxicity?" (discussion is important, but don't fight!)
– Update descriptions or add examples
– Labeling, re-labeling, re-re-labeling...
Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 3: What is required for the annotation? – Building a guideline for data annotation
• Pilot study
– Labeling, re-labeling, re-re-labeling... + Agreement?
– Inter-annotator agreement (IAA)
» Quantifies the reliability of the annotation
» Cohen's kappa for two annotators
» Fleiss' kappa for more than two annotators
– Sufficiently high agreement? (> 0.6?)
» Let's go annotating in the wild!

Pustejovsky and Stubbs, Natural Language Annotation, 2012.
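The IAA computation above can be sketched as a plain implementation of Fleiss' kappa for a fixed number of raters per item; the toy labels below are illustrative, not the corpus' actual tags:

```python
from collections import Counter

def fleiss_kappa(ratings: list[list[str]]) -> float:
    """Fleiss' kappa for N items, each labeled by the same number of raters.

    `ratings` is a list of per-item label lists, e.g. three annotators'
    toxicity tags for each comment.
    """
    n = len(ratings[0])  # raters per item
    assert all(len(item) == n for item in ratings)
    # Mean per-item observed agreement
    p_bar = sum(
        (sum(c * c for c in Counter(item).values()) - n) / (n * (n - 1))
        for item in ratings
    ) / len(ratings)
    # Chance agreement from the marginal label proportions
    totals = Counter(label for item in ratings for label in item)
    p_e = sum((count / (len(ratings) * n)) ** 2 for count in totals.values())
    return (p_bar - p_e) / (1 - p_e)

ratings = [
    ["hate", "hate", "hate"],
    ["none", "none", "offensive"],
    ["offensive", "offensive", "offensive"],
    ["none", "none", "none"],
]
print(round(fleiss_kappa(ratings), 3))  # → 0.745
```

Cohen's kappa is the two-rater analogue with the same (observed − chance) / (1 − chance) shape, differing only in how chance agreement is estimated.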
Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 4: How is the annotation process conducted and evaluated? – Worker pilot, crowdsourcing, and agreement
• Finding a crowdsourcing platform
– Moderator
» Usually an expert in data creation and management
» Comprehends the task and gives feedback from the workers' point of view
» Facilitates communication between researchers and workers
» Instructs, and sometimes hurries, workers to meet the timeline
» Manages financial or legal issues
» Lets researchers concentrate on the task itself
– Without a moderator?
» The researchers are the moderator! (unless the platform provides some automated functions)
– With a moderator?
» The closest partner of the researchers
Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 4: How is the annotation process conducted and evaluated? – Worker pilot, crowdsourcing, and agreement
• Finding a crowdsourcing platform
– Existence and experience of the moderator
» Experience with similar dataset construction
» Comprehension of the task & proper feedback
» Sufficient worker pool
» Trust between the moderator and workers
– Reasonable cost estimation
» Appropriateness of the price per tagging or review
» Appropriateness of worker compensation
» Fit with the budget
Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 4: How is the annotation process conducted and evaluated? – Worker pilot, crowdsourcing, and agreement
• Finding a crowdsourcing platform
– Usefulness of the platform UI
» Progress status (in progress, submitted, waiting for review... etc.)
» Statistics: number of workers and reviewers, average work/review duration...
» Demographics, worker history by individual & in total...
Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 4: How is the annotation process conducted and evaluated? – Worker pilot, crowdsourcing, and agreement
• Pilot tagging (by workers)
– Goals of the worker pilot
» Updating the guideline from the workers' point of view (especially cautions & exceptions)
» Worker selection
– Procedure
» Advertisement or recruitment
» Worker tagging
» Researchers' (or moderators') review & rejection
» Workers' revise & resubmit
Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 4: How is the annotation process conducted and evaluated? – Worker pilot, crowdsourcing, and agreement
• Details of the worker selection process?
– Human checking
» Is the worker's ethical standard not too far from the guideline?
» Is feedback effective for the rejected samples?
– Automatic checking
» Enough taggings done?
» Too many cases of skipping the annotation?

UI screenshots provided by Deep Natural AI.
Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 4: How is the annotation process conducted and evaluated? – Worker pilot, crowdsourcing, and agreement
• Crowdsourcing: a simplified scheme is required for crowd annotation!
– Multi-class, multi-attribute tagging
» 3 classes for bias
» 3 classes for toxicity
– Given a comment (without context), the annotator tags each attribute
– A detailed guideline (with examples, cautions, and exceptions) is provided separately

1. What kind of bias does the comment contain?
- Gender bias, Other biases, or None
2. Which is the adequate category for the comment in terms of toxicity?
- Hate, Offensive, or None
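A minimal sketch of the two-question, multi-attribute scheme above as a submission check; the label strings and field names are assumptions for illustration, not the platform's actual identifiers:

```python
# Hypothetical label sets mirroring the two crowdsourcing questions.
BIAS_LABELS = {"gender", "others", "none"}
TOXICITY_LABELS = {"hate", "offensive", "none"}

def is_valid_annotation(record: dict) -> bool:
    # Each comment must receive exactly one answer per attribute.
    return (record.get("bias") in BIAS_LABELS
            and record.get("toxicity") in TOXICITY_LABELS)
```

Such a check can run automatically at submission time, so reviewers only see records that at least answer both questions with legal labels.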
Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 4: How is the annotation process conducted and evaluated? – Worker pilot, crowdsourcing, and agreement
• Main annotation
– Based on the final version of the guideline
» 3–5 annotators (per sample) for usual classification tasks
– Tagging done by selected workers
» Worker selection and education
» A short quiz (if workers are not pre-selected)
– Annotation toolkit
» Assign samples randomly to workers, with multiple annotators per sample
» An interface developed or provided by the platform (usually takes budget)
» Open-source interfaces (e.g., Label Studio)
– Data checks for a further guarantee of quality
» What if there are sufficiently many annotators per sample?
» And if not...?
Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 4: How is the annotation process conducted and evaluated? – Worker pilot, crowdsourcing, and agreement
• Data selection after the main annotation (8,000 samples)
– The data reviewing strategy may differ by subtask
– Researchers decide the final label after adjudication
– Common to bias and toxicity
» Cases where all three annotators differ
– Only for toxicity
» Since the problem regards a continuum of degree, cases with both hate (o) and none (x) need to be investigated again
– Failure to decide (no majority vote possible) – discarded
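One reading of the adjudication rules above, as a sketch; the label strings are illustrative, and flagging any co-occurrence of 'hate' and 'none' votes for review is an assumption about the rule's exact scope:

```python
from collections import Counter
from typing import Optional

def adjudicate(labels: list[str]) -> tuple[Optional[str], bool]:
    """Return (final_label, needs_review) for one sample's toxicity votes.

    final_label is None when no strict majority exists (sample discarded);
    needs_review flags samples where both extremes of the degree continuum
    ('hate' and 'none') were chosen, which researchers re-check manually.
    """
    counts = Counter(labels)
    label, freq = counts.most_common(1)[0]
    majority = label if freq > len(labels) / 2 else None
    needs_review = "hate" in counts and "none" in counts
    return majority, needs_review

print(adjudicate(["hate", "offensive", "none"]))  # → (None, True): discard
```

A three-way split thus yields no label at all, while a hate/hate/none split keeps the majority label but is still routed to human review.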
Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 4: How is the annotation process conducted and evaluated? – Worker pilot, crowdsourcing, and agreement
• Final decision
– Test: 974
» Data tagged while constructing the guideline (mostly adjusted to the intention of the guideline)
– Validation: 471
» Data that went through tag/review/reject and accept in the pilot phase, done with a large number of annotators (roughly aligned with the guideline)
– Train: 7,896
» Data crowdsourced with the selected workers, not fully reviewed but adjudicated only for some special cases
• Agreement
– 0.492 for bias detection, 0.496 for toxicity measurement

Moon et al., BEEP! Korean Corpus of Online News Comments for Toxic Speech Detection, 2020.
Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 4: How is the annotation process conducted and evaluated? – Beyond creation: model training and deployment
• Model training
– Traditionally
» High performance – relatively easy?
» Low performance – relatively challenging?
– But in PLM-based training these days...
» Pretraining corpora
» Model size
» Model architecture
– Model deployment
» Performance & size
» User feedback

Yang, Transformer-based Korean Pretrained Language Models: A Survey on Three Years of Progress, 2021.
Challenges
• Challenges of hate speech corpus construction
▪ Context-dependency
• News comments – articles
• Tweets – threads
• Web community comments – posts
▪ Multimodal or noisy inputs
• Image and audio
– Kiela et al. (2020) – the Hateful Memes Challenge
• Perturbed texts
– Cho and Kim (2021) – leetspeak, yaminjeongeum

Kiela et al., The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes, 2020.
Cho and Kim, Google-trickers, Yaminjeongeum, and Leetspeak: An Empirical Taxonomy for Intentionally Noisy User-Generated Text, 2021.
Challenges
• Challenges of hate speech corpus construction
▪ Categorical or binary output has limitations
• Limitations of categorizing the degree of intensity
– Hate/offensive/none categorization is sub-optimal
– Poletto et al. (2019): scale-based annotation with an unbalanced rating scale
» Used to determine the label (or used as a target score?)

Poletto et al., Annotating Hate Speech: Three Schemes at Comparison, 2019.
Challenges
• Challenges of hate speech corpus construction
▪ Annotation requires multiple labels
• The aspect of discrimination may differ by attribute
– Gender, race, nationality, ageism ...
• Tagging 'all the target attributes' that appear?
– Kang et al. (2022)
» A detailed guideline with terms and concepts defined for each attribute

|    | Women & family | Male | Sexual minorities | Race & nationality | Ageism | Regionalism | Religion | Other | Malicious | None |
|----|----------------|------|-------------------|--------------------|--------|-------------|----------|-------|-----------|------|
| S1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| S2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| S3 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 |
| S4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |

Kang et al., Korean Online Hate Speech Dataset for Multilabel Classification – How Can Social Science Improve Dataset on Hate Speech?, 2022.
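The table above amounts to a multi-hot encoding per sentence. A minimal sketch (the snake_case attribute keys are illustrative, not the dataset's actual field names):

```python
# Column order follows the Kang et al. (2022) example table above.
ATTRIBUTES = [
    "women_family", "male", "sexual_minorities", "race_nationality",
    "ageism", "regionalism", "religion", "other", "malicious", "none",
]

def to_multihot(tags: set[str]) -> list[int]:
    # A sentence may target several attributes at once, hence multilabel
    # classification rather than a single mutually exclusive class.
    return [int(attr in tags) for attr in ATTRIBUTES]

print(to_multihot({"women_family", "ageism"}))  # S1 → [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
```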
Challenges
• Challenges of hate speech corpus construction
▪ Privacy and license issues
• Privacy and licenses can be violated by text crawling
• A hate speech corpus may contain personal information on (public) figures
• Text could have been brought from elsewhere (copy & paste)
▪ How about creating hate (and non-hate) speech from scratch?
• Yang et al. (2022): recruit workers and enable 'anonymous' text generation!

Yang et al., APEACH: Attacking Pejorative Expressions with Analysis on Crowd-Generated Hate Speech Evaluation Datasets, 2022.
Challenges
• Ambiguity is inevitable
▪ A text may admit various interpretations
• Text accompanies omission or replacement to trick the monitoring
• The intention is apparent only when considering the context
• Temporal diachronicity of hate speech
▪ Non-hate speech in the past can be interpreted as hate speech these days
▪ Diachronicity may undermine the utility of prediction systems
• e.g., [the name of a celebrity who committed a crime] before 20xx / after 20xx
• Boundary between hate speech and freedom of speech
▪ A grey area that cannot be fully resolved
• Some readers are offended by false positives
• Some users are offended by false negatives
Conclusion
• Hate speech is prevalent in real and cyber spaces
▪ Discussions on hate speech involve diverse viewpoints, from academia to society and industry – and they are reflected in dataset construction
• No corpus is built perfectly from the beginning
▪ ... and a hate speech corpus is one of the most difficult kinds to create
• Considerations in low-resource hate speech corpus construction
▪ Why? How? How much? How well?
• Still more challenges left
▪ Context, input noise, output format, indecisiveness ...
• Takeaways
▪ There is a discrepancy between the theoretical and practical definitions of hate speech, and their aims may differ
▪ There is no hate speech detection guideline that satisfies ALL, so let's find the boundary that satisfies the most and improve it
Reference
• Waseem and Hovy, Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter, 2016.
• Davidson et al., Automated Hate Speech Detection and the Problem of Offensive Language, 2017.
• Sanguinetti et al., An Italian Twitter Corpus of Hate Speech against Immigrants, 2018.
• Assimakopoulos et al., Annotating for Hate Speech: The MaNeCo Corpus and Some Input from Critical Discourse Analysis, 2020.
• Moon et al., BEEP! Korean Corpus of Online News Comments for Toxic Speech Detection, 2020.
• Hong et al., Study on the State and Regulation of Hate Speech, 2016.
• Cho and Moon, How Does the Hate Speech Corpus Concern Sociolinguistic Discussions? A Case Study on Korean Online News Comments, 2021.
• Pustejovsky and Stubbs, Natural Language Annotation, 2012.
• Yang, Transformer-based Korean Pretrained Language Models: A Survey on Three Years of Progress, 2021.
• Kiela et al., The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes, 2020.
• Cho and Kim, Google-trickers, Yaminjeongeum, and Leetspeak: An Empirical Taxonomy for Intentionally Noisy User-Generated Text, 2021.
• Poletto et al., Annotating Hate Speech: Three Schemes at Comparison, 2019.
• Kang et al., Korean Online Hate Speech Dataset for Multilabel Classification – How Can Social Science Improve Dataset on Hate Speech?, 2022.
• Yang et al., APEACH: Attacking Pejorative Expressions with Analysis on Crowd-Generated Hate Speech Evaluation Datasets, 2022.
Thank you!
Ā 
2108 [LangCon2021] kosp2e
2108 [LangCon2021] kosp2e2108 [LangCon2021] kosp2e
2108 [LangCon2021] kosp2eWarNik Chow
Ā 
2106 PRSLLS
2106 PRSLLS2106 PRSLLS
2106 PRSLLSWarNik Chow
Ā 
2106 ACM DIS
2106 ACM DIS2106 ACM DIS
2106 ACM DISWarNik Chow
Ā 
2104 Talk @SSU
2104 Talk @SSU2104 Talk @SSU
2104 Talk @SSUWarNik Chow
Ā 
2103 ACM FAccT
2103 ACM FAccT2103 ACM FAccT
2103 ACM FAccTWarNik Chow
Ā 
2102 Redone seminar
2102 Redone seminar2102 Redone seminar
2102 Redone seminarWarNik Chow
Ā 
2011 NLP-OSS
2011 NLP-OSS2011 NLP-OSS
2011 NLP-OSSWarNik Chow
Ā 
2010 INTERSPEECH
2010 INTERSPEECH 2010 INTERSPEECH
2010 INTERSPEECH WarNik Chow
Ā 
2010 PACLIC - pay attention to categories
2010 PACLIC - pay attention to categories2010 PACLIC - pay attention to categories
2010 PACLIC - pay attention to categoriesWarNik Chow
Ā 
2010 HCLT Hate Speech
2010 HCLT Hate Speech2010 HCLT Hate Speech
2010 HCLT Hate SpeechWarNik Chow
Ā 
2009 DevC Seongnam - NLP
2009 DevC Seongnam - NLP2009 DevC Seongnam - NLP
2009 DevC Seongnam - NLPWarNik Chow
Ā 
2008 [lang con2020] act!
2008 [lang con2020] act!2008 [lang con2020] act!
2008 [lang con2020] act!WarNik Chow
Ā 

More from WarNik Chow (20)

2312 PACLIC
2312 PACLIC2312 PACLIC
2312 PACLIC
Ā 
2311 EAAMO
2311 EAAMO2311 EAAMO
2311 EAAMO
Ā 
2211 HCOMP
2211 HCOMP2211 HCOMP
2211 HCOMP
Ā 
2211 APSIPA
2211 APSIPA2211 APSIPA
2211 APSIPA
Ā 
2211 AACL
2211 AACL2211 AACL
2211 AACL
Ā 
2210 CODI
2210 CODI2210 CODI
2210 CODI
Ā 
2206 Modupop!
2206 Modupop!2206 Modupop!
2206 Modupop!
Ā 
2204 Kakao talk on Hate speech dataset
2204 Kakao talk on Hate speech dataset2204 Kakao talk on Hate speech dataset
2204 Kakao talk on Hate speech dataset
Ā 
2108 [LangCon2021] kosp2e
2108 [LangCon2021] kosp2e2108 [LangCon2021] kosp2e
2108 [LangCon2021] kosp2e
Ā 
2106 PRSLLS
2106 PRSLLS2106 PRSLLS
2106 PRSLLS
Ā 
2106 ACM DIS
2106 ACM DIS2106 ACM DIS
2106 ACM DIS
Ā 
2104 Talk @SSU
2104 Talk @SSU2104 Talk @SSU
2104 Talk @SSU
Ā 
2103 ACM FAccT
2103 ACM FAccT2103 ACM FAccT
2103 ACM FAccT
Ā 
2102 Redone seminar
2102 Redone seminar2102 Redone seminar
2102 Redone seminar
Ā 
2011 NLP-OSS
2011 NLP-OSS2011 NLP-OSS
2011 NLP-OSS
Ā 
2010 INTERSPEECH
2010 INTERSPEECH 2010 INTERSPEECH
2010 INTERSPEECH
Ā 
2010 PACLIC - pay attention to categories
2010 PACLIC - pay attention to categories2010 PACLIC - pay attention to categories
2010 PACLIC - pay attention to categories
Ā 
2010 HCLT Hate Speech
2010 HCLT Hate Speech2010 HCLT Hate Speech
2010 HCLT Hate Speech
Ā 
2009 DevC Seongnam - NLP
2009 DevC Seongnam - NLP2009 DevC Seongnam - NLP
2009 DevC Seongnam - NLP
Ā 
2008 [lang con2020] act!
2008 [lang con2020] act!2008 [lang con2020] act!
2008 [lang con2020] act!
Ā 

Recently uploaded

Film the city investagation powerpoint :)
Film the city investagation powerpoint :)Film the city investagation powerpoint :)
Film the city investagation powerpoint :)AshtonCains
Ā 
Night 7k Call Girls Atta Market Escorts Call Me: 8448380779
Night 7k Call Girls Atta Market Escorts Call Me: 8448380779Night 7k Call Girls Atta Market Escorts Call Me: 8448380779
Night 7k Call Girls Atta Market Escorts Call Me: 8448380779Delhi Call girls
Ā 
CASH PAYMENT ON GIRL HAND TO HAND HOUSEWIFE
CASH PAYMENT ON GIRL HAND TO HAND HOUSEWIFECASH PAYMENT ON GIRL HAND TO HAND HOUSEWIFE
CASH PAYMENT ON GIRL HAND TO HAND HOUSEWIFECall girl Jaipur
Ā 
This is a Powerpoint about research into the codes and conventions of a film ...
This is a Powerpoint about research into the codes and conventions of a film ...This is a Powerpoint about research into the codes and conventions of a film ...
This is a Powerpoint about research into the codes and conventions of a film ...samuelcoulson30
Ā 
Elite Class āž„8448380779ā–» Call Girls In New Friends Colony Delhi NCR
Elite Class āž„8448380779ā–» Call Girls In New Friends Colony Delhi NCRElite Class āž„8448380779ā–» Call Girls In New Friends Colony Delhi NCR
Elite Class āž„8448380779ā–» Call Girls In New Friends Colony Delhi NCRDelhi Call girls
Ā 
VIP Chandigarh Call Girls Service 7001035870 Enjoy Call Girls With Our Escorts
VIP Chandigarh Call Girls Service 7001035870 Enjoy Call Girls With Our EscortsVIP Chandigarh Call Girls Service 7001035870 Enjoy Call Girls With Our Escorts
VIP Chandigarh Call Girls Service 7001035870 Enjoy Call Girls With Our Escortssonatiwari757
Ā 
Ready to get noticed? Partner with Sociocosmos
Ready to get noticed? Partner with SociocosmosReady to get noticed? Partner with Sociocosmos
Ready to get noticed? Partner with SociocosmosSocioCosmos
Ā 
Enjoy Nightāš”Call Girls Palam Vihar Gurgaon >ą¼’8448380779 Escort Service
Enjoy Nightāš”Call Girls Palam Vihar Gurgaon >ą¼’8448380779 Escort ServiceEnjoy Nightāš”Call Girls Palam Vihar Gurgaon >ą¼’8448380779 Escort Service
Enjoy Nightāš”Call Girls Palam Vihar Gurgaon >ą¼’8448380779 Escort ServiceDelhi Call girls
Ā 
Craft Your Legacy: Invest in YouTube Presence from Sociocosmos"
Craft Your Legacy: Invest in YouTube Presence from Sociocosmos"Craft Your Legacy: Invest in YouTube Presence from Sociocosmos"
Craft Your Legacy: Invest in YouTube Presence from Sociocosmos"SocioCosmos
Ā 
Elite Class āž„8448380779ā–» Call Girls In Nizammuddin Delhi NCR
Elite Class āž„8448380779ā–» Call Girls In Nizammuddin Delhi NCRElite Class āž„8448380779ā–» Call Girls In Nizammuddin Delhi NCR
Elite Class āž„8448380779ā–» Call Girls In Nizammuddin Delhi NCRDelhi Call girls
Ā 
Pondicherry Call Girls Book Now 8617697112 Top Class Pondicherry Escort Servi...
Pondicherry Call Girls Book Now 8617697112 Top Class Pondicherry Escort Servi...Pondicherry Call Girls Book Now 8617697112 Top Class Pondicherry Escort Servi...
Pondicherry Call Girls Book Now 8617697112 Top Class Pondicherry Escort Servi...Nitya salvi
Ā 
O9654467111 Call Girls In Dwarka Women Seeking Men
O9654467111 Call Girls In Dwarka Women Seeking MenO9654467111 Call Girls In Dwarka Women Seeking Men
O9654467111 Call Girls In Dwarka Women Seeking MenSapana Sha
Ā 
SELECTING A SOCIAL MEDIA MARKETING COMPANY
SELECTING A SOCIAL MEDIA MARKETING COMPANYSELECTING A SOCIAL MEDIA MARKETING COMPANY
SELECTING A SOCIAL MEDIA MARKETING COMPANYdizinfo
Ā 
Lucknow šŸ’‹ Dating Call Girls Lucknow | Whatsapp No 8923113531 VIP Escorts Serv...
Lucknow šŸ’‹ Dating Call Girls Lucknow | Whatsapp No 8923113531 VIP Escorts Serv...Lucknow šŸ’‹ Dating Call Girls Lucknow | Whatsapp No 8923113531 VIP Escorts Serv...
Lucknow šŸ’‹ Dating Call Girls Lucknow | Whatsapp No 8923113531 VIP Escorts Serv...anilsa9823
Ā 
Improve Your Brand in Waco with a Professional Social Media Marketing Company
Improve Your Brand in Waco with a Professional Social Media Marketing CompanyImprove Your Brand in Waco with a Professional Social Media Marketing Company
Improve Your Brand in Waco with a Professional Social Media Marketing CompanyWSI INTERNET PARTNER
Ā 
Unlock the power of Instagram with SocioCosmos. Start your journey towards so...
Unlock the power of Instagram with SocioCosmos. Start your journey towards so...Unlock the power of Instagram with SocioCosmos. Start your journey towards so...
Unlock the power of Instagram with SocioCosmos. Start your journey towards so...SocioCosmos
Ā 
Film show pre-production powerpoint for site
Film show pre-production powerpoint for siteFilm show pre-production powerpoint for site
Film show pre-production powerpoint for siteAshtonCains
Ā 

Recently uploaded (20)

Russian Call Girls Rohini Sector 37 šŸ’“ Delhi 9999965857 @Sabina Modi VVIP MODE...
Russian Call Girls Rohini Sector 37 šŸ’“ Delhi 9999965857 @Sabina Modi VVIP MODE...Russian Call Girls Rohini Sector 37 šŸ’“ Delhi 9999965857 @Sabina Modi VVIP MODE...
Russian Call Girls Rohini Sector 37 šŸ’“ Delhi 9999965857 @Sabina Modi VVIP MODE...
Ā 
Film the city investagation powerpoint :)
Film the city investagation powerpoint :)Film the city investagation powerpoint :)
Film the city investagation powerpoint :)
Ā 
šŸ”9953056974 šŸ”Call Girls In Mehrauli Escort Service Delhi NCR
šŸ”9953056974 šŸ”Call Girls In Mehrauli  Escort Service Delhi NCRšŸ”9953056974 šŸ”Call Girls In Mehrauli  Escort Service Delhi NCR
šŸ”9953056974 šŸ”Call Girls In Mehrauli Escort Service Delhi NCR
Ā 
Night 7k Call Girls Atta Market Escorts Call Me: 8448380779
Night 7k Call Girls Atta Market Escorts Call Me: 8448380779Night 7k Call Girls Atta Market Escorts Call Me: 8448380779
Night 7k Call Girls Atta Market Escorts Call Me: 8448380779
Ā 
CASH PAYMENT ON GIRL HAND TO HAND HOUSEWIFE
CASH PAYMENT ON GIRL HAND TO HAND HOUSEWIFECASH PAYMENT ON GIRL HAND TO HAND HOUSEWIFE
CASH PAYMENT ON GIRL HAND TO HAND HOUSEWIFE
Ā 
This is a Powerpoint about research into the codes and conventions of a film ...
This is a Powerpoint about research into the codes and conventions of a film ...This is a Powerpoint about research into the codes and conventions of a film ...
This is a Powerpoint about research into the codes and conventions of a film ...
Ā 
Elite Class āž„8448380779ā–» Call Girls In New Friends Colony Delhi NCR
Elite Class āž„8448380779ā–» Call Girls In New Friends Colony Delhi NCRElite Class āž„8448380779ā–» Call Girls In New Friends Colony Delhi NCR
Elite Class āž„8448380779ā–» Call Girls In New Friends Colony Delhi NCR
Ā 
VIP Chandigarh Call Girls Service 7001035870 Enjoy Call Girls With Our Escorts
VIP Chandigarh Call Girls Service 7001035870 Enjoy Call Girls With Our EscortsVIP Chandigarh Call Girls Service 7001035870 Enjoy Call Girls With Our Escorts
VIP Chandigarh Call Girls Service 7001035870 Enjoy Call Girls With Our Escorts
Ā 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Masudpur
Delhi  99530 vip 56974  Genuine Escort Service Call Girls in MasudpurDelhi  99530 vip 56974  Genuine Escort Service Call Girls in Masudpur
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Masudpur
Ā 
Ready to get noticed? Partner with Sociocosmos
Ready to get noticed? Partner with SociocosmosReady to get noticed? Partner with Sociocosmos
Ready to get noticed? Partner with Sociocosmos
Ā 
Enjoy Nightāš”Call Girls Palam Vihar Gurgaon >ą¼’8448380779 Escort Service
Enjoy Nightāš”Call Girls Palam Vihar Gurgaon >ą¼’8448380779 Escort ServiceEnjoy Nightāš”Call Girls Palam Vihar Gurgaon >ą¼’8448380779 Escort Service
Enjoy Nightāš”Call Girls Palam Vihar Gurgaon >ą¼’8448380779 Escort Service
Ā 
Craft Your Legacy: Invest in YouTube Presence from Sociocosmos"
Craft Your Legacy: Invest in YouTube Presence from Sociocosmos"Craft Your Legacy: Invest in YouTube Presence from Sociocosmos"
Craft Your Legacy: Invest in YouTube Presence from Sociocosmos"
Ā 
Elite Class āž„8448380779ā–» Call Girls In Nizammuddin Delhi NCR
Elite Class āž„8448380779ā–» Call Girls In Nizammuddin Delhi NCRElite Class āž„8448380779ā–» Call Girls In Nizammuddin Delhi NCR
Elite Class āž„8448380779ā–» Call Girls In Nizammuddin Delhi NCR
Ā 
Pondicherry Call Girls Book Now 8617697112 Top Class Pondicherry Escort Servi...
Pondicherry Call Girls Book Now 8617697112 Top Class Pondicherry Escort Servi...Pondicherry Call Girls Book Now 8617697112 Top Class Pondicherry Escort Servi...
Pondicherry Call Girls Book Now 8617697112 Top Class Pondicherry Escort Servi...
Ā 
O9654467111 Call Girls In Dwarka Women Seeking Men
O9654467111 Call Girls In Dwarka Women Seeking MenO9654467111 Call Girls In Dwarka Women Seeking Men
O9654467111 Call Girls In Dwarka Women Seeking Men
Ā 
SELECTING A SOCIAL MEDIA MARKETING COMPANY
SELECTING A SOCIAL MEDIA MARKETING COMPANYSELECTING A SOCIAL MEDIA MARKETING COMPANY
SELECTING A SOCIAL MEDIA MARKETING COMPANY
Ā 
Lucknow šŸ’‹ Dating Call Girls Lucknow | Whatsapp No 8923113531 VIP Escorts Serv...
Lucknow šŸ’‹ Dating Call Girls Lucknow | Whatsapp No 8923113531 VIP Escorts Serv...Lucknow šŸ’‹ Dating Call Girls Lucknow | Whatsapp No 8923113531 VIP Escorts Serv...
Lucknow šŸ’‹ Dating Call Girls Lucknow | Whatsapp No 8923113531 VIP Escorts Serv...
Ā 
Improve Your Brand in Waco with a Professional Social Media Marketing Company
Improve Your Brand in Waco with a Professional Social Media Marketing CompanyImprove Your Brand in Waco with a Professional Social Media Marketing Company
Improve Your Brand in Waco with a Professional Social Media Marketing Company
Ā 
Unlock the power of Instagram with SocioCosmos. Start your journey towards so...
Unlock the power of Instagram with SocioCosmos. Start your journey towards so...Unlock the power of Instagram with SocioCosmos. Start your journey towards so...
Unlock the power of Instagram with SocioCosmos. Start your journey towards so...
Ā 
Film show pre-production powerpoint for site
Film show pre-production powerpoint for siteFilm show pre-production powerpoint for site
Film show pre-production powerpoint for site
Ā 

2206 FAccT_inperson

  • 6. Hate speech in real and cyber spaces
    • What is hate speech and why does it matter?
      - Difficulty of defining hate speech
        » A political and legal term, not merely a theoretical one
        » Has no unified, universally accepted definition
        » The definition differs by language, culture, domain, discipline, etc.
      - Definition given by the United Nations
        » "Any kind of communication in speech, writing or behaviour, that attacks or uses pejorative or discriminatory language with reference to a person or a group on the basis of who they are, in other words, based on their religion, ethnicity, nationality, race, colour, descent, gender or other identity factor."
        » Not a legal definition
        » Broader than the notion of "incitement to discrimination, hostility or violence" prohibited under international human rights law
    https://www.un.org/en/hate-speech/understanding-hate-speech/what-is-hate-speech?
  • 7. Hate speech in real and cyber spaces
    • What is hate speech and why does it matter?
      - Hate speech in cyber spaces
        » Its definition is deductive, but its detection is inductive
        » Hate speech appears online in various forms, including offensive language, pejorative expressions, discriminative words, profanity, insults, etc.
        » Whether to include specific terms or expressions in the category of 'hate speech' is a tricky issue
        » What if a pejorative expression or profanity term does not target any group or individual?
        » What if (sexual) harassment is considered offensive by readers but not by the target figure?
  • 8. Hate speech in real and cyber spaces
    • Discussion on hate speech detection
      - Studies for English
        » Waseem and Hovy (2016): annotates tweets according to around ten criteria that make a post offensive
    A tweet is offensive if it:
    1. uses a sexist or racial slur.
    2. attacks a minority.
    3. seeks to silence a minority.
    4. criticizes a minority (without a well-founded argument).
    5. promotes, but does not directly use, hate speech or violent crime.
    6. criticizes a minority and uses a straw man argument.
    7. blatantly misrepresents truth or seeks to distort views on a minority with unfounded claims.
    8. shows support of problematic hashtags, e.g. "#BanIslam", "#whoriental", "#whitegenocide".
    9. negatively stereotypes a minority.
    10. defends xenophobia or sexism.
    11. contains a screen name that is offensive per the previous criteria, while the tweet itself is ambiguous (at best) and on a topic that satisfies any of the above criteria.
    Waseem and Hovy, Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter, 2016.
  • 9. Hate speech in real and cyber spaces
    • Discussion on hate speech detection
      - Studies for English
        » Davidson et al. (2017)
        » Notes the discrepancy between the theoretical definition of hate speech and its real-world expressions
        » Places 'offensive' expressions between 'hate' and 'non-hate' to cover expressions in the grey area
        » Incorporates profanity prevalent in social media, which does not necessarily target a minority but still induces offensiveness
    Davidson et al., Automated Hate Speech Detection and the Problem of Offensive Language, 2017.
  • 10. Hate speech in real and cyber spaces
    • Discussion on hate speech detection
      - Notable approaches in other languages
        » Sanguinetti et al. (2018)
        » Investigates hate speech in Italian-language posts about immigrants
        » Beyond hate speech, tags whether a post is offensive, aggressive, or intense, contains irony or sarcasm, or shows stereotype
        » Treats 'stereotype' as a factor that can be a clue to discrimination
    • hate speech: no - yes
    • aggressiveness: no - weak - strong
    • offensiveness: no - weak - strong
    • irony: no - yes
    • stereotype: no - yes
    • intensity: 0 - 1 - 2 - 3 - 4
    Sanguinetti et al., An Italian Twitter Corpus of Hate Speech against Immigrants, 2018.
  • 11. Hate speech in real and cyber spaces
    • Discussion on hate speech detection
      - Notable approaches in other languages
        » Assimakopoulos et al. (2020)
        » Motivated by critical analysis of posts made in reaction to news reports on the Mediterranean migration crisis and LGBTIQ+ matters in Malta
        » Annotates Maltese web texts
        » Investigates the attitude (positive/negative) of a text, asks for the target if negative, and asks how the negativity is conveyed
    1. Does the post communicate a positive, negative or neutral attitude? [Positive / Negative / Neutral]
    2. If negative, who does this attitude target? [Individual / Group]
       (a) If it targets an individual, does it do so because of the individual's affiliation to a group? [Yes / No] If yes, name the group.
       (b) If it targets a group, name the group.
    3. How is the attitude expressed in relation to the target group? Select all that apply. [Derogatory term / Generalisation / Insult / Sarcasm (including jokes and trolling) / Stereotyping / Suggestion / Threat]
    4. If the post involves a suggestion, is it a suggestion that calls for violence against the target group? [Yes / No]
    Assimakopoulos et al., Annotating for Hate Speech: The MaNeCo Corpus and Some Input from Critical Discourse Analysis, 2020.
  • 12. Hate speech in real and cyber spaces
    • Discussion on hate speech detection
      - Notable approaches in other languages
        » Moon et al. (2020)
        » Annotation of Korean celebrity news comments
        » Investigates the existence of social bias and the degree of toxicity
          - Social bias: gender-related bias and other biases
          - Toxicity: hate / offensive / none (following Davidson et al., 2017)
    Detecting social bias
    • Is there a gender-related bias, either explicit or implicit, in the text?
    • Are there any other kinds of bias in the text?
    • A comment that does not incorporate any bias
    Measuring toxicity
    • Does a comment display strong hate or insult towards the article's target or related figures, writers of the article or comments, etc.?
    • Although a comment is not as hateful or insulting as the above, does it make the target or the reader feel offended?
    • A comment that does not incorporate any hatred or insult
    Moon et al., BEEP! Korean Corpus of Online News Comments for Toxic Speech Detection, 2020.
  • 13. Low-resource perspective
    • Creating a hate speech corpus from scratch
      - ASSUMPTION: no manually created hate speech detection corpus exists yet for the Korean language (true until July 2020...)
      - Generally, clear motivation is required for hate speech corpus construction
        » Why? It takes resources (time and money), risks mental harm to annotators, and may invite attacks on the researchers
        » Nonetheless, it is required in some circumstances: detecting offensive language in services, or when severe harm has been displayed publicly
  • 14. Low-resource perspective
    • Creating a hate speech corpus from scratch
      - Step 1: Is there anything available? Analysis of existing language resources
        » Hate speech detection is related to various similar datasets (though slightly different in definition and goal): dictionaries of profanity terms (e.g., hatebase.org), sarcasm detection datasets, sentiment analysis datasets, offensive language detection datasets
        » Why should we search for existing resources? To save time and money, to make the problem easier by building upon existing datasets, and to clarify what we should aim for by creating a new one
  • 15. Low-resource perspective
    • Creating a hate speech corpus from scratch
      - Step 1: Is there anything available? Analysis of existing language resources
        » Dictionary of profanity terms: e.g., https://github.com/doublems/korean-bad-words
        » Sarcasm detection dataset: e.g., https://github.com/SpellOnYou/korean-sarcasm
        » Sentiment analysis dataset: e.g., https://github.com/e9t/nsmc
      - These datasets may not completely overlap with a hate speech corpus, but they can at least be a good source for annotation :)
      - Here, one should consider text style, text domain, and the types of toxicity and bias that appear
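A profanity dictionary like the ones linked above can seed a first-pass filter or weak-labeling baseline before manual annotation. A minimal sketch follows; the lexicon entries and function name are illustrative placeholders (not taken from the cited resources), and real Korean text would need proper tokenization rather than whitespace splitting:

```python
# Lexicon-matching baseline: flag comments containing any term from a
# profanity dictionary. The entries below are hypothetical placeholders.
PROFANITY_LEXICON = {"darn", "heck"}

def contains_profanity(comment: str, lexicon=PROFANITY_LEXICON) -> bool:
    """Return True if any lexicon term appears as a token in the comment."""
    tokens = comment.lower().split()
    return any(tok.strip(".,!?") in lexicon for tok in tokens)

print(contains_profanity("What the heck is this?"))        # True
print(contains_profanity("A perfectly neutral comment."))  # False
```

Note that, as the slides stress, lexicon hits alone do not establish hate speech; such a filter only helps triage candidates for annotation.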
  • 16. Low-resource perspective
    • Creating a hate speech corpus from scratch
      - Step 1: Is there anything available? Analysis of existing language resources
        » Text style: written / spoken / web text?
        » Text domain: news / wiki / tweets / chat / comments?
        » Appearing types of toxicity and bias: gender-related? politics/religion? region/nationality/ethnicity?
        » Appearing amount of toxicity and bias
  • 17. Low-resource perspective
    • Creating a hate speech corpus from scratch
      - Step 1: Is there anything available? Analysis of existing language resources
        » Data collection example (BEEP!)
          - Comments from the most popular Korean entertainment news platform, Jan. 2018 ~ Feb. 2020
          - 10,403,368 comments from 23,700 articles; 1,580 articles acquired by stratified sampling
          - Top 20 comments per article, ranked by the Wilson score on the downvote
          - Duplicates filtered; comments kept if they have more than a single token and fewer than 100 characters
          - 10K comments selected
        » The data sampling process matters greatly for the final distribution of the dataset!
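The Wilson score ranking used above can be computed directly from vote counts. The sketch below, with illustrative data, mirrors the sampling and filtering steps on the slide: rank comments by the lower bound of the Wilson interval on the downvote ratio, take the top 20, then apply the token-count and length filters. The field names and example comments are assumptions for illustration.

```python
import math

def wilson_lower_bound(pos: int, n: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for pos successes out of n trials."""
    if n == 0:
        return 0.0
    phat = pos / n
    denom = 1 + z * z / n
    centre = phat + z * z / (2 * n)
    margin = z * math.sqrt((phat * (1 - phat) + z * z / (4 * n)) / n)
    return (centre - margin) / denom

# Illustrative comments: text, downvotes, total votes
comments = [
    {"text": "short", "down": 40, "total": 50},            # single token -> filtered
    {"text": "a" * 120, "down": 30, "total": 60},          # over 100 characters -> filtered
    {"text": "two tokens here", "down": 10, "total": 100}, # kept
]

# Rank by the Wilson lower bound of the downvote ratio, keep the top 20,
# then apply the token-count and character-length filters from the slide.
ranked = sorted(comments, key=lambda c: wilson_lower_bound(c["down"], c["total"]), reverse=True)
kept = [c for c in ranked if len(c["text"].split()) > 1 and len(c["text"]) < 100][:20]
print([c["text"] for c in kept])  # ['two tokens here']
```

Using the lower bound rather than the raw ratio keeps comments with only a handful of votes from dominating the ranking, which matters for the final label distribution the slide warns about.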
  • 18. Low-resource perspective
    • Creating a hate speech corpus from scratch
      - Step 2: What should we define first? Hate speech as bias detection and toxicity measurement
        » A local definition of hate speech discussed by the Korean sociolinguistics community
        » Definition: expressions that discriminate/hate, or incite discrimination/hatred/violence towards an individual or group of people, because they have characteristics of a social minority
        » Types of hate speech: discriminative bullying, discrimination, public insult/threatening, inciting hatred
    Hong et al., Study on the State and Regulation of Hate Speech, 2016.
  • 19. Low-resource perspective
    • Creating a hate speech corpus from scratch
      - Step 2: What should we define first? Hate speech as bias detection and toxicity measurement
        » Set up criteria
          - Analyze 'discriminate/hate or incite discrimination/hatred/violence' as a combination of 'social bias' and 'toxicity'
          - Further discussion required on 'social minority': 'gender, age, profession, religion, nationality, skin color, political stance' and all other factors that comprise one's identity
          - Criteria for social minority vs. who will be acknowledged as a social minority
  • 20. Low-resource perspective
    • Creating a hate speech corpus from scratch
      - Step 2: What should we define first? Hate speech as bias detection and toxicity measurement
        » Set up criteria for bias detection
          - 'People with a specific characteristic may behave in some way'
          - Differs from a value judgment
          - Labels: gender-related bias, other biases, none
    Cho and Moon, How Does the Hate Speech Corpus Concern Sociolinguistic Discussions? A Case Study on Korean Online News Comments, 2021.
  • 21. Low-resource perspective
    • Creating a hate speech corpus from scratch
      - Step 2: What should we define first? Hate speech as bias detection and toxicity measurement
        » Set up criteria for toxicity measurement
          - Hate: hostility towards a specific group or individual; may be conveyed with profanity, but profanity alone does not imply hate
          - Insult: expressions that can harm the prestige of an individual or group; includes various profanity terms
          - Offensive expressions: do not count as hate or insult, but may make readers feel offended; includes sarcasm, irony, bad guessing, and unethical expressions
  • 22. Low-resource perspective
    • Creating a hate speech corpus from scratch
      - Step 2: What should we define first? Hate speech as bias detection and toxicity measurement
        » Set up criteria for toxicity measurement
          - Severe hate or insult
          - Not hateful, but offensive or sarcastic
          - None
    Cho and Moon, How Does the Hate Speech Corpus Concern Sociolinguistic Discussions? A Case Study on Korean Online News Comments, 2021.
  • 23. Low-resource perspective
    • Creating a hate speech corpus from scratch
      - Step 3: What is required for the annotation? Building a guideline for data annotation
        » Stakeholders: researchers, moderators (crowdsourcing platform), workers
        » How is the guideline used? Setting the research direction (researchers), task understanding (moderators), data annotation (workers)
  • 24. Low-resource perspective
    • Creating a hate speech corpus from scratch
      - Step 3: What is required for the annotation? Building a guideline for data annotation
        » A guideline is not built at once! The usual process:
          - Draft guideline based on the source corpus
          - Researchers' pilot study & guideline update (iterated N times)
          - Moderators' and researchers' alignment on the guideline
          - Worker recruitment & pilot tagging
          - Guideline update with worker feedback (cautions & exceptions)
          - Final guideline (for the main annotation)
  • 25. Low-resource perspective
    • Creating a hate speech corpus from scratch
      - Step 3: What is required for the annotation? Building a guideline for data annotation
        » Draft guideline
          - Built upon a small portion of the source corpus (hundreds of instances)
          - Relies heavily on the researchers' intuition
          - Concept-based description: e.g., for 'bias', 'People with a specific characteristic may behave in some way' (instead of listing all stereotyped expressions)
        » Pilot study
          - Researchers tag a slightly larger portion of the source corpus (~1K instances)
          - Fitting researchers' intuition to the proposed concepts: e.g., "Does this expression contain bias or toxicity?" (discussion is important, but don't fight!)
          - Update descriptions or add examples
          - Labeling, re-labeling, re-re-labeling...
  • 26. Low-resource perspective
    • Creating a hate speech corpus from scratch
      - Step 3: What is required for the annotation? Building a guideline for data annotation
        » Pilot study
          - Labeling, re-labeling, re-re-labeling... + agreement?
          - Inter-annotator agreement (IAA): calculating the reliability of the annotation
          - Cohen's kappa for two annotators; Fleiss' kappa for more than two
          - Sufficiently high agreement (> 0.6?): let's go annotating in the wild!
    Pustejovsky and Stubbs, Natural Language Annotation, 2012.
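Cohen's kappa takes only a few lines to compute from scratch. A sketch with toy toxicity labels (the label sequences are illustrative; for more than two annotators, Fleiss' kappa generalizes the same chance-correction idea):

```python
from collections import Counter

def cohens_kappa(a, b):
    """Cohen's kappa: chance-corrected agreement between two annotators."""
    assert len(a) == len(b) and a, "label sequences must be non-empty and aligned"
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n  # observed agreement
    ca, cb = Counter(a), Counter(b)
    # Expected chance agreement from each annotator's label distribution
    p_e = sum((ca[l] / n) * (cb[l] / n) for l in set(a) | set(b))
    return 1.0 if p_e == 1 else (p_o - p_e) / (1 - p_e)

# Toy toxicity labels from two pilot annotators
ann1 = ["hate", "none", "offensive", "none", "hate", "none"]
ann2 = ["hate", "none", "none",      "none", "hate", "offensive"]
print(round(cohens_kappa(ann1, ann2), 3))  # 0.455
```

Here the raw agreement is 4/6, but chance correction pulls the kappa down to about 0.45, below the 0.6 threshold mentioned on the slide, which would signal another round of guideline refinement before the main annotation.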
  • 27. Low-resource perspective ā€¢ Creating a hate speech corpus from scratch ļ‚§ Step 4: How is the annotation process conducted and evaluated? Worker pilot, crowdsourcing, and agreement ā€¢ Finding a crowdsourcing platform ā€“ Moderator Ā» Usually an expert in data creation and management Ā» Comprehends the task, gives feedback in view of workers Ā» Helps communication between researchers and workers Ā» Instructs, and sometimes hurries workers to meet the timeline Ā» Manages financial or legal issues Ā» Let researchers concentrate on the task itself ā€“ Without moderator? Ā» Researchers are the moderator! (Unless there are some automated functions in the platform) ā€“ With moderator? Ā» The closest partner of researchers 26
  • 28. Low-resource perspective ā€¢ Creating a hate speech corpus from scratch ļ‚§ Step 4: How is the annotation process conducted and evaluated? Worker pilot, crowdsourcing, and agreement ā€¢ Finding a crowdsourcing platform ā€“ Existence and experience of the moderator Ā» Experience of similar dataset construction Ā» Comprehension of the task & proper feedbacks Ā» Sufficient worker pool Ā» Trust between the moderator and workers ā€“ Reasonable cost estimation Ā» Appropriateness of price per tagging or reviewing Ā» Appropriateness of worker compensation Ā» Fit with the budget 27
• 29. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 4: How is the annotation process conducted and evaluated? Worker pilot, crowdsourcing, and agreement
• Finding a crowdsourcing platform
– Usefulness of the platform UI
» Progress status (In progress, Submitted, Waiting for review, etc.)
» Statistics: the number of workers and reviewers, average work/review duration...
» Demographics, worker history by individual & in total...
• 30. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 4: How is the annotation process conducted and evaluated? Worker pilot, crowdsourcing, and agreement
• Pilot tagging (by workers)
– Goal of the worker pilot
» Guideline update from the workers' view (especially on cautions & exceptions)
» Worker selection
– Procedure
» Advertisement or recruitment
» Worker tagging
» Researchers' (or moderators') review & rejection
» Workers' revision & resubmission
• 31. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 4: How is the annotation process conducted and evaluated? Worker pilot, crowdsourcing, and agreement
• Details of the worker selection process
– Human checking
» Is the worker's ethical standard not too far from the guideline?
» Is the feedback effective for the rejected samples?
– Automatic checking
» Enough taggings done?
» Too many cases of skipping the annotation?
UI screenshots provided by Deep Natural AI.
• 32. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 4: How is the annotation process conducted and evaluated? Worker pilot, crowdsourcing, and agreement
• Crowdsourcing: a simplified version is required for crowd annotation!
– Multi-class, multi-attribute tagging
» 3 classes for bias
» 3 classes for toxicity
– Given a comment (without context), the annotator tags each attribute
– A detailed guideline (with examples, cautions, and exceptions) is provided separately
1. What kind of bias does the comment contain? – Gender bias, Other biases, or None
2. Which is the adequate category for the comment in terms of toxicity? – Hate, Offensive, or None
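The simplified two-attribute scheme above can be enforced programmatically when collecting worker submissions; a minimal sketch (the record format and label strings are hypothetical, only the three-class-per-attribute structure comes from the slides):

```python
# Two-attribute annotation scheme: 3 classes for bias, 3 for toxicity.
# Label strings here are illustrative stand-ins for the slide's categories.
BIAS_LABELS = {"gender", "others", "none"}
TOXICITY_LABELS = {"hate", "offensive", "none"}

def validate_annotation(record):
    """Check that one worker's submission tags both attributes with a valid class."""
    if record["bias"] not in BIAS_LABELS:
        raise ValueError(f"unknown bias label: {record['bias']}")
    if record["toxicity"] not in TOXICITY_LABELS:
        raise ValueError(f"unknown toxicity label: {record['toxicity']}")
    return record

annotation = validate_annotation(
    {"comment": "...", "bias": "gender", "toxicity": "offensive"}
)
```

Rejecting malformed submissions at collection time keeps the later adjudication step clean.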
• 33. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 4: How is the annotation process conducted and evaluated? Worker pilot, crowdsourcing, and agreement
• Main annotation
– Based on the final version of the guideline
» 3~5 annotators (per sample) for usual classification tasks
– Tagging done by selected workers
» Worker selection and education
» Short quiz (if workers are not pre-selected)
– Annotation toolkit
» Assigns samples randomly to workers, with multiple annotators per sample
» Interface developed or provided by the platform (usually takes extra budget)
» Open-source interfaces (e.g., Label Studio)
– Data check for further quality guarantees
» What if there are sufficiently many annotators per sample?
» If not...?
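The random assignment that the toolkit performs can be sketched in a few lines; this is an illustrative simplification (real platforms additionally balance worker load), with hypothetical sample and worker names:

```python
import random

# Assign each sample to k distinct workers so that every sample receives
# multiple annotators, as in the "3~5 annotators per sample" setup above.
# Note: this sketch does not balance load across workers.

def assign(samples, workers, k=3, seed=0):
    rng = random.Random(seed)  # fixed seed for a reproducible plan
    return {s: rng.sample(workers, k) for s in samples}

plan = assign(["c1", "c2", "c3"], ["w1", "w2", "w3", "w4", "w5"], k=3)
# every sample ends up with 3 distinct workers from the pool
assert all(len(set(ws)) == 3 for ws in plan.values())
```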
• 34. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 4: How is the annotation process conducted and evaluated? Worker pilot, crowdsourcing, and agreement
• Data selection after the main annotation (8,000 samples)
– The data reviewing strategy may differ by subtask
– Researchers decide the final label after adjudication
– Common for bias and toxicity
» Cases where all three annotators differ
– Only for toxicity
» Since the problem regards a continuum of degree, cases mixing only hate (o) and none (x) need to be investigated again
– Failure to decide (no majority vote) – discarded
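The selection rules above can be expressed as a small adjudication routine; a sketch under the slide's rules (majority vote, re-inspection of hate/none mixtures, discarding undecidable cases), not the exact pipeline used:

```python
from collections import Counter

# Adjudicate one sample's toxicity labels from three annotators.
# Labels come from {'hate', 'offensive', 'none'}.

def adjudicate_toxicity(labels):
    # Extreme disagreement on the degree continuum: hate vs. none
    # co-occurring means the sample is flagged for re-inspection.
    if "hate" in labels and "none" in labels:
        return None, "re-inspect"
    top = Counter(labels).most_common(2)
    if len(top) > 1 and top[0][1] == top[1][1]:
        return None, "discard"  # no majority vote possible
    return top[0][0], "accept"

assert adjudicate_toxicity(["offensive", "offensive", "none"]) == ("offensive", "accept")
assert adjudicate_toxicity(["hate", "none", "offensive"]) == (None, "re-inspect")
```

Samples returned as "re-inspect" or "discard" go back to the researchers for the final decision described above.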
• 35. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 4: How is the annotation process conducted and evaluated? Worker pilot, crowdsourcing, and agreement
• Final decision
– Test: 974
» Data tagged while constructing the guideline (mostly adjusted to the intention of the guideline)
– Validation: 471
» Data that went through tag/review/reject and accept in the pilot phase, done with a large number of annotators (roughly aligned with the guideline)
– Train: 7,896
» Data crowdsourced with the selected workers; not fully reviewed, but adjudicated for some special cases
• Agreement
– 0.492 for bias detection, 0.496 for toxicity measurement
Moon et al., BEEP! Korean Corpus of Online News Comments for Toxic Speech Detection, 2020.
• 36. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Beyond creation – Model training and deployment
• Model training
– Traditionally
» High performance – relatively easy?
» Low performance – relatively challenging?
– But in PLM-based training these days...
» Pretraining corpora
» Model size
» Model architecture
– Model deployment
» Performance & size
» User feedback
Yang, Transformer-based Korean Pretrained Language Models: A Survey on Three Years of Progress, 2021.
• 37. Challenges
• Challenges of hate speech corpus construction
▪ Context dependency
• News comments – articles
• Tweets – threads
• Web community comments – posts
▪ Multimodal or noisy inputs
• Image and audio
– Kiela et al. (2020) – Hateful Memes Challenge
• Perturbed texts
– Cho and Kim (2021) – Leetspeak – Yaminjeongeum
Kiela et al., The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes, 2020.
Cho and Kim, Google-trickers, Yaminjeongeum, and Leetspeak: An Empirical Taxonomy for Intentionally Noisy User-Generated Text, 2021.
• 38. Challenges
• Challenges of hate speech corpus construction
▪ Categorical or binary output has limitations
• Limitation of categorizing the degree of intensity
– Hate/offensive/none categorization is sub-optimal
– Poletto et al. (2019): scale-based annotation with an unbalanced rating scale
» Used to determine the label (or used as a target score?)
Poletto et al., Annotating Hate Speech: Three Schemes at Comparison, 2019.
• 39. Challenges
• Challenges of hate speech corpus construction
▪ Annotation requires multiple labels
• The aspect of discrimination may differ by attribute
– Gender, race, nationality, ageism...
• Tagging all the target attributes that appear?
– Kang et al. (2022)
» Detailed guideline with terms and concepts defined for each attribute

Women & family / Male / Sexual minorities / Race & nationality / Ageism / Regionalism / Religion / Other / Malicious / None
S1: 1 0 0 0 1 0 0 0 0 0
S2: 0 0 0 0 0 0 0 0 1 0
S3: 0 0 0 1 0 0 1 0 0 0
S4: 0 0 0 0 0 0 0 0 0 1

Kang et al., Korean Online Hate Speech Dataset for Multilabel Classification – How Can Social Science Improve Dataset on Hate Speech?, 2022.
• 40. Challenges
• Challenges of hate speech corpus construction
▪ Privacy and license issues
• Privacy and licenses can be violated by text crawling
• A hate speech corpus may contain personal information on (public) figures
• Text could have been brought from elsewhere (copy & paste)
▪ How about creating hate (and non-hate) speech from scratch?
• Yang et al. (2022): recruit workers and enable 'anonymous' text generation!
Yang et al., APEACH: Attacking Pejorative Expressions with Analysis on Crowd-Generated Hate Speech Evaluation Datasets, 2022.
• 41. Challenges
• Ambiguity is inevitable
▪ Text may allow various interpretations
• Text accompanies omission or replacement to trick the monitoring
• The intention is apparent only when considering the context
• Temporal diachronicity of hate speech
▪ Non-hate speech in the past can be interpreted as hate speech these days
▪ Diachronicity may undermine the utility of prediction systems
• e.g., [a name of a celebrity who committed a crime] before 20xx / after 20xx
• Boundary between hate speech and freedom of speech
▪ A grey area that cannot be resolved
• Some readers are offended by false positives
• Some users are offended by false negatives
• 42. Conclusion
• Hate speech is prevalent in real and cyber spaces
▪ Discussions on hate speech have diverse viewpoints, from academia to society and industry – and they are reflected in dataset construction
• No corpus is built perfectly from the beginning
▪ ...and a hate speech corpus is one of the most difficult kinds to create
• Considerations in low-resource hate speech corpus construction
▪ Why? How? How much? How well?
• Still more challenges left
▪ Context, input noise, output format, indecisiveness...
• Takeaways
▪ There is a discrepancy between theoretical and practical definitions of hate speech, and their aims may differ
▪ There is no hate speech detection guideline that satisfies ALL, so let's find the boundary that satisfies the most and improve it
• 43. References
• Waseem and Hovy, Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter, 2016.
• Davidson et al., Automated Hate Speech Detection and the Problem of Offensive Language, 2017.
• Sanguinetti et al., An Italian Twitter Corpus of Hate Speech against Immigrants, 2018.
• Assimakopoulos et al., Annotating for Hate Speech: The MaNeCo Corpus and Some Input from Critical Discourse Analysis, 2020.
• Moon et al., BEEP! Korean Corpus of Online News Comments for Toxic Speech Detection, 2020.
• Hong et al., Study on the State and Regulation of Hate Speech, 2016.
• Cho and Moon, How Does the Hate Speech Corpus Concern Sociolinguistic Discussions? A Case Study on Korean Online News Comments, 2021.
• Pustejovsky and Stubbs, Natural Language Annotation, 2012.
• Yang, Transformer-based Korean Pretrained Language Models: A Survey on Three Years of Progress, 2021.
• Kiela et al., The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes, 2020.
• Cho and Kim, Google-trickers, Yaminjeongeum, and Leetspeak: An Empirical Taxonomy for Intentionally Noisy User-Generated Text, 2021.
• Poletto et al., Annotating Hate Speech: Three Schemes at Comparison, 2019.
• Kang et al., Korean Online Hate Speech Dataset for Multilabel Classification – How Can Social Science Improve Dataset on Hate Speech?, 2022.
• Yang et al., APEACH: Attacking Pejorative Expressions with Analysis on Crowd-Generated Hate Speech Evaluation Datasets, 2022.