1. Building a Dataset to Measure
Toxicity and Social Bias within Language:
A Low-Resource Perspective
Won Ik Cho (SNU ECE)
2022. 6. 22 @FAccT, Seoul, Korea
2. Introduction
• CHO, Won Ik (조원익)
▪ B.S. in EE/Mathematics (SNU, '10–'14)
▪ Ph.D. student (SNU ECE, '14–)
• Academic interests
▪ Built Korean NLP datasets in various spoken language understanding areas
▪ Currently interested in computational approaches to:
• Dialogue analysis
• AI for social good
3. Contents
• Introduction
• Hate speech in real and cyber spaces
▪ What is hate speech and why does it matter?
▪ Study on hate speech detection
• In English – dataset and analysis
• Notable approaches in other languages
• Low-resource perspective: Creating a hate speech corpus from scratch
▪ Analysis of existing language resources
▪ Hate speech as bias detection and toxicity measurement
▪ Building a guideline for data annotation
▪ Worker pilot, crowdsourcing, and agreement
• Challenges of hate speech corpus construction
• Conclusion
4. Contents
Caution! This presentation may contain content that can be offensive to certain groups of people, such as gender bias, racism, or other unethical content, including multimodal materials.
5. Contents
• Handled in this tutorial
▪ How to build a hate speech detection dataset in a specific setting (language, text domain, etc.)
▪ How to check the validity of the created hate speech corpus
• Less handled in this tutorial
▪ Comprehensive definitions of hate speech and social bias in the literature
▪ Reliability of specific ethical guidelines for hate speech corpus construction
6. Hate speech in real and cyber spaces
• What is hate speech and why does it matter?
▪ Difficulty of defining hate speech
• A political and legal term, not just a theoretical one
• Has no unified/universal definition accepted by all
• The definition differs across language, culture, domain, discipline, etc.
▪ Definition given by the United Nations
• "Any kind of communication in speech, writing or behaviour, that attacks or uses pejorative or discriminatory language with reference to a person or a group on the basis of who they are, in other words, based on their religion, ethnicity, nationality, race, colour, descent, gender or other identity factor."
– Not a legal definition
– Broader than the notion of "incitement to discrimination, hostility or violence" prohibited under international human rights law
https://www.un.org/en/hate-speech/understanding-hate-speech/what-is-hate-speech?
7. Hate speech in real and cyber spaces
• What is hate speech and why does it matter?
▪ Hate speech in cyber spaces
• Its definition is deductive, but its detection is inductive
• Hate speech appears online as various expressions, including:
– Offensive language
– Pejorative expressions
– Discriminative words
– Profanity terms
– Insults ... etc.
• Whether to include specific terms or expressions in the category of 'hate speech' is a tricky issue
– What if a pejorative expression or profanity term does not target any group or individual?
– What if (sexual) harassment is considered offensive by readers but not by the target figure?
8. Hate speech in real and cyber spaces
• Discussion on hate speech detection
▪ Studies for English
• Waseem and Hovy (2016)
– Annotates tweets based on around 10 features that make a post offensive

A tweet is offensive if it
1. uses a sexist or racial slur.
2. attacks a minority.
3. seeks to silence a minority.
4. criticizes a minority (without a well founded argument).
5. promotes, but does not directly use, hate speech or violent crime.
6. criticizes a minority and uses a straw man argument.
7. blatantly misrepresents truth or seeks to distort views on a minority with unfounded claims.
8. shows support of problematic hash tags. E.g. "#BanIslam", "#whoriental", "#whitegenocide"
9. negatively stereotypes a minority.
10. defends xenophobia or sexism.
11. contains a screen name that is offensive, as per the previous criteria, the tweet is ambiguous (at best), and the tweet is on a topic that satisfies any of the above criteria
Waseem and Hovy, Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter, 2016.
9. Hate speech in real and cyber spaces
• Discussion on hate speech detection
▪ Studies for English
• Davidson et al. (2017)
– Mentions the discrepancy between the theoretical definition and real-world expressions of hate speech
– Puts 'offensive' expressions between 'hate' and 'non-hate', to incorporate expressions that are in the grey area
– Incorporates profanity terms prevalent in social media, which do not necessarily target a minority but induce offensiveness
Davidson et al., Automated Hate Speech Detection and the Problem of Offensive Language, 2017.
10. Hate speech in real and cyber spaces
• Discussion on hate speech detection
▪ Notable approaches in other languages
• Sanguinetti et al. (2018)
– Investigates hate speech in posts about immigrants in Italy
– Beyond hate speech, tags whether the post is offensive or aggressive, whether it contains irony or sarcasm, whether it shows stereotype, and its intensity
– 'Stereotype' as a factor that can be a clue to discrimination
Sanguinetti et al., An Italian Twitter Corpus of Hate Speech against Immigrants, 2018.
• hate speech: no – yes
• aggressiveness: no – weak – strong
• offensiveness: no – weak – strong
• irony: no – yes
• stereotype: no – yes
• intensity: 0 – 1 – 2 – 3 – 4
11. Hate speech in real and cyber spaces
• Discussion on hate speech detection
▪ Notable approaches in other languages
• Assimakopoulos et al. (2020)
– Motivated by critical analysis of posts made in reaction to news reports on the Mediterranean migration crisis and LGBTIQ+ matters in Malta
– Annotates Maltese web texts
– Investigates the attitude (positive/negative) of the text, asks for the target if negative, and asks how the negativity is conveyed

1. Does the post communicate a positive, negative or neutral attitude? [Positive / Negative / Neutral]
2. If negative, who does this attitude target? [Individual / Group]
• (a) If it targets an individual, does it do so because of the individual's affiliation to a group? [Yes / No] If yes, name the group.
• (b) If it targets a group, name the group.
3. How is the attitude expressed in relation to the target group? Select all that apply. [Derogatory term / Generalisation / Insult / Sarcasm (including jokes and trolling) / Stereotyping / Suggestion / Threat]
4. If the post involves a suggestion, is it a suggestion that calls for violence against the target group? [Yes / No]
Assimakopoulos et al., Annotating for Hate Speech: The MaNeCo Corpus and Some Input from Critical Discourse Analysis, 2020.
12. Hate speech in real and cyber spaces
• Discussion on hate speech detection
▪ Notable approaches in other languages
• Moon et al. (2020)
– Annotation on Korean celebrity news comments
– Investigates the existence of social bias and the degree of toxicity
» Social bias → gender-related bias and other biases
» Toxicity → hate/offensive/none (following Davidson et al. 2017)
Moon et al., BEEP! Korean Corpus of Online News Comments for Toxic Speech Detection, 2020.
Detecting social bias
• Is there a gender-related bias, either explicit or implicit, in the text?
• Are there any other kinds of bias in the text?
• A comment that does not incorporate bias
Measuring toxicity
• Is strong hate or insult towards the article's target or related figures, writers of the article or comments, etc. displayed in a comment?
• Although a comment is not as hateful or insulting as the above, does it make the target or the reader feel offended?
• A comment that does not incorporate any hatred or insult
13. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ ASSUMPTION: There is no manually created hate speech detection corpus so far for the Korean language (this was true before July 2020...)
• Generally, clear motivation is required for hate speech corpus construction
– Why?
» It takes resources (time and money)
» Potential mental harm
» Potential attacks towards the researchers
– Nonetheless, it is required in some circumstances
» Detecting offensive language in services
» When severe harm has been displayed publicly
14. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 1: Is there anything available? Analysis of existing language resources
• Hate speech detection is related to various other, similar language resources (though slightly different in definition and goal)
– Dictionaries of profanity terms (e.g., hatebase.org)
– Sarcasm detection datasets
– Sentiment analysis datasets
– Offensive language detection datasets
• Why should we search existing resources?
– To lessen the consumption of time and money
– To make the problem easier by building upon existing datasets
– To confirm what we should aim for by creating a new dataset
15. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 1: Is there anything available? Analysis of existing language resources
• Dictionary of profanity terms
– e.g., https://github.com/doublems/korean-bad-words
• Sarcasm detection dataset
– e.g., https://github.com/SpellOnYou/korean-sarcasm
• Sentiment analysis dataset
– e.g., https://github.com/e9t/nsmc
▪ These datasets may not completely overlap with a hate speech corpus, but at least they can be a good source of annotation :)
• Here, one should think of:
– Text style
– Text domain
– Appearing types of toxicity and bias
16. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 1: Is there anything available? Analysis of existing language resources
• Text style
– Written/spoken/web text?
• Text domain
– News/wiki/tweets/chat/comments?
• Appearing types of toxicity and bias
– Gender-related?
– Politics/religion?
– Region/nationality/ethnicity?
• Appearing amount of toxicity and bias
17. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 1: Is there anything available? Analysis of existing language resources
• Data collection example (BEEP!)
– Comments from the most popular Korean entertainment news platform
» Jan. 2018 – Feb. 2020
» 10,403,368 comments from 23,700 articles
» 1,580 articles acquired by stratified sampling
» Top 20 comments per article, ranked by the Wilson score on the downvotes
– Filtered duplicates, keeping comments with more than a single token and fewer than 100 characters
– 10K comments were selected
• The data sampling process strongly affects the final distribution of the dataset!
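The Wilson-score ranking used in the collection step above can be sketched as follows (a minimal sketch of the standard Wilson lower bound; the exact scoring variant used by the platform is not specified in the slides):

```python
import math

def wilson_lower_bound(pos, n, z=1.96):
    """Lower bound of the Wilson score interval for a Bernoulli proportion.

    Ranks items by vote ratio while penalizing small vote counts,
    so a 9/10 item scores lower than a 90/100 item.
    z=1.96 corresponds to a 95% confidence level.
    """
    if n == 0:
        return 0.0
    phat = pos / n
    denom = 1 + z * z / n
    centre = phat + z * z / (2 * n)
    margin = z * math.sqrt((phat * (1 - phat) + z * z / (4 * n)) / n)
    return (centre - margin) / denom
```

Sorting comments by this score and taking the top 20 per article reproduces the kind of ranking described above.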
18. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 2: What should we define first? Hate speech as bias detection and toxicity measurement
• Local definition of hate speech discussed by the Korean sociolinguistics community
– Definition of hate speech
» Expressions that discriminate/hate or incite discrimination/hatred/violence towards some individual or group of people because they have characteristics of a social minority
– Types of hate speech
» Discriminative bullying
» Discrimination
» Public insult/threatening
» Inciting hatred
Hong et al., Study on the State and Regulation of Hate Speech, 2016.
19. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 2: What should we define first? Hate speech as bias detection and toxicity measurement
• Set up criteria
– Analyze "discriminate/hate or incite discrimination/hatred/violence" as a combination of 'social bias' and 'toxicity'
– Further discussion is required on social minorities
» 'Gender, age, profession, religion, nationality, skin color, political stance' and all other factors that comprise one's identity
» Criteria for a social minority vs. who will be acknowledged as a social minority
20. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 2: What should we define first? Hate speech as bias detection and toxicity measurement
• Set up criteria for bias detection
– 'People with a specific characteristic may behave in some way'
– Differs from a judgment
» Gender-related bias
» Other biases
» None
Cho and Moon, How Does the Hate Speech Corpus Concern Sociolinguistic Discussions? A Case Study on Korean Online News Comments, 2021.
21. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 2: What should we define first? Hate speech as bias detection and toxicity measurement
• Set up criteria for toxicity measurement
– Hate
» Hostility towards a specific group or individual
» Can be represented by some profanity terms, but such terms do not necessarily imply hate
– Insult
» Expressions that can harm the prestige of individuals or groups
» Various profanity terms are included
– Offensive expressions
» Do not count as hate or insult, but may make readers feel offended
» Include sarcasm, irony, unfounded speculation, and unethical expressions
22. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 2: What should we define first? Hate speech as bias detection and toxicity measurement
• Set up criteria for toxicity measurement
» Severe hate or insult
» Not hateful but offensive or sarcastic
» None
Cho and Moon, How Does the Hate Speech Corpus Concern Sociolinguistic Discussions? A Case Study on Korean Online News Comments, 2021.
23. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 3: What is required for the annotation? Building a guideline for data annotation
• Stakeholders
– Researchers
– Moderators (crowdsourcing platform)
– Workers
• How is the guideline used?
– Setting up the research direction (for researchers)
– Task understanding (for moderators)
– Data annotation (for workers)
24. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 3: What is required for the annotation? Building a guideline for data annotation
• A guideline is not built at once!
– Usual process
» Making a draft guideline based on the source corpus
» Pilot study by researchers & guideline update (n iterations)
» Moderators' and researchers' alignment on the guideline
» Worker recruitment & pilot tagging
» Guideline update with worker feedback (cautions & exceptions)
» Final guideline (for the main annotation)
25. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 3: What is required for the annotation? Building a guideline for data annotation
• Draft guideline
– Built upon a small portion of the source corpus (hundreds of instances)
– Researchers' intuition is heavily involved
– Concept-based description
» e.g., for 'bias', 'People with a specific characteristic may behave in some way' (instead of listing all stereotyped expressions)
• Pilot study
– Researchers tag a slightly larger portion of the source corpus (~1K instances)
– Fitting researchers' intuitions to the proposed concepts
» e.g., "Does this expression contain bias or toxicity?" (discussion is important, but don't fight!)
– Update descriptions or add examples
– Labeling, re-labeling, re-re-labeling...
26. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 3: What is required for the annotation? Building a guideline for data annotation
• Pilot study
– Labeling, re-labeling, re-re-labeling... + agreement?
– Inter-annotator agreement (IAA)
» Calculates the reliability of the annotation
» Cohen's kappa for two annotators
» Fleiss' kappa for more than two annotators
– Sufficiently high agreement? (> 0.6?)
» Let's go annotating in the wild!
Pustejovsky and Stubbs, Natural Language Annotation, 2012.
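As a sketch, Fleiss' kappa for a pilot round can be computed from per-item category counts (a minimal implementation, assuming every item received the same number of annotations):

```python
def fleiss_kappa(table):
    """Fleiss' kappa over a list of per-item category counts.

    table[i][k] = number of annotators who assigned category k to item i;
    every item must have the same total number of annotators.
    """
    n_items = len(table)
    n_raters = sum(table[0])
    n_cats = len(table[0])
    # overall proportion of assignments falling into each category
    p = [sum(row[k] for row in table) / (n_items * n_raters)
         for k in range(n_cats)]
    # per-item observed agreement
    P_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in table]
    P_bar = sum(P_i) / n_items          # mean observed agreement
    P_e = sum(pk * pk for pk in p)      # chance agreement
    return (P_bar - P_e) / (1 - P_e)
```

With three annotators per sample, a value above ~0.6 would pass the rough threshold mentioned above.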
27. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 4: How is the annotation process conducted and evaluated? Worker pilot, crowdsourcing, and agreement
• Finding a crowdsourcing platform
– Moderator
» Usually an expert in data creation and management
» Comprehends the task and gives feedback from the workers' point of view
» Helps communication between researchers and workers
» Instructs, and sometimes hurries, workers to meet the timeline
» Manages financial or legal issues
» Lets researchers concentrate on the task itself
– Without a moderator?
» The researchers are the moderator! (unless there are some automated functions in the platform)
– With a moderator?
» The closest partner of the researchers
28. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 4: How is the annotation process conducted and evaluated? Worker pilot, crowdsourcing, and agreement
• Finding a crowdsourcing platform
– Existence and experience of the moderator
» Experience with similar dataset construction
» Comprehension of the task & proper feedback
» Sufficient worker pool
» Trust between the moderator and workers
– Reasonable cost estimation
» Appropriateness of the price per tagging or reviewing
» Appropriateness of worker compensation
» Fit with the budget
29. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 4: How is the annotation process conducted and evaluated? Worker pilot, crowdsourcing, and agreement
• Finding a crowdsourcing platform
– Usefulness of the platform UI
» Progress status (In progress, Submitted, Waiting for review... etc.)
» Statistics: the number of workers and reviewers, average work/review duration...
» Demographics, worker history by individual & in total...
30. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 4: How is the annotation process conducted and evaluated? Worker pilot, crowdsourcing, and agreement
• Pilot tagging (by workers)
– Goals of the worker pilot
» Guideline update from the workers' view (especially on cautions & exceptions)
» Worker selection
– Procedure
» Advertisement or recruitment
» Worker tagging
» Researchers' (or moderators') review & rejection
» Workers' revision & resubmission
31. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 4: How is the annotation process conducted and evaluated? Worker pilot, crowdsourcing, and agreement
• Details of the worker selection process
– Human checking
» Is the worker's ethical standard not too far from the guideline?
» Is feedback effective for the rejected samples?
– Automatic checking
» Enough taggings done?
» Too frequent cases of skipping the annotation?
UI screenshots provided by Deep Natural AI.
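The automatic checks above can be sketched as a simple filter over per-worker statistics (field names and thresholds are illustrative assumptions, not values from the slides):

```python
def select_workers(stats, min_tagged=100, max_skip_rate=0.2):
    """Keep workers who tagged enough items and did not skip too often.

    stats: list of dicts like {"tagged": int, "skipped": int}.
    Thresholds are illustrative; real projects tune them per task.
    """
    kept = []
    for w in stats:
        total = w["tagged"] + w["skipped"]
        skip_rate = w["skipped"] / total if total else 0.0
        if w["tagged"] >= min_tagged and skip_rate <= max_skip_rate:
            kept.append(w)
    return kept
```

Human checking (ethical standard, response to feedback) stays manual; only the volume/skip heuristics lend themselves to automation like this.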
32. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 4: How is the annotation process conducted and evaluated? Worker pilot, crowdsourcing, and agreement
• Crowdsourcing: a simplified scheme is required for crowd annotation!
– Multi-class, multi-attribute tagging
» 3 classes for bias
» 3 classes for toxicity
– Given a comment (without context), the annotator tags each attribute
– A detailed guideline (with examples, cautions, and exceptions) is provided separately

1. What kind of bias does the comment contain?
– Gender bias, Other biases, or None
2. Which is the adequate category for the comment in terms of toxicity?
– Hate, Offensive, or None
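The two-question scheme above can be captured as a small record type that rejects labels outside the two fixed label sets (a sketch; the identifier-style label names are assumptions):

```python
from dataclasses import dataclass

BIAS_LABELS = ("gender", "others", "none")        # assumed names for the 3 bias classes
TOXICITY_LABELS = ("hate", "offensive", "none")   # the 3 toxicity classes

@dataclass
class Annotation:
    """One worker's tags for one comment: one bias label, one toxicity label."""
    comment: str
    bias: str
    toxicity: str

    def __post_init__(self):
        # Validate at construction time so malformed submissions fail early
        if self.bias not in BIAS_LABELS:
            raise ValueError(f"unknown bias label: {self.bias}")
        if self.toxicity not in TOXICITY_LABELS:
            raise ValueError(f"unknown toxicity label: {self.toxicity}")
```

Keeping the crowd-facing schema this small is the point: the nuance lives in the separate detailed guideline, not in the label set.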
33. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 4: How is the annotation process conducted and evaluated? Worker pilot, crowdsourcing, and agreement
• Main annotation
– Based on the final version of the guideline
» 3–5 annotators (per sample) for usual classification tasks
– Tagging done by selected workers
» Worker selection and education
» A short quiz (if workers are not pre-selected)
– Annotation toolkit
» Assigns samples randomly to workers, with multiple annotators per sample
» Interface developed or provided by the platform (usually takes budget)
» Open-source interfaces (e.g., Label Studio)
– Data check for a further guarantee of quality
» What if there are sufficiently many annotators per sample?
» And if not...?
34. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 4: How is the annotation process conducted and evaluated? Worker pilot, crowdsourcing, and agreement
• Data selection after the main annotation (8,000 samples)
– The data reviewing strategy may differ by subtask
– Researchers decide the final label after adjudication
– Common for bias and toxicity
» Cases where all three annotators differ
– Only for toxicity
» Since the problem regards a continuum of degree, cases with only hate (o) and none (x) need to be investigated again
– Failure to decide (no majority vote) → discarded
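The majority-vote step can be sketched as follows (a minimal sketch; samples without a strict majority go to adjudication or are discarded, as described above):

```python
from collections import Counter

def adjudicate(labels):
    """Strict majority vote over one sample's annotations.

    Returns the winning label, or None when no label is chosen by
    more than half of the annotators (e.g. a 1-1-1 split among three).
    """
    top_label, top_count = Counter(labels).most_common(1)[0]
    if top_count > len(labels) / 2:
        return top_label
    return None
```

With three annotators, a 2-1 split resolves automatically; a 1-1-1 split returns None and is routed to researcher adjudication.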
35. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Step 4: How is the annotation process conducted and evaluated? Worker pilot, crowdsourcing, and agreement
• Final decision
– Test: 974
» Data tagged while constructing the guideline (mostly adjusted to the intention of the guideline)
– Validation: 471
» Data which went through tag/review/reject/accept in the pilot phase, done with a large number of annotators (roughly aligned with the guideline)
– Train: 7,896
» Data crowdsourced with the selected workers, not fully reviewed but adjudicated only for some special cases
• Agreement
– 0.492 for bias detection, 0.496 for toxicity measurement
Moon et al., BEEP! Korean Corpus of Online News Comments for Toxic Speech Detection, 2020.
36. Low-resource perspective
• Creating a hate speech corpus from scratch
▪ Beyond creation: model training and deployment
• Model training
– Traditionally
» High performance → relatively easy?
» Low performance → relatively challenging?
– But in PLM-based training these days...
» Pretraining corpora
» Model size
» Model architecture
– Model deployment
» Performance & size
» User feedback
Yang, Transformer-based Korean Pretrained Language Models: A Survey on Three Years of Progress, 2021.
37. Challenges
• Challenges of hate speech corpus construction
▪ Context-dependency
• News comments – articles
• Tweets – threads
• Web community comments – posts
▪ Multi-modal or noisy inputs
• Image and audio
– Kiela et al. (2020): Hateful Memes Challenge
• Perturbed texts
– Cho and Kim (2021): Leetspeak, Yaminjeongeum
Kiela et al., The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes, 2020.
Cho and Kim, Google-trickers, Yaminjeongeum, and Leetspeak: An Empirical Taxonomy for Intentionally Noisy User-Generated Text, 2021.
38. Challenges
• Challenges of hate speech corpus construction
▪ Categorical or binary output has limitations
• Limitation of categorizing the degree of intensity
– Hate/offensive/none categorization is sub-optimal
– Poletto et al. (2019): scale-based annotation with an Unbalanced Rating Scale
» Used to determine the label (or used as a target score?)
Poletto et al., Annotating Hate Speech: Three Schemes at Comparison, 2019.
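Turning averaged scale ratings into a single categorical label might look like the sketch below (the 0–4 scale follows the intensity range shown earlier; the thresholds are illustrative assumptions, not values from Poletto et al.):

```python
def scale_to_label(ratings, thresholds=(0.5, 2.0)):
    """Map averaged per-annotator ratings (e.g. on a 0-4 scale) to a coarse label.

    Thresholds are illustrative; an unbalanced scale deliberately places
    the category boundaries asymmetrically along the range.
    """
    score = sum(ratings) / len(ratings)
    if score < thresholds[0]:
        return "none"
    if score < thresholds[1]:
        return "offensive"
    return "hate"
```

Alternatively, the averaged score itself can be kept as a regression target instead of being collapsed into a label, as the slide's parenthetical suggests.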
39. Challenges
• Challenges of hate speech corpus construction
▪ Annotation requires multiple labels
• The aspect of discrimination may differ by attribute
– Gender, race, nationality, ageism ...
• Tagging 'all the target attributes' that appear?
– Kang et al. (2022)
» Detailed guideline with terms and concepts defined for each attribute

     W&F  Male  SexMin  Race&Nat  Age  Region  Relig  Other  Malic  None
S1    1    0     0        0       1    0       0      0      0      0
S2    0    0     0        0       0    0       0      0      1      0
S3    0    0     0        1       0    0       1      0      0      0
S4    0    0     0        0       0    0       0      0      0      1

(W&F = Women & family, SexMin = Sexual minorities, Race&Nat = Race & nationality, Age = Ageism, Region = Regionalism, Relig = Religion, Malic = Malicious)
Kang et al., Korean Online Hate Speech Dataset for Multilabel Classification - How Can Social Science Improve Dataset on Hate Speech?, 2022.
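Multilabel rows like the ones above are naturally represented as multi-hot vectors (the identifier-style attribute names below are chosen here for illustration):

```python
# Target attributes in Kang et al. (2022); identifier names chosen for this sketch
ATTRIBUTES = ["women_family", "male", "sexual_minorities", "race_nationality",
              "ageism", "regionalism", "religion", "other", "malicious", "none"]

def encode(tags):
    """Multi-hot encoding: 1 for each attribute tagged on the sample."""
    return [1 if a in tags else 0 for a in ATTRIBUTES]
```

For example, a sample tagged with both "women & family" and "ageism" (like S1 above) encodes to a vector with two 1s, which a multilabel classifier can be trained against with a per-attribute binary loss.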
40. Challenges
• Challenges of hate speech corpus construction
▪ Privacy and license issues
• Privacy and licenses can be violated by text crawling
• A hate speech corpus may contain personal information on (public) figures
• Text could have been brought from elsewhere (copy & paste)
▪ How about creating hate (and non-hate) speech from scratch?
• Yang et al. (2022): recruit workers and enable 'anonymous' text generation!
Yang et al., APEACH: Attacking Pejorative Expressions with Analysis on Crowd-Generated Hate Speech Evaluation Datasets, 2022.
41. Challenges
• Ambiguity is inevitable
▪ Text may admit various ways of interpretation
• Text may involve omission or replacement to trick monitoring
• The intention is apparent only when considering the context
• Temporal diachronicity of hate speech
▪ Non-hate speech from the past can be interpreted as hate speech these days
▪ Diachronicity may limit the utility of prediction systems
• e.g., [a name of a celebrity who committed a crime] before 20xx / after 20xx
• Boundary between hate speech and freedom of speech
▪ A grey area that cannot be resolved
• Some readers are offended by false positives
• Some users are offended by false negatives
42. Conclusion
• Hate speech is prevalent in real and cyber spaces
▪ Discussions on hate speech take diverse viewpoints, from academia to society and industry – and these are reflected in dataset construction
• No corpus is built perfectly from the beginning
▪ ... and hate speech is one of the most difficult kinds of corpora to create
• Considerations in low-resource hate speech corpus construction
▪ Why? How? How much? How well?
• Still more challenges left
▪ Context, input noise, output format, indecisiveness ...
• Takeaways
▪ There is a discrepancy between the theoretical and practical definitions of hate speech, and their aims may differ
▪ There is no hate speech detection guideline that satisfies ALL, so let's find the boundary that satisfies the most and improve it
43. Reference
• Waseem and Hovy, Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter, 2016.
• Davidson et al., Automated Hate Speech Detection and the Problem of Offensive Language, 2017.
• Sanguinetti et al., An Italian Twitter Corpus of Hate Speech against Immigrants, 2018.
• Assimakopoulos et al., Annotating for Hate Speech: The MaNeCo Corpus and Some Input from Critical Discourse Analysis, 2020.
• Moon et al., BEEP! Korean Corpus of Online News Comments for Toxic Speech Detection, 2020.
• Hong et al., Study on the State and Regulation of Hate Speech, 2016.
• Cho and Moon, How Does the Hate Speech Corpus Concern Sociolinguistic Discussions? A Case Study on Korean Online News Comments, 2021.
• Pustejovsky and Stubbs, Natural Language Annotation, 2012.
• Yang, Transformer-based Korean Pretrained Language Models: A Survey on Three Years of Progress, 2021.
• Kiela et al., The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes, 2020.
• Cho and Kim, Google-trickers, Yaminjeongeum, and Leetspeak: An Empirical Taxonomy for Intentionally Noisy User-Generated Text, 2021.
• Poletto et al., Annotating Hate Speech: Three Schemes at Comparison, 2019.
• Kang et al., Korean Online Hate Speech Dataset for Multilabel Classification – How Can Social Science Improve Dataset on Hate Speech?, 2022.
• Yang et al., APEACH: Attacking Pejorative Expressions with Analysis on Crowd-Generated Hate Speech Evaluation Datasets, 2022.