Understanding Online Social Harm: Examples of Harassment and Radicalization

Prof. Amit Sheth
Founding Director, Artificial Intelligence Institute (AI Institute)
University of South Carolina
AI @ UofSC
33rd Annual IFIP WG 11.3 Conference on Data and Applications Security
and Privacy (DBSec'19)
Charleston, SC, USA
July 15-17, 2019
Icons by thenounproject
Slides by SlideModel
The youngest adults stand out in their social media consumption

88% of 18- to 29-year-olds indicate that they use some form of social media.
Source: Pew Research Center, "Social Media Use Report 2018"
Social Good and Social Harm on Social Media

A spectrum demonstrating the variety of uses, from social good (positive effects) to social harm (negative effects):

Zika Virus Monitoring, Help, Fighting Depression, Disaster Relief, Opioid Usage Monitoring, Joking, Marketing, Sensationalizing, Harassment, Accusing, Rumouring, Deceiving, Fake News, Radicalization, Illicit Drugs

Adapted from: Purohit, Hemant & Pandey, Rahul. (2019). Intent Mining for the Good, Bad, and Ugly Use of Social Web: Concepts, Methods, and Challenges. 10.1007/978-3-319-94105-9_1.
Fake-porn videos are being weaponized to harass and humiliate women: 'Everybody is a potential target'

'Deepfakes' are disturbingly realistic, computer-generated videos made with photos taken from the Web, and ordinary women are suffering the damage.
Challenges -- Complex Problems

Ambiguity: different meanings of diagnostic terms.
Subjectivity: different perceptions of the same concepts.
Sparsity: low prevalence of relevant content.
Multidimensionality: nature of content with more than one context.
False Alarms: significant implications in a large-scale application.
Multimodal Content: different modalities of data.

Knowledge Graphs and Knowledge Networks: The Story in Brief - IEEE Internet Computing Magazine 2019
1. Online Harassment

Context in an interaction determines bad behavior.
Use-cases of Online Harassment

Severity of online harm can differ based on several criteria:
~ It can span more than a decade of one's life
~ It can lead to teenage suicides

News examples:
~ "Police accuse two students, age 12, of cyberbullying in suicide" - Jamiel Lynch, CNN
~ "Teenage cyber bullying victims are"
Traditional Approach vs Recent Advances

Existing approaches: Tweet content → binary classifier → {Harassing, Non-Harassing}

Our approach (incorporating context): Tweet content + people network → multiclass classifier → {Sexual harassment, Appearance-related harassment, ...}
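The contrast above can be sketched with a standard text-classification pipeline. This is a minimal illustration, not the authors' model: the example tweets, labels, and the choice of TF-IDF plus logistic regression are all assumptions made for demonstration.

```python
# Minimal sketch of the multiclass setting: predict a harassment *type*
# rather than a binary harassing/non-harassing flag.
# Tweets and labels below are fabricated for illustration only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

tweets = [
    "you are so ugly, delete your account",
    "nobody wants your kind in this country",
    "great game last night, congrats to the team",
    "she looks terrible in every single photo",
]
labels = ["appearance", "racial", "none", "appearance"]  # multiclass, not binary

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(tweets, labels)

pred = clf.predict(["why post at all when you look that terrible"])[0]
print(pred)
```

A real system would also need context features (e.g., the people network above), which a bag-of-words pipeline like this cannot capture.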
Challenges in Online Harassment: Problem Definition & Sparsity

Dataset                | # of Tweets | Classes (%)
Waseem et al. (2016)   | 16,093      | Racism (12%), Sexism (19.6%), Neither (68.4%)
Davidson et al. (2017) | 24,802      | Hate (5%), Non-hate (95%)
Zhang et al. (2018)    | 2,435       | Hate (17%), Non-hate (83%)

~ Mostly binary classifications
~ Datasets have small percentages of positive (harassing) instances

A Quality Type-aware Annotated Corpus and Lexicon for Harassment Research [Rezvan et al.] provides both a quality annotated corpus (Harassing: 12.9%, Non-Harassing: 87.1%) and an offensive-words lexicon capturing different types of harassment content:
(i) sexual (7.4%)
(ii) racial (22.5%)
(iii) appearance-related (21.8%)
(iv) intellectual (26%)
(v) political (22.4%)
Challenges in Online Harassment: Subjectivity

According to Pew Research Center (2017).

An interaction example from a high-school students' tweet corpus:

"pEoPlE dO iT tO tHeMsElVeS"

"shut the f**k up you know what im not gonna argue anymore u guys are all so f**king ignorant when it comes to addiction so PLEASe stop f**king speaking on it"

"all the girls who i have beef w are little ass girls who think they can say the n word but get scared when black guys come around 🤔 🤔 on that note, gn twit"
Challenges in Online Harassment Detection: Ambiguity

Researchers have defined harassment using jargon that overlaps, causing ambiguity in annotations.

“Language used to express hatred towards a targeted individual or group, or is intended to be derogatory, to humiliate, or to insult the members of the group, on the basis of attributes such as race, religion, ethnic origin, sexual orientation, disability, or gender is hate speech” - Founta et al. 2018

“Profanity, strongly impolite, rude or vulgar language expressed with fighting or hurtful words in order to insult a targeted individual or group is offensive language” - Founta et al. 2018

Examples from the high-school students' tweet corpus:

Ex. 1: @user_name nah you just a dumb hoe who doesn’t know her place 😂 😂
This tweet qualifies as both hate speech and offensive language under the above definitions.

Ex. 2: IS THAT A MICROAGRESSION AGAINST MEXICANS BY STEREOTYPING THEM AS ILLEGALS?!? only if you were vegan you wouldn’t be such a racist pig
This tweet falls into the category of hate speech, but not necessarily offensive language.
Online Harassment - Dimensions

[REASON] What causes online harassment?
~ Victim: appearance, religion, race, etc.
~ Harasser: xenophobia, homophobia, intolerance, etc.

[METHOD] How can online harassment happen?
~ Direct vs. indirect; frequent vs. one-time; cyberbullying vs. cyber-aggression
~ Flaming (the act of posting or sending offensive messages over the Internet)
~ Doxxing (broadcasting private or identifying information of individuals)
~ Dogpiling (several people on Twitter addressing someone, usually negatively, in a short period of time)
~ Impersonation
~ Public shaming
~ Threats

[RESULT] What are the effects of online harassment? The victim feels:
~ Offended
~ Discriminated against
~ Afraid of losing life or losing social capital
~ Depressed
~ Suicidal

[ACTORS]
~ Harasser
~ Victim
~ Bystanders:
  a. Aggravators (people who try to fuel a harassing situation indirectly, for example by retweeting harassing tweets)
  b. Empathizers (people who empathize with the victim, providing support)

[PLATFORM] Where?
~ Social media (Twitter, Facebook, Instagram)
~ Discussion boards (Reddit, 4chan)
~ Email
~ Private messaging
~ Online gaming
* Policies of platforms also matter
Current Research Directions Pursued

Resolving the data-scarcity issue
~ Using generative adversarial networks (GANs) to generate text, increasing the positive (harassing) examples in a dataset
~ Changing the generator objective function in the GAN to incorporate domain-specific knowledge

Multiclass classification of harassing tweets
~ Harassment-type prediction was done using a multiclass classifier
~ Tweet vectorization leveraged domain knowledge in the form of an offensive-words lexicon
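One simple way to fold an offensive-words lexicon into tweet vectorization is to append lexicon-based features to an ordinary bag-of-words vector. This is a sketch under assumptions: the tiny lexicon and the "fraction of offensive tokens" feature below are illustrative, not the design of Rezvan et al.

```python
# Sketch: augment a bag-of-words tweet vector with a lexicon feature.
# The tiny lexicon below is an illustrative stand-in, not the published one.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

OFFENSIVE_LEXICON = {"dumb", "ugly", "stupid", "hoe"}  # hypothetical subset

def vectorize_with_lexicon(tweets):
    vec = CountVectorizer()
    bow = vec.fit_transform(tweets).toarray().astype(float)
    # One extra column: fraction of tokens found in the offensive lexicon.
    lex = []
    for t in tweets:
        toks = t.lower().split()
        hits = sum(tok.strip(".,!?") in OFFENSIVE_LEXICON for tok in toks)
        lex.append(hits / max(len(toks), 1))
    return np.hstack([bow, np.array(lex)[:, None]])

X = vectorize_with_lexicon(["you are so dumb and ugly", "have a nice day"])
print(X.shape, X[0, -1], X[1, -1])
```

The last column gives a downstream classifier a direct, domain-informed signal even for words it never saw during training.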
Key Takeaways

● In spite of recognition of its importance, the problem is still not well understood and not well defined.
● Use cases and data show that the problem is far more complex and nuanced than commonly assumed. Understanding context is critical.
● Current social media platforms appear to have little or no automated processes for detection and prevention. Oversimplified problem definitions that rely largely on machine learning without significant domain-specific knowledge have rendered solutions practically useless.
2. Online Extremist Communications

Islamist Extremism and White Supremacy
Unsolved: Detection of Extremism on Social Media

Efforts by High-Tech Companies
● The capabilities of social media companies (Twitter, Facebook and Google) are inadequate and ineffective.
● Governments have insisted that the industry has a 'social responsibility' to do more to remove harmful content.
● If unsolved, social media platforms will continue to negatively impact society.
"The Travelers"*

● One thousand Americans between 1980 and 2011; 300 Americans since 2011 attempted or traveled.
● More than 5,000 individuals from Europe traveled to join extremist terrorist groups (ISIS, Al-Qaeda) abroad through 2015.
● Most were inspired and persuaded online.

*George Washington University, Program on Extremism
Illustrative Case*

● A 24-year-old college student from Alabama became radicalized on Twitter. After a year, she moved to Syria to join ISIS.
● Self-taught, she read verses from the Qur'an, but interpreted them with others in the extremist network.
● She was persuaded that when the true Islamic State is declared, it is obligatory to do hijrah, which they see as the pilgrimage to 'the State'.

*New York Times: "Alabama Woman Who Joined ISIS Can't Return Home, U.S. Says"
Radicalization Scale (Achilov et al.)

0 - None: Mainstream religious views and orientations.
  Indicators: Islam; Allah; jihad (self struggle); halal; democracy; salah; fatwa; hajj.

1 - Low: Attitudinal support for politically moderate Islamism.
  Indicators: Hadith; Caliphate (Khilafah) justified; Sharia better (than secular law); hypocrisy of the West.

2 - Elevated: Emergent support for exclusive rule of Shari'a law.
  Indicators: Shariah best; revenge (justified); jihad (against the West); justifying Daesh (ISIS).

3 - High: Support for extremist networks and travel to "Darul Islam".
  Indicators: kafir; infidel; hijrah to Darul-Islam; (supporting) fatwa of Al-Awlaki; mushrikeen.

4 - Severe: Call for action to join the fight and the use of violence.
  Indicators: apostate; sahwat; taghut; kill; kafir; kuffar; murtadd; tawaghit; al_baghdadi; martyrdom; khilafah.
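One crude way to operationalize such an indicator scale is keyword scoring. The sketch below is a deliberate oversimplification under abbreviated, hand-picked term lists; it shows the mechanics and, equally, why context-free matching is insufficient, which is the argument of the following slides.

```python
# Crude keyword scoring against the scale's indicator lists.
# Deliberately naive: it ignores context, which the talk argues is essential
# (e.g., "jihad" by itself is a level-0 indicator of mainstream usage).
INDICATORS = {  # level -> abbreviated sample of indicator terms from the scale
    1: {"hadith", "khilafah", "sharia"},
    2: {"revenge", "daesh"},
    3: {"kafir", "infidel", "hijrah", "mushrikeen"},
    4: {"apostate", "taghut", "kuffar", "murtadd", "martyrdom"},
}

def scale_level(text):
    toks = {w.strip(".,!?") for w in text.lower().split()}
    hits = [lvl for lvl, terms in INDICATORS.items() if toks & terms]
    return max(hits, default=0)  # 0 = None (mainstream views)

print(scale_level("a hadith about patience"))            # matches level 1 only
print(scale_level("kill the kuffar, martyrdom awaits"))  # matches level 4
```

A scorer like this would misclassify scholarly or journalistic text that merely mentions these terms, motivating the context-aware modeling described next.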
Radicalization Process over Time

A non-extremist, ordinary individual can move through the stages of the scale (0 None → 1 Low → 2 Elevated → 3 High → 4 Severe) to become a radicalized extremist individual.

Ultimately, analysis of content in context will provide a better, finer-grained understanding of the underlying factors in the radicalization process.
Islamist Extremism on Social Media - Challenges

~ Radicalization: a psychological process over a time period.
~ Modeling users (e.g., recruiter, follower) with respect to different stages of radicalization.
~ Domain Knowledge: persuasive content relevant to Islamist extremism.
~ Multidimensionality of the context ("jihad" has different meanings in different contexts).
~ Security Implications / False Alarms: unfair classification of non-extremist individuals as extremist; a false alarm might potentially impact millions of innocent people.
There are local and global security implications while predicting online terrorist activities and involved individuals.
Multidimensionality of Extremist Content

● Dimensions to define the context:
○ Based on the literature and our empirical study of the data, three contextual dimensions are identified: Religion, Ideology, Hate
● The distribution of prevalent terms (i.e., words, phrases, concepts) in each dimension is different.
● These terms should be represented in different dimensions, to disambiguate especially diagnostic terms (e.g., jihad).
Prevalent Key Phrases and Topics in Extremist Content

Corpus: 538 Twitter users verified as extremists, 48K tweets. (In the original slide, terms are color-coded by dimension: green = Religion, blue = Ideology, red = Hate.)

Prevalent key phrases: isis, syria, kill, iraq, muslim, allah, attack, break, aleppo, assad, islamicstate, army, soldier, cynthiastruth, islam, support, mosul, libya, rebel, destroy, airstrike; caliphate_news, islamic_state, iraq_army, soldier_kill, iraqi_army, syria_isis, syria_iraq, assad_army, terror_group, shia_militia, isis_attack, aleppo_syria, martyrdom_operation, ahrar_sham, assad_regime, follow_support, lead_coalition, turkey_army, isis_claim, kill_isis; imam_anwar_awlaki, video_message_islamicstate, fight_islamic_state, isisclaim_responsibility_attack, muwahideen_powerful_middleeast, isis_tikrit_tikritop, amaqagency_islamicstate_fighter, sinai_explosion_target, alone_state_fighter, intelligence_reportedly_kill, khilafahnew_islamic_state, yemanqaida_commander_kill, isis_militant_hasakah, breakingnew_assad_army, isis_explode_middle, hater_trier_haleemah, trust_isis_tighten, qamishlus_isis_fighting, defeat_enemy_allah, kill_terrorist_baby, ahrar_sham_leader

Prevalent topics: islamic state, syria, isis, kill, allah, video, minute propaganda video scenes, jaish islam release, restock missile, kaffir, join isis, aftermath, mercy, martyrdom operation syrian opposition, punish libya isis, syria assad, islam sunni, swat, lose head, wilayatalfurat, somali, child kill, takfir, jaish fateh, baghdad, iraq, kashmir muslim, capture, damascus, report rebel, british, qala moon, jannat, isis capture, border cross, aleppo, iranian soldier, tikrit tikrittop, lead shia military kill, saleh abdeslam refuse cooperate
Example Tweets with "Jihad"

"Jihad" can appear in tweets with different meanings in different dimensions of the context:

[Hate] "Reportedly, a number of apostates were killed in the process. Just because they like it I guess.. #SpringJihad #CountrysideCleanup"

[Religion] "Kindness is a language which the blind can see and the deaf can hear #MyJihad be kind always"

[Ideology] "By the Lord of Muhammad (blessings and peace be upon him) The nation of Jihad and martyrdom can never be defeated"
Ambiguity of Diagnostic Terms/Phrases

● The same term can have different meanings in each dimension.
● Example: the meaning of "jihad" is different for extremists and non-extremists.
○ For extremists, the meaning is closer to "awlaki", "islamic state", "aqeedah".
○ For non-extremists, it is closer to "muslims", "quran", "imams".
Contextual Dimension Modeling

● Different contextual dimensions incorporating:
○ Knowledge graphs
○ Dimension corpora
● Utilization of deep learning models to generate knowledge-enhanced representations
● KG creation:
○ Religion: Qur'an, Hadith
○ Ideology: books and lectures of ideologues
○ Hate (corpus rather than KG): Hate Speech Corpus (Davidson et al. 2017)
● Can be applied to many social problems.

Dimension Modeling Process: each dimension (Religion, Ideology, Hate) is modeled separately, yielding a dimension-based, knowledge-enhanced representation.
Modeling: Using a Knowledge Graph

“You shall know a word by the company it keeps” (J. R. Firth 1957: 11)

Capturing similarity:
● Learning word similarities from a substantial knowledge graph.
● A solution via distance between concepts in the knowledge graph.
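Concept similarity via graph distance can be sketched as a shortest-path computation over a toy knowledge graph. The graph below is invented for illustration; a real system would use a substantial KG built from sources such as the Qur'an or Hadith, as described above.

```python
# Toy sketch: similarity as inverse shortest-path distance in a small KG.
# The edges below are invented for illustration, not from an actual KG.
from collections import deque

EDGES = {
    "jihad": ["struggle", "quran"],
    "struggle": ["jihad", "self-improvement"],
    "quran": ["jihad", "hadith"],
    "hadith": ["quran"],
    "self-improvement": ["struggle"],
}

def distance(a, b):
    """BFS shortest-path length between two concepts; None if unreachable."""
    seen, frontier = {a}, deque([(a, 0)])
    while frontier:
        node, d = frontier.popleft()
        if node == b:
            return d
        for nxt in EDGES.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, d + 1))
    return None

def similarity(a, b):
    d = distance(a, b)
    return 0.0 if d is None else 1.0 / (1.0 + d)

print(similarity("jihad", "quran"))   # adjacent concepts
print(similarity("jihad", "hadith"))  # two hops apart
```

Inverse path length is only one of several KG-based similarity choices; weighted edges or relation types would refine it.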
Modeling: Using a Corpus

“You shall know a word by the company it keeps” (J. R. Firth 1957: 11)

Capturing similarity (and resolving ambiguity):
● Learning word similarities from a large corpus.
● A solution via distributional similarity-based representations.
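Distributional similarity can be illustrated with a tiny co-occurrence model: words appearing in similar contexts receive similar vectors, directly implementing Firth's dictum. The four-sentence corpus is invented purely to show the mechanics.

```python
# Toy sketch of distributional similarity: words that keep similar company
# get similar co-occurrence vectors. The corpus is invented for illustration.
import numpy as np
from itertools import combinations

corpus = [
    "the imam discussed jihad as inner struggle",
    "the imam discussed charity as inner duty",
    "fighters praised jihad as armed struggle",
    "fighters praised violence as armed struggle",
]

vocab = sorted({w for s in corpus for w in s.split()})
idx = {w: i for i, w in enumerate(vocab)}
cooc = np.zeros((len(vocab), len(vocab)))
for s in corpus:
    for a, b in combinations(s.split(), 2):  # sentence-level co-occurrence
        cooc[idx[a], idx[b]] += 1
        cooc[idx[b], idx[a]] += 1

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# "jihad" shares contexts with both "charity" (religious sentences)
# and "violence" (militant sentences) -- the source of its ambiguity.
sim_charity = cosine(cooc[idx["jihad"]], cooc[idx["charity"]])
sim_violence = cosine(cooc[idx["jihad"]], cooc[idx["violence"]])
print(round(sim_charity, 3), round(sim_violence, 3))
```

That a single vector for "jihad" sits between both neighborhoods is exactly why separate dimension corpora (Religion, Ideology, Hate) help disambiguation.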
Density of Dimensions in Extremist Content

● Found two distinct groups employing different contexts with different density.
● Religion and Hate are usually mixed, suggesting that extremists might employ different hate tactics.
● A small group of users employs ideological context far more often than others, suggesting these users might be disseminators of ideologically intense content.
Results

● The tri-dimension model performs best.
● Precision is used as the metric, to emphasize the reduction of misclassification of non-extremist content.
● This has implications in a large-scale application.
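Precision is the natural metric when false positives (non-extremists flagged as extremist) are the costly error: it asks, of everything the model flags, how much was right. A quick illustration of the arithmetic, with entirely fabricated labels:

```python
# Why precision: of everything flagged "extremist", how much is correct?
# The labels below are fabricated solely to show the arithmetic.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]   # 1 = extremist content
y_pred = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]   # one false alarm at index 6

tp = sum(t == p == 1 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
precision = tp / (tp + fp)  # 2 true positives, 1 false positive
print(precision)
```

At the scale of a real platform, even a small drop in precision translates into enormous numbers of innocent users flagged, which is why the slide emphasizes it over recall or accuracy.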
Key Insights

● False alarms are significantly reduced via incorporation of the three specific dimensions of context.
● Extremist users employ religion along with hate, suggesting they employ different hate tactics for their targets.
● Inclusion of all three contextual dimensions significantly reduces the likelihood of unfair mistreatment of non-extremist individuals in a real-world application.
● Each dimension plays a different role at different levels of radicalization, better capturing nuances as well as linguistic and semantic cues throughout the radicalization process.
Our Highly Multidisciplinary Approach

Levels of analysis, from neuro-cognitive process through cognition and social interactions to public/society:

● The human brain processes information from extremist narratives on social media, which include different contexts, emotions, sentiment, etc.
● Individuals change behavior and make choices in consuming/sharing content with an intent.
● Coordination, information flow and diffusion occur on social networks.
● Outcomes/impact on society arise through events and collective actions (e.g., civil war or the result of an election).
Weaponized · Ambiguity · Sparsity · Complexity
Supporting Grants

Context-Aware Harassment Detection on Social Media (wiki link) is an interdisciplinary project among the Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis), the Department of Psychology, and the Center for Urban and Public Affairs (CUPA) at Wright State University.

We are supported by NSF Award #CNS-1513721.
Thank You!

Special Thanks: Ugur Kursuncu and Thilini Wijesiriwardene

References

1. Hinduja, S. and Patchin, J.W., 2010. Bullying, cyberbullying, and suicide. Archives of Suicide Research, 14(3), pp. 206-221.
2. Rezvan, M., Shekarpour, S., Balasuriya, L., Thirunarayan, K., Shalin, V.L. and Sheth, A., 2018. A Quality Type-aware Annotated Corpus and Lexicon for Harassment Research. In Proceedings of the 10th ACM Conference on Web Science (pp. 33-36). ACM.
3. Waseem, Z., 2016. Are You a Racist or Am I Seeing Things? Annotator Influence on Hate Speech Detection on Twitter. In Proceedings of the Workshop on NLP and Computational Social Science (pp. 138-142). Association for Computational Linguistics.
4. Davidson, T., Warmsley, D., Macy, M. and Weber, I., 2017. Automated Hate Speech Detection and the Problem of Offensive Language. In Proceedings of the 11th International AAAI Conference on Web and Social Media. AAAI.
5. Zhang, Z. and Luo, L., 2018. Hate Speech Detection: A Solved Problem? The Challenging Case of Long Tail on Twitter. Semantic Web. 10.3233/SW-180338.
1 of 37

Recommended

Computational Social Science as the Ultimate Web Intelligence by
Computational Social Science  as the Ultimate Web IntelligenceComputational Social Science  as the Ultimate Web Intelligence
Computational Social Science as the Ultimate Web IntelligenceAmit Sheth
520 views15 slides
AIISC’s research in Social Good/Social Harm, Public Health and Epidemiology by
AIISC’s research in  Social Good/Social Harm,  Public Health and EpidemiologyAIISC’s research in  Social Good/Social Harm,  Public Health and Epidemiology
AIISC’s research in Social Good/Social Harm, Public Health and EpidemiologyArtificial Intelligence Institute at UofSC
275 views55 slides
Transforming Social Big Data into Timely Decisions and Actions for Crisis Mi... by
Transforming Social Big Data into Timely Decisions  and Actions for Crisis Mi...Transforming Social Big Data into Timely Decisions  and Actions for Crisis Mi...
Transforming Social Big Data into Timely Decisions and Actions for Crisis Mi...Amit Sheth
985 views62 slides
Twitris in Action - a review of its many applications by
Twitris in Action - a review of its many applications Twitris in Action - a review of its many applications
Twitris in Action - a review of its many applications Amit Sheth
121 views32 slides
Exploring digital fake news phenomenon in indonesia cpr south_short_pdf by
Exploring digital fake news phenomenon in indonesia cpr south_short_pdfExploring digital fake news phenomenon in indonesia cpr south_short_pdf
Exploring digital fake news phenomenon in indonesia cpr south_short_pdfRiri Kusumarani
639 views23 slides
Emergency relief services in the social media age by
Emergency relief services in the social media ageEmergency relief services in the social media age
Emergency relief services in the social media ageEvanMeduna
53 views18 slides

More Related Content

What's hot

Public Health Crisis Analytics for Gender Violence by
Public Health Crisis Analytics for Gender ViolencePublic Health Crisis Analytics for Gender Violence
Public Health Crisis Analytics for Gender ViolenceHemant Purohit
6.9K views20 slides
Semantic Social Mashup approach for Designing Citizen Diplomacy by
Semantic Social Mashup approach for Designing Citizen DiplomacySemantic Social Mashup approach for Designing Citizen Diplomacy
Semantic Social Mashup approach for Designing Citizen DiplomacyAmit Sheth
516 views3 slides
1112 social media and public health by
1112 social media and public health1112 social media and public health
1112 social media and public healthMélodie YunJu Song
75 views33 slides
LIS 60030 Final Project by
LIS 60030 Final ProjectLIS 60030 Final Project
LIS 60030 Final ProjectLaura Levy
77 views8 slides
Targeted disinformation warfare how and why foreign efforts are by
Targeted disinformation warfare  how and why foreign efforts areTargeted disinformation warfare  how and why foreign efforts are
Targeted disinformation warfare how and why foreign efforts arearchiejones4
26 views37 slides
The case for integrating crisis response with social media by
The case for integrating crisis response with social media The case for integrating crisis response with social media
The case for integrating crisis response with social media American Red Cross
4.8K views32 slides

What's hot(20)

Public Health Crisis Analytics for Gender Violence by Hemant Purohit
Public Health Crisis Analytics for Gender ViolencePublic Health Crisis Analytics for Gender Violence
Public Health Crisis Analytics for Gender Violence
Hemant Purohit6.9K views
Semantic Social Mashup approach for Designing Citizen Diplomacy by Amit Sheth
Semantic Social Mashup approach for Designing Citizen DiplomacySemantic Social Mashup approach for Designing Citizen Diplomacy
Semantic Social Mashup approach for Designing Citizen Diplomacy
Amit Sheth516 views
LIS 60030 Final Project by Laura Levy
LIS 60030 Final ProjectLIS 60030 Final Project
LIS 60030 Final Project
Laura Levy77 views
Targeted disinformation warfare how and why foreign efforts are by archiejones4
Targeted disinformation warfare  how and why foreign efforts areTargeted disinformation warfare  how and why foreign efforts are
Targeted disinformation warfare how and why foreign efforts are
archiejones426 views
The case for integrating crisis response with social media by American Red Cross
The case for integrating crisis response with social media The case for integrating crisis response with social media
The case for integrating crisis response with social media
American Red Cross4.8K views
Helping Crisis Responders Find the Informative Needle in the Tweet Haystack by COMRADES project
Helping Crisis Responders Find the Informative Needle in the Tweet HaystackHelping Crisis Responders Find the Informative Needle in the Tweet Haystack
Helping Crisis Responders Find the Informative Needle in the Tweet Haystack
COMRADES project102 views
And Then the Internet Happened Prospective Thoughts about Concept Mapping in ... by Daniel McLinden
And Then the Internet Happened Prospective Thoughts about Concept Mapping in ...And Then the Internet Happened Prospective Thoughts about Concept Mapping in ...
And Then the Internet Happened Prospective Thoughts about Concept Mapping in ...
Daniel McLinden761 views
Information disorder: Toward an interdisciplinary framework for research and ... by friendscb
Information disorder: Toward an interdisciplinary framework for research and ...Information disorder: Toward an interdisciplinary framework for research and ...
Information disorder: Toward an interdisciplinary framework for research and ...
friendscb991 views
Humanitarian Diplomacy in the Digital Age: Analysis and use of digital inform... by Keith Powell
Humanitarian Diplomacy in the Digital Age: Analysis and use of digital inform...Humanitarian Diplomacy in the Digital Age: Analysis and use of digital inform...
Humanitarian Diplomacy in the Digital Age: Analysis and use of digital inform...
Keith Powell773 views
Disaster data informatics for situation awareness by Ashutosh Jadhav
Disaster data informatics for situation awareness Disaster data informatics for situation awareness
Disaster data informatics for situation awareness
Ashutosh Jadhav1.4K views
Vision track october_2020_fernandez_v5 by Miriam Fernandez
Vision track october_2020_fernandez_v5Vision track october_2020_fernandez_v5
Vision track october_2020_fernandez_v5
Miriam Fernandez334 views
disinformation risk management: leveraging cyber security best practices to s... by Sara-Jayne Terp
disinformation risk management: leveraging cyber security best practices to s...disinformation risk management: leveraging cyber security best practices to s...
disinformation risk management: leveraging cyber security best practices to s...
Sara-Jayne Terp315 views
And Then the Internet Happened Prospective Thoughts about Concept Mapping in ... by Daniel McLinden
And Then the Internet Happened Prospective Thoughts about Concept Mapping in ...And Then the Internet Happened Prospective Thoughts about Concept Mapping in ...
And Then the Internet Happened Prospective Thoughts about Concept Mapping in ...
Daniel McLinden73 views
Big Data Analysis and Terrorism by Amanda Tapp
Big Data Analysis and TerrorismBig Data Analysis and Terrorism
Big Data Analysis and Terrorism
Amanda Tapp502 views
A Systematic Survey on Detection of Extremism in Social Media by RSIS International
A Systematic Survey on Detection of Extremism in Social MediaA Systematic Survey on Detection of Extremism in Social Media
A Systematic Survey on Detection of Extremism in Social Media
RSIS International107 views
A Communicator's Guide to COVID-19 Vaccination by Sarah Jackson
A Communicator's Guide to COVID-19 VaccinationA Communicator's Guide to COVID-19 Vaccination
A Communicator's Guide to COVID-19 Vaccination
Sarah Jackson8.6K views

Similar to Understanding Online Socials Harm: Examples of Harassment and Radicalization

Cyberbullying jaffe ryan by
Cyberbullying jaffe ryanCyberbullying jaffe ryan
Cyberbullying jaffe ryanshjaffe
307 views18 slides
Glass RM Spring 2016 Final by
Glass RM Spring 2016 FinalGlass RM Spring 2016 Final
Glass RM Spring 2016 FinalElizabeth Glass
71 views26 slides
ASA style sample by
ASA style sampleASA style sample
ASA style sampleMarie Fincher
530 views6 slides
Respond to 2 students and professor. 150 words each student no word by
Respond to 2 students and professor. 150 words each student no word Respond to 2 students and professor. 150 words each student no word
Respond to 2 students and professor. 150 words each student no word mickietanger
3 views6 slides
Media Aggression And Aggressive Behavior Essay by
Media Aggression And Aggressive Behavior EssayMedia Aggression And Aggressive Behavior Essay
Media Aggression And Aggressive Behavior EssayAngela Williams
2 views77 slides

Similar to Understanding Online Socials Harm: Examples of Harassment and Radicalization(20)

Cyberbullying jaffe ryan by shjaffe
Cyberbullying jaffe ryanCyberbullying jaffe ryan
Cyberbullying jaffe ryan
shjaffe307 views
Respond to 2 students and professor. 150 words each student no word by mickietanger
Respond to 2 students and professor. 150 words each student no word Respond to 2 students and professor. 150 words each student no word
Respond to 2 students and professor. 150 words each student no word
mickietanger3 views
Media Aggression And Aggressive Behavior Essay by Angela Williams
Media Aggression And Aggressive Behavior EssayMedia Aggression And Aggressive Behavior Essay
Media Aggression And Aggressive Behavior Essay
Angela Williams2 views
Cyberbullying and Hate Speech by Brandwatch
Cyberbullying and Hate SpeechCyberbullying and Hate Speech
Cyberbullying and Hate Speech
Brandwatch1.8K views
lis 3201 Final presentation by Monte VanDyke
lis 3201 Final presentationlis 3201 Final presentation
lis 3201 Final presentation
Monte VanDyke500 views
Web Utopia Lost: Where Do We Go From Here by Shireen Mitchell
Web Utopia Lost: Where Do We Go From HereWeb Utopia Lost: Where Do We Go From Here
Web Utopia Lost: Where Do We Go From Here
Shireen Mitchell69 views
The Bystander Effect On Children Essay by Amanda Reed
The Bystander Effect On Children EssayThe Bystander Effect On Children Essay
The Bystander Effect On Children Essay
Amanda Reed2 views
Migratory Implications Of Media On Interracial Relationships by Leslie Lee
Migratory Implications Of Media On Interracial RelationshipsMigratory Implications Of Media On Interracial Relationships
Migratory Implications Of Media On Interracial Relationships
Leslie Lee2 views
Adolescent and Young Adult Social Media Use: Using For a Purpose, but Resulti... by samhauck
Adolescent and Young Adult Social Media Use: Using For a Purpose, but Resulti...Adolescent and Young Adult Social Media Use: Using For a Purpose, but Resulti...
Adolescent and Young Adult Social Media Use: Using For a Purpose, but Resulti...
samhauck46 views
World Hunger Is A Serious Problem by Susan Tullis
World Hunger Is A Serious ProblemWorld Hunger Is A Serious Problem
World Hunger Is A Serious Problem
Susan Tullis3 views
Defense Against The Digital Dark Arts: Navigating Online Spaces as a Journali... by Michelle Ferrier
Defense Against The Digital Dark Arts: Navigating Online Spaces as a Journali...Defense Against The Digital Dark Arts: Navigating Online Spaces as a Journali...
Defense Against The Digital Dark Arts: Navigating Online Spaces as a Journali...
Michelle Ferrier89 views

Recently uploaded

SOCO 9.pdf by
SOCO 9.pdfSOCO 9.pdf
SOCO 9.pdfSocioCosmos
6 views1 slide
Soco 7.pdf by
Soco 7.pdfSoco 7.pdf
Soco 7.pdfSocioCosmos
8 views1 slide
The Playing cards.pptx by
The Playing cards.pptxThe Playing cards.pptx
The Playing cards.pptxdivyabhana2
25 views5 slides
sOCO 9.pdf by
sOCO 9.pdfsOCO 9.pdf
sOCO 9.pdfSocioCosmos
5 views1 slide
SOCO 8.pdf by
SOCO 8.pdfSOCO 8.pdf
SOCO 8.pdfSocioCosmos
6 views1 slide
Soco 11 (2).pdf by
Soco 11 (2).pdfSoco 11 (2).pdf
Soco 11 (2).pdfSocioCosmos
6 views1 slide

Recently uploaded(8)

Understanding Online Socials Harm: Examples of Harassment and Radicalization

  • 1. Understanding Online Socials Harm: Examples of Harassment and Radicalization Prof. Amit Sheth Founding Director, AI Institute University of South Carolina AI @ UofSC 33rd Annual IFIP WG 11.3 Conference on Data and Applications Security and Privacy (DBSec'19) Charleston, SC, USA July 15 -17, 2019 Icons by thenounproject Slides by SlideModel
  • 2. 2 The youngest adults stand out in their social media consumption 88% of 18- to 29-year-olds indicate that they use any form of social media. By Pew Research Center “Social Media Use Report 2018”
  • 3. 3 Social Good and Social Harm on Social Media A spectrum to demonstrate the variety of social good to social harm Adapted from : Purohit, Hemant & Pandey, Rahul. (2019). Intent Mining for the Good, Bad, and Ugly Use of Social Web: Concepts, Methods, and Challenges. 10.1007/978-3-319-94105-9_1. Zika Virus Monitoring Help Fighting depression Disaster Relief Opioid Usage Monitoring Joking Marketing Sensationalizing Harassment Accusing Rumouring Deceiving Fake News Radicalization Illicit Drugs Social HarmSocial Good Positive Effects Negative Effects
  • 4. 4 Fake-porn videos are being weaponized to harass and humiliate women: ‘Everybody is a potential target’ ‘Deepfake’ disturbingly realistic, computer-generated videos with photos taken from the Web, and ordinary women are suffering the damage
  • 5. 5 Different meanings of diagnostic terms Ambiguity Different perceptions of same concepts. Subjectivity Low prevalence of relevant content Sparsity Nature of content with more than one context. Multi Dimensionality Significant implications in a big scale application. False Alarms Knowledge Graphs and Knowledge Networks: The Story in Brief - IEEE Internet Computing Magazine 2019 Multimodal Content Different modalities of data Challenges --Complex Problems
  • 6. 6 1. Online Harassment Context in an interaction determines bad behavior.
  • 7. 7 Severity of online harm can differ based on several criteria It can span for more than a decade in one’s life Or it can lead to teenage suicides Police accuse two students, age 12, of cyberbullying in suicide By Jamiel Lynch, CNN Teenage cyber bullying victims are Use-cases of Online Harassment
  • 8. 9 Existing Approaches Tweet Content Binary Classifier Harassing Non-Harassing People Network Our Approach (Incorporating context) Tweet Content Multiclass Classifier Sexual harassment Appearance related Harassment Traditional Approach vs Recent Advances
  • 9. 10 Challenges in Online Harassment: Problem Definition & Sparsity. Existing datasets are mostly binary classifications with small percentages of positive (harassing) instances: Waseem et al. (2016), 16,093 tweets: Racism (12%), Sexism (19.6%), Neither (68.4%); Davidson et al. (2017), 24,802 tweets: Hate (5%), Non-hate (95%); Zhang et al. (2018), 2,435 tweets: Hate (17%), Non-hate (83%). A Quality Type-aware Annotated Corpus and Lexicon for Harassment Research [Rezvan et al.] provides both a quality annotated corpus (Harassing 12.9%, Non-harassing 87.1%) and an offensive-words lexicon capturing different types of harassment content: (i) sexual (7.4%), (ii) racial (22.5%), (iii) appearance-related (21.8%), (iv) intellectual (26%), (v) political (22.4%).
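The imbalance numbers above are why naive accuracy is misleading. As a quick arithmetic check using the 5% positive rate of Davidson et al. from the table, a trivial classifier that labels everything non-harassing looks about 95% accurate while detecting nothing:

```python
def baseline_metrics(n_total, n_positive):
    """Accuracy and recall of the trivial classifier that labels every
    instance as non-harassing."""
    accuracy = (n_total - n_positive) / n_total
    recall = 0.0  # no positive (harassing) instance is ever flagged
    return accuracy, recall

# Davidson et al. (2017): 24,802 tweets, ~5% labeled hate.
acc, rec = baseline_metrics(24_802, round(24_802 * 0.05))
```

This is why the sparsity challenge forces evaluation with precision/recall on the positive class rather than overall accuracy.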
  • 10. 12 Challenges in Online Harassment: Subjectivity (per Pew Research Center, 2017). An interaction example from a high-school students’ tweet corpus: “pEoPlE dO iT tO tHeMsElVeS” / “shut the f**k up you know what im not gonna argue anymore u guys are all so f**king ignorant when it comes to addiction so PLEASe stop f**king speaking on it” / “all the girls who i have beef w are little ass girls who think they can say the n word but get scared when black guys come around 🤔 🤔 on that note, gn twit”
  • 11. 13 Challenges in Online Harassment: Ambiguity. Researchers have defined harassment using jargon that overlaps, causing ambiguity in annotations. “Language used to express hatred towards a targeted individual or group, or is intended to be derogatory, to humiliate, or to insult the members of the group, on the basis of attributes such as race, religion, ethnic origin, sexual orientation, disability, or gender is hate speech” (Founta et al. 2018). “Profanity, strongly impolite, rude or vulgar language expressed with fighting or hurtful words in order to insult a targeted individual or group is offensive language” (Founta et al. 2018). Examples from a high-school students’ tweet corpus: Ex. 1: “@user_name nah you just a dumb hoe who doesn’t know her place 😂 😂”: by the above definitions, this tweet is both hate speech and offensive language. Ex. 2: “IS THAT A MICROAGRESSION AGAINST MEXICANS BY STEREOTYPING THEM AS ILLEGALS?!? only if you were vegan you wouldn’t be such a racist pig”: this tweet falls into the category of hate speech but not necessarily offensive language.
  • 12. 14 Online Harassment: Dimensions. What? [REASON] What causes online harassment? Victim attributes (appearance, religion, race, etc.) and harasser traits (xenophobia, homophobia, intolerance, etc.). [RESULT] What are the effects? The victim feels offended, discriminated against, afraid of losing life or social capital, depressed, or suicidal. How? [METHOD] How can online harassment happen? Direct vs. indirect; frequent vs. one-time (cyberbullying vs. cyber aggression); flaming (posting or sending offensive messages over the Internet), doxxing (broadcasting private or identifying information about individuals), dogpiling (several people on Twitter addressing someone, usually negatively, in a short period of time), impersonation, public shaming, threats. Who? [ACTORS] Harasser, victim, and bystanders: (a) aggravators (people who indirectly fuel a harassing situation, for example by retweeting harassing tweets) and (b) empathizers (people who empathize with the victim, providing support). Where? [PLATFORM] Social media (Twitter, Facebook, Instagram), discussion boards (Reddit, 4chan), email, private messaging, online gaming; the policies of each platform also matter.
  • 13. 16 Current Research Directions Pursued. Resolving the data-scarcity issue: using generative adversarial networks (GANs) to generate text, increasing the positive (harassing) examples in a dataset, and changing the generator objective function of the GAN to incorporate domain-specific knowledge. Multiclass classification of harassing tweets: harassment-type prediction was done using a multiclass classifier, and the tweet-vectorization process leveraged domain knowledge in the form of an offensive-words lexicon.
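Full GAN training is beyond a slide sketch, so the fragment below illustrates only the underlying idea of the first direction, growing the scarce positive class, with a far simpler stand-in: perturbing existing harassing examples through a small, hypothetical synonym table. The real approach trains a GAN whose generator objective incorporates domain knowledge.

```python
import random

# Hypothetical synonym table for illustration only; not the GAN generator
# described on the slide.
SYNONYMS = {"dumb": ["stupid", "ignorant"], "ugly": ["hideous", "gross"]}

def augment(tweet: str, rng: random.Random) -> str:
    """Return a perturbed copy of a tweet, for augmenting the positive class."""
    out = []
    for tok in tweet.split():
        choices = SYNONYMS.get(tok.lower())
        out.append(rng.choice(choices) if choices else tok)
    return " ".join(out)
```

Each harassing tweet can yield several perturbed copies, raising the positive-class share of a training set without collecting new data.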
  • 14. 17 Key Takeaways ● Despite recognition of its importance, the problem is still not well understood and not well defined. ● Use cases and data show that the problem is far more complex and nuanced; understanding context is critical. ● Current social media platforms appear to have little or no automated processes for detection and prevention. Oversimplified problem definitions that rely largely on machine learning without significant domain-specific knowledge have rendered solutions practically useless.
  • 16. 19 Unsolved: Detection of Extremism on Social Media. Efforts by high-tech companies: the capabilities of social media companies (Twitter, Facebook, and Google) are inadequate and ineffective. Governments have insisted that the industry has a ‘social responsibility’ to do more to remove harmful content. If unsolved, social media platforms will continue to negatively impact society.
  • 17. 20 “The Travelers” ● One thousand Americans between 1980 and 2011, and 300 Americans since 2011, attempted to travel or traveled abroad to join extremist terrorist groups. ● More than five thousand individuals from Europe traveled to join extremist terrorist groups (ISIS, Al-Qaeda) abroad through 2015. ● Most were inspired and persuaded online. (George Washington University, Program on Extremism)
  • 18. 21 Illustrative Case ● A 24-year-old college student from Alabama became radicalized on Twitter; after a year, she moved to Syria to join ISIS. ● Self-taught, she read verses from the Qur’an but interpreted them with others in the extremist network. ● She was persuaded that when the true Islamic State is declared, it is obligatory to do hijrah, which they see as the pilgrimage to ‘the State’. (New York Times: “Alabama Woman Who Joined ISIS Can’t Return Home, U.S. Says”)
  • 19. 22 Radicalization Scale (Achilov et al.). 0 None: mainstream religious views and orientations; indicators: Islam, Allah, jihad (self-struggle), halal, democracy, salah, fatwa, hajj. 1 Low: attitudinal support for politically moderate Islamism; indicators: Hadith, Caliphate (Khilafah) justified, Sharia better (than secular law), hypocrisy of the West. 2 Elevated: emergent support for exclusive rule of Shari’a law; indicators: Shariah best, revenge (justified), jihad (against the West), justifying Daesh (ISIS). 3 High: support for extremist networks and travel to “Darul Islam”; indicators: kafir, infidel, hijrah to Darul-Islam, (supporting) fatwa of Al-Awlaki, mushrikeen. 4 Severe: call for action to join the fight and the use of violence; indicators: apostate, sahwat, taghut, kill, kafir, kuffar, murtadd, tawaghit, al_baghdadi, martyrdom, khilafah.
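As a toy illustration of how the scale's indicator terms could drive a first-pass triage, the sketch below maps a text to the highest level whose indicators appear. The term lists are abbreviated from the slide, and keyword matching alone is exactly the kind of context-free shortcut the later slides warn against, so treat this as a strawman baseline rather than the authors' method.

```python
# Abbreviated indicator lists from the Achilov et al. scale above.
# Terms common to level 0 (mainstream usage) are deliberately not scored.
INDICATORS = {
    4: {"apostate", "kuffar", "murtadd", "taghut", "martyrdom"},
    3: {"infidel", "hijrah", "mushrikeen"},
    2: {"revenge", "daesh"},
    1: {"hadith", "caliphate"},
}

def scale_level(text: str) -> int:
    """Return the highest radicalization-scale level whose indicator terms
    appear in the text; 0 (None) when no elevated indicator matches."""
    tokens = set(text.lower().split())
    for level in sorted(INDICATORS, reverse=True):
        if tokens & INDICATORS[level]:
            return level
    return 0
```

A real system must disambiguate terms like “jihad” by contextual dimension, as the following slides argue; naive matching would mislabel mainstream religious speech.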
  • 20. 23 Radicalization Process over Time. An individual moves along the scale from non-extremist, ordinary individual (0 None) through Low, Elevated, and High to radicalized extremist individual (4 Severe). Ultimately, analysis of content in context will provide a better, finer-grained understanding of the underlying factors in the radicalization process.
  • 21. 24 Multidimensionality of Radicalization. Islamist extremism on social media involves: different user roles (e.g., recruiter, follower) with respect to different stages of radicalization; modeling a user’s psychological process over a time period; persuasive content relevant to Islamist extremism; and domain knowledge of the context (“jihad” has different meanings in different contexts).
  • 22. 25 Security Implications. Predicting online terrorist activities and involved individuals has local and global security implications, specifically the unfair classification of non-extremist individuals as extremist. False alarms might potentially impact millions of innocent people.
  • 23. 26 Multidimensionality of Extremist Content ● Dimensions to define the context: based on the literature and our empirical study of the data, three contextual dimensions are identified: Religion, Ideology, and Hate. ● The distribution of prevalent terms (i.e., words, phrases, concepts) differs across dimensions. ● These terms should be represented in different dimensions to disambiguate especially diagnostic terms (e.g., jihad).
  • 24. 27 Extremist Content. Corpus: 538 Twitter-verified extremists, 48K tweets. Color coding: green = Religion, blue = Ideology, red = Hate. Prevalent key phrases: isis, syria, kill, iraq, muslim, allah, attack, break, aleppo, assad, islamicstate, army, soldier, cynthiastruth, islam, support, mosul, libya, rebel, destroy, airstrike; caliphate_news, islamic_state, iraq_army, soldier_kill, iraqi_army, syria_isis, syria_iraq, assad_army, terror_group, shia_militia, isis_attack, aleppo_syria, martyrdom_operation, ahrar_sham, assad_regime, follow_support, lead_coalition, turkey_army, isis_claim, kill_isis; imam_anwar_awlaki, video_message_islamicstate, fight_islamic_state, isisclaim_responsibility_attack, muwahideen_powerful_middleeast, isis_tikrit_tikritop, amaqagency_islamicstate_fighter, sinai_explosion_target, alone_state_fighter, intelligence_reportedly_kill, khilafahnew_islamic_state, yemanqaida_commander_kill, isis_militant_hasakah, breakingnew_assad_army, isis_explode_middle, hater_trier_haleemah, trust_isis_tighten, qamishlus_isis_fighting, defeat_enemy_allah, kill_terrorist_baby, ahrar_sham_leader. Prevalent topics: islamic state, syria, isis, kill, allah, video, minute propaganda video scenes, jaish islam release, restock missile, kaffir, join isis, aftermath, mercy, martyrdom operation syrian opposition, punish libya isis, syria assad, islam sunni, swat, lose head, wilayatalfurat, somali, child kill, takfir, jaish fateh, baghdad, iraq, kashmir muslim, capture, damascus, report rebel, british, qala moon, jannat, isis capture, border cross, aleppo, iranian soldier, tikrit tikrittop, lead shia military kill, saleh abdeslam refuse cooperate.
  • 25. 28 Example Tweets with “Jihad”: the word “jihad” can appear in tweets with different meanings in different dimensions of the context (labels H, I, R denote the Hate, Ideology, and Religion dimensions on the slide). “Reportedly, a number of apostates were killed in the process. Just because they like it I guess.. #SpringJihad #CountrysideCleanup” / “Kindness is a language which the blind can see and the deaf can hear #MyJihad be kind always” / “By the Lord of Muhammad (blessings and peace be upon him) The nation of Jihad and martyrdom can never be defeated”
  • 26. 29 Ambiguity of Diagnostic Terms/Phrases ● The same term can have a different meaning in each dimension. ● Example: the meaning of “jihad” differs between extremists and non-extremists. ○ For extremists, its meaning is closer to “awlaki”, “islamic state”, “aqeedah”; for non-extremists, it is closer to “muslims”, “quran”, “imams”.
  • 27. 30 Contextual Dimension Modeling ● Different contextual dimensions are modeled by incorporating: ○ knowledge graphs ○ dimension corpora ● Deep learning models are then used to generate knowledge-enhanced representations. ● KG creation: Religion: Qur’an, Hadith; Ideology: books and lectures of ideologues; Hate (a corpus, not a KG): Hate Speech Corpus (Davidson et al. 2017). ● The approach can be applied to many social problems. (Diagram: each dimension is modeled separately, producing a dimension-based, knowledge-enhanced representation.)
  • 28. 31 Using a Knowledge Graph (Hate). Capturing similarity: ● learning word similarities from a substantial knowledge graph; ● a solution via distance between concepts in the knowledge graph. “You shall know a word by the company it keeps” (J. R. Firth 1957: 11)
  • 29. 32 Using a Corpus (Hate). Capturing similarity (and resolving ambiguity): ● learning word similarities from a large corpus; ● a solution via distributional similarity-based representations. “You shall know a word by the company it keeps” (J. R. Firth 1957: 11)
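Firth's maxim is exactly what distributional representations operationalize: represent each word by counts of its neighbors, then compare words by the cosine of those count vectors. A minimal, corpus-only sketch (the toy sentences are illustrative and unrelated to the actual extremist corpus):

```python
import math
from collections import Counter

def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within +/- window of it."""
    vecs = {}
    for sent in sentences:
        toks = sent.lower().split()
        for i, w in enumerate(toks):
            ctx = toks[max(0, i - window):i] + toks[i + 1:i + 1 + window]
            vecs.setdefault(w, Counter()).update(ctx)
    return vecs

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0
```

Words that keep the same company come out similar, which is how “jihad” ends up with different neighbors, and hence different representations, in extremist vs. non-extremist corpora.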
  • 30. 33 Density of Dimensions in Extremist Content ● We found two distinct groups employing different contexts with different density. ● Religion and Hate are usually mixed, suggesting that extremists might employ different hate tactics. ● A small group of users employs ideological context far more often than others, suggesting these users might be disseminators of ideologically intense content.
  • 31. 34 Results ● The tri-dimension model performs best. ● Precision is used as the metric to emphasize reducing misclassification of non-extremist content. ● This has implications in a large-scale application.
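The emphasis on precision follows from base-rate arithmetic: extremists are a tiny fraction of users, so even a small false-positive rate on the non-extremist majority flags an enormous absolute number of innocent people. The population figures below are illustrative assumptions, not numbers from the study:

```python
def expected_false_alarms(n_users, prevalence, fpr):
    """Expected number of innocent users flagged, given the fraction of
    genuinely extremist users (prevalence) and the classifier's
    false-positive rate on the non-extremist majority."""
    non_extremists = n_users * (1.0 - prevalence)
    return round(non_extremists * fpr)

# Illustrative: with 100M users, 0.01% truly extremist, even a "good" 1%
# false-positive rate flags roughly a million innocent users.
flagged = expected_false_alarms(100_000_000, 0.0001, 0.01)
```

This is why reducing false alarms, rather than raw accuracy, drives the evaluation.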
  • 32. 35 Key Insights ● False alarms are significantly reduced by incorporating three specific dimensions of context. ● Extremist users employ religion along with hate, suggesting they use different hate tactics for their targets. ● Including all three contextual dimensions significantly reduces the likelihood of unfair mistreatment of non-extremist individuals in a real-world application. ● Each dimension plays a different role at different levels of radicalization, capturing nuances as well as linguistic and semantic cues throughout the radicalization process.
  • 33. 36 Our Highly Multidisciplinary Approach ● Neurocognitive process: the human brain processes information from extremist narratives on social media, which includes different contexts, emotions, sentiment, etc. ● Cognitive: individuals change behavior and make choices in consuming/sharing content with an intent. ● Social interactions: coordination, information flow, and diffusion on social networks. ● Public/society: outcomes/impact on society through events and collective actions (e.g., civil war or the result of an election).
  • 35. 38 Supporting Grants. Context-Aware Harassment Detection on Social Media (wiki link) is an interdisciplinary project among the Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis), the Department of Psychology, and the Center for Urban and Public Affairs (CUPA) at Wright State University. We are supported by NSF Award #CNS 1513721.
  • 36. 39 Thank You! Special Thanks: Ugur Kursuncu and Thilini Wijesiriwardene
  References: 1. Hinduja, S. and Patchin, J.W., 2010. Bullying, cyberbullying, and suicide. Archives of Suicide Research, 14(3), pp. 206-221. 2. Rezvan, M., Shekarpour, S., Balasuriya, L., Thirunarayan, K., Shalin, V.L. and Sheth, A., 2018. A Quality Type-aware Annotated Corpus and Lexicon for Harassment Research. In Proceedings of the 10th ACM Conference on Web Science (pp. 33-36). ACM. 3. Waseem, Z., 2016. Are you a racist or am I seeing things? Annotator influence on hate speech detection on Twitter. In Proceedings of the Workshop on NLP and Computational Social Science (pp. 138-142). Association for Computational Linguistics. 4. Davidson, T., Warmsley, D., Macy, M. and Weber, I., 2017. Automated hate speech detection and the problem of offensive language. In Proceedings of the 11th International AAAI Conference on Web and Social Media. AAAI. 5. Zhang, Z. and Luo, L., 2018. Hate speech detection: A solved problem? The challenging case of long tail on Twitter. Semantic Web. 10.3233/SW-180338.

Editor's Notes

  2. General info on social media, on how often they are used by people. Give stats on the use of social media. % in US, China etc. Social media is enabler of instant communication, further enabling good or bad outcomes. Put a picture.
  3. Interacting technical (on the left) and usability (on the right) challenges to the exploitation of the new data environment. Circle the four of these that we address. (i) Appropriate incorporation of multimodal data in the views of person, content and network, (ii) Ambiguity in the meaning of significant concepts in the content, (iii) Sparsity of important lexical and semantic cues in the domain-specific corpus, (iv) Noisy nature of social media data, that threatens performance of learning process, (v) Imbalance in a training dataset
  4. Online harassment example (prolonged): https://www.theguardian.com/society/2018/aug/03/harassed-online-for-13-years-the-victim-who-feels-free-at-last Online harassment(teenagers death) - https://www.cnn.com/2018/01/23/us/florida-cyberstalking-charges-girl-suicide/index.html -DeepFake example This slide explains the following 2 things: Online harassment can be severe due to its prolonged nature: the example is the guardian news article (being harassed for 13 years) Online harassment can increase the risk of suicide attempts in cyberbullying victims by 2 times
  5. The existing online-harassment detection approaches only use the content of the tweet to classify it as harassing or non-harassing (binary classification). Our approach incorporates context in the form of People (user characteristics) and Network (follower/followee characteristics), in addition to the content of the tweet, to perform multiclass classification of harassment. So the harassing class can be subdivided into several subclasses, e.g., sexual, political, racial, intellectual, appearance-related.
  6. This slide depicts a challenge in online harassment detection: data sparsity. This sparsity can be present in the research landscape in two forms: The positive cases (harassing examples) are generally rare in datasets. Ex: in Davidson et al., only 5% of the entire dataset is tagged as harassing. The number of subclasses inside harassment is low. Ex: Waseem et al. has only two subclasses inside the harassment class. Saeedeh’s work in the slide shows that we have tried to increase the identified subclasses of harassment. References for this slide: A Quality Type-aware Annotated Corpus and Lexicon for Harassment Research. Web Science, WebSci 2018, Amsterdam, The Netherlands, May 27-30, 2018. [1] Zeerak Waseem. Are you a racist or am I seeing things? Annotator influence on hate speech detection on Twitter. In Proc. of the Workshop on NLP and Computational Social Science, pages 138-142. Association for Computational Linguistics, 2016. [2] Thomas Davidson, Dana Warmsley, Michael Macy, and Ingmar Weber. Automated hate speech detection and the problem of offensive language. In Proceedings of the 11th Conference on Web and Social Media. AAAI, 2017. [3] Zhang, Ziqi & Luo, Lei. (2018). Hate Speech Detection: A Solved Problem? The Challenging Case of Long Tail on Twitter. Semantic Web. 10.3233/SW-180338.
  7. Online harassment subjectivity article : https://www.newscientist.com/article/2140342-online-harassment-on-the-rise-but-no-one-can-agree-what-it-is/ Animation on definitions (to reduce text)
  8. What is meant by subjectivity in this slide: if annotators are given a set of tweets like the above (right-hand side of the slide), one annotator would find the tweets harassing but another would find them non-harassing. Perhaps the last tweet in that interaction would be tagged as toxic (harassment) by a female annotator whereas a male annotator would find it non-harassing. This suggests that online harassment is a phenomenon that is subjective in nature: different individuals perceive it differently.
  9. Online harassment can be ambiguous. The meaning of ambiguity in this slide is as follows: once a tweet is annotated as harassment, if we want to annotate further to reflect the subclass of harassment, we would give definitions of each subclass to annotators and ask them to annotate the tweets accordingly. If you look at the definitions of offensive language and hate speech on the left-hand side, they overlap with each other, and the same tweet can be annotated by two annotators differently, falling into different subclasses. See the example on the right-hand side of the slide.
  10. This table depicts the following: we can look at online harassment from four major dimensions. The first dimension is the what of online harassment. We can approach this dimension via two routes. The first route is to explore the causes of online harassment, also identified as reasons. For example, a victim of online harassment could be harassed due to his/her ethnicity or religious beliefs (Saeedeh’s types of harassment fall into this; victims are harassed because of their sexuality, political standpoint, appearance, or level of intellect). A harasser could also harass a victim because the harasser is xenophobic, homophobic, etc. The second route is to explore the effects of online harassment felt by the victims: a victim may feel offended, discriminated against, depressed, or in extreme cases even suicidal. The second dimension is the how of online harassment, also identified as the method of harassment. These methods can be divided along several aspects. One aspect is direct vs. indirect online harassment. Another is frequency: prolonged online harassment can be identified as cyberbullying, whereas one-time online harassment can be identified as cyber aggression. There are several identified methods of online harassment; harassers may use one or a mixture of a few, such as flaming, doxxing, dogpiling, impersonation, public shaming, threatening, and spreading rumours. The third dimension is the who of online harassment, which explores the actors involved. The two main actors are harassers and victims. Apart from these, we can also identify bystanders who either aggravate the harassment or conciliate it: those who aggravate can be identified as aggravators, and those who conciliate as empathizers.
  The final dimension worth exploring is the where of online harassment, which covers the platforms where it happens: social networking sites, online discussion boards, private messaging applications, and online gaming platforms. The policies of these platforms can also help increase or decrease the harassment.
  11. This slide connects to the previous slides on the challenges in online harassment detection. One challenge is data sparsity. To overcome it, our group is working on text generation of positive (harassing) examples using GANs. The novelty is that the generator objective function of the GAN is modified to incorporate domain-specific knowledge in the form of a tweet corpus used for online harassment detection. The other challenge is identifying subclasses of harassment. The solution is a tweet representation that leverages the domain knowledge of harassment in the form of an offensive-words lexicon. These tweet representations are then fed into multiclass classifiers.
  12. Religion may play more of a role first, hate later.
  13. Two types of resources: (1) structured: knowledge graphs (KGs); (2) unstructured: corpora.
  14. The surrounding words will represent the words in bold and italic.
  15. The surrounding words will represent the words in bold and italic.
  16. Why use precision? How 1% misclassification would translate into a big number of people being affected.
  17. Weaponization: social media is arming players with malignant intentions, causing harm to individuals and society; examples include radicalization and online harassment leading to teenage suicides. Ambiguity: defining what constitutes harm is hard; purely data-driven solutions won’t work well, so KGs and domain specificity need to be incorporated. Sparsity: even though these issues are serious and result in grave consequences, data is not abundantly available. For example, positive examples of online harassment were quite rare in the datasets used for research, and data for different types of harassment is yet another hurdle. Complexity: multiple dimensions affect the dynamics of online harassment and online radicalization; these are interwoven, can depend on each other, and can be volatile in nature. Therefore, carefully modeling the dimensions is a must.