ACM Hypertext and Social Media Conference Tutorial on Knowledge-infused Deep Learning

Knowledge-infused Deep Learning
Artificial Intelligence
Institute
Tutorial
Amit Sheth, Manas Gaur, Ugur Kursuncu, Ruwan Wickramarachchi, Shweta Yadav
Artificial Intelligence Institute, University of South Carolina, USA
Check Tutorial site for latest slides: http://kidl2020.aiisc.ai/

Tutorial Thesis
Broad Vision How do you make a system more intelligent?
Without Domain Knowledge
With Domain Knowledge
Motivational Interviewing

How to gain deep
understanding of the content?
Tutorial Thesis
agitation
nervous
panicky
>Millions
Social Media
Deep
Clustering
Neural
Parsing
Repeated
panic attacks
agitation
nervous
panicky
Repeated
panic attacks
anxiety
KG
Sleep Disorder
Circadian
Rhythm
Disorder
Context understanding
Shallow
Semantics
Deeper
Semantics
[Lin 2020, Kitaev 2018]
https://github.com/facebookresearch/deepcluster
[Gaur 2020]

Knowledge in computing
The role of knowledge in computing has long been recognized
- at least since Vannevar Bush’s 1945 seminal piece: As We May Think.
Enhanced (semantic) applications such as search,
browsing, personalization, recommendation,
advertisement, and summarization.
Improve integration of data, including data of
diverse modalities and from diverse sources.
Empower/enhance ML and NLP techniques. Use
as a knowledge transfer mechanism across
domains, between humans and machines
Improve automation and support intelligent
human-like behavior and activities that may
involve conversations or question-answering and
robots.
~2000
~2025
Focus:
From small data to big data.
Data alone is not enough.
[Domingos 2012].
Knowledge will propel
machine understanding of
content. [Sheth, et al. 2017]

Tutorial Thesis
Interpretability + Traceability → Explainability
Ethics, Bias, and False Alarms
Deeper Understanding of Content including
Context Understanding
What is the right knowledge graph to use?
[Semantic, Cognitive, Perceptual Computing, Sheth 2015]
Structured and
Unstructured
Data
Models
Knowledge
Graph/Base
Compute
Application/
Workﬂow
David Cox Talk: Neurosymbolic AI
F. Lécué: On the Role of Knowledge Graphs in Explainable AI A Machine Learning Perspective

About the Tutorial
All About the Knowledge Graphs
Knowledge-infused Deep Learning
Knowledge-infusion: Cyber Social Threats
Knowledge-infusion: Autonomous Driving
Knowledge-infusion: DarkNet
HT-2020 Tutorial: A. Sheth, M. Gaur, U. Kursuncu, R. Wickramarachchi, & S. Yadav Knowledge-infused Deep Learning

All About Knowledge Graphs
Amit Sheth
amit@sc.edu
@amit_p

Definition
A Knowledge Graph (KG) is
structured knowledge in a graph
representation (in many cases, a labeled
property graph, or RDF or its variants). We
cannot escape the classic
expressivity-computability
trade-off.
The community is still debating the exact definition.
Key differentiator: Relationships
(“relationships at the heart of semantics”).
Different/Related forms:
● Ontology : Knowledge graph after human
curation of entities and relations;
“ontological commitment”, richer KR
● Knowledge Base: flattened graph
● Lexicons: Small application-specific
flattened graph
● Knowledge Networks (KN) integrate
and combine knowledge (usually
captured as KGs) to serve a network
(community); they can draw from and serve multiple
domains.
Knowledge Graphs and Knowledge Networks: The Story in Brief
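The definition above treats a KG as a graph whose edges carry relationship labels. A minimal sketch of that idea follows, assuming a plain list of (subject, relation, object) triples; the relation names (`symptom_of`, `causes`, `is_a`) and the helper names are illustrative choices for this sketch, not a vocabulary used by the tutorial.

```python
from collections import defaultdict

# Illustrative triples only; the relation names are assumptions for this sketch.
TRIPLES = [
    ("panicky", "symptom_of", "anxiety"),
    ("repeated panic attacks", "symptom_of", "anxiety"),
    ("shelter-in-place", "causes", "anxiety"),
    ("circadian rhythm disorder", "is_a", "sleep disorder"),
]


def build_index(triples):
    """Index triples by subject so outgoing labeled edges can be looked up."""
    index = defaultdict(list)
    for subject, relation, obj in triples:
        index[subject].append((relation, obj))
    return index


def objects(index, subject, relation):
    """Return every object linked to `subject` by the given relation label."""
    return [obj for rel, obj in index[subject] if rel == relation]


index = build_index(TRIPLES)
print(objects(index, "shelter-in-place", "causes"))
```

Keeping the relation label on every edge is what separates this structure from a plain adjacency list, which is the "key differentiator" the slide points to.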

Expressiveness Range: Knowledge Representation and Ontologies
Catalog/ID
General
Logical
constraints
Terms/
glossary
Thesauri
“narrower
term”
relation Formal
is-a
Frames
(properties)
Informal
is-a
Formal
instance
Value Restriction
Disjointness, Inverse,
part of…
Simple Taxonomies Expressive Ontologies
Wordnet
CYC
RDF DAML
OO
DB Schema
RDFS
IEEE SUO OWL
UMLS
GO GlycO SWETO
Pharma
Ontology Dimensions After McGuinness and Finin
KEGG
TAMBIS
BioPAX
EcoCyc

Ontology Examples
Commonsense Reasoning Graph Drug Abuse Ontology
Event Ontology
Crisis Ontology

Creation & Use of Knowledge ~2000
First commercial semantic search/browsing/… on the Web and
for the content on the Web using KG. Term used for KR:
WorldModel, Ontology

Proliferation of Broad-based &
Domain-Specific KGs
Examples of General Purpose Knowledge Graphs
1. DBpedia [Auer 2007, Lehmann 2015]
2. Yago [Rebele 2016]
3. Freebase [Bollacker 2008]
4. ConceptNet [Speer 2017]
5. Knowledge Vault [Dong 2014]
6. NELL [Mitchell 2018]
7. Wikidata [Vrandečić 2014]
Examples of Healthcare-specific Knowledge Graphs
1. SNOMED-CT [ACL Chang 2020]
2. Unified Medical Language System (UMLS) [Yip 2019]
3. DataMed [JAMIA Chen 2018]
4. International Classification of Diseases (ICD-10)
[JAMIA Choi 2016]
5. DrugBank, Rx-NORM and MedDRA [ BMC Celebi 2019]
6. Drug Abuse Ontology [BMI Cameron 2013]
Many are also community-developed.

Variety of Sources for Large-scale KGs
in Different Representations
Linked Open Data (LOD): https://lov4iot.appspot.com/
Schema.org (schema.org): https://github.com/schemaorg/schemaorg
Data Commons Knowledge Graph (DCKG) (datacommons.org): https://github.com/datacommonsorg/api-python
Wikidata (wikidata.org): https://dumps.wikimedia.org/wikidatawiki/entities/

Domain-specific knowledge extraction from LOD
Linked Open
Data (LOD)
Book related
information?
Filter relevant datasets
Extract relevant portion
of a data set
Project
Gutenberg
DBpedia
DBTropes
Books, Countries, Drugs
Books, movie, games
Books
Book
specific
DBpedia
Book
specific
DBTropes
Lalithsena, Sarasi, et al. "Automatic domain identification for linked open data." Web Intelligence (WI) and Intelligent Agent Technologies (IAT), 2013 IEEE/WIC/ACM International Joint Conferences on. Vol. 1. IEEE, 2013.
Sarasi Lalithsena, Pavan Kapanipathi and Amit Sheth "Harnessing relationships for domain-specific subgraph extraction: A recommendation use case." Big Data (Big Data), 2016 IEEE International Conference on. IEEE, 2016.

Enterprise Knowledge Graphs are also very popular
KG enabled Web and Enterprise
Applications: Google, Amazon,
Microsoft, Siemens, LinkedIn,
Airbnb,
eBay, and Apple, as well as
smaller companies (e.g. ezDI,
Franz, Metaphactory/
Metaphacts, Semantic Web
Company, Mondeca, Stardog,
Diffbot, Siren).
Enterprise KG development
services are also available (Maana).
Industry-Scale Knowledge Graphs: Lessons and Challenges (Communications of the ACM, August 2019)

Why Knowledge Graphs: Shortcomings of
Deep Learning (DL)
Trivial Case for
Classiﬁcation
Text: I sometimes wonder how many alcoholics are relapsing under the lockdowns (former
alcoholic).
Question: Does the person have an addiction?
Answer: Yes
Not Trivial Text: Then others that insisted that what I have is depression even though manic episodes aren't
characteristic to depression. I dread having to retread all this again because the clinic where I get
my mental health addressed is closing down due to loss in business caused by the pandemic
Question: Does the person suffer from depression?
Answer: Yes (Correct: No)
Disjunctive
Questions
Question: Are you feeling nervous or anxious or on edge?
Question: Is the feeling of restlessness due to stress or anxiety ?
Question: Does an employee own a company or work for a company?
Research in this direction: Query2Box and Multi-hop Reasoning [Ren ICLR 2020, Lin EMNLP 2018]
Covid context
Generic context
Bottom line: Most state-of-the-art deep learning approaches are not integrated
with prior knowledge. This tutorial is about strategies for doing so.
[David Cox] [Marcus 2018]
https://www.digitaltrends.com/cool-tech/neuro-symbolic-ai-the-future/

Why Knowledge Graphs: Shortcomings of DL
● Graph Convolutional Neural Networks (GCN) are blind to relation types. For example:
<shelter-in-place causes anxiety> and <shelter-in-place prevents anxiety> have similar
representations in GCN.
● Deep Clustering over unlabeled data exploits the inherent latent semantics to generate
diverse and cohesive clusters. But, interpretability of the clusters requires Knowledge Graphs.
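The first bullet above can be checked on a two-node toy graph. The sketch below is an illustration under stated assumptions, not the tutorial's code: it uses NumPy, random toy features, a single propagation step with a tanh nonlinearity, and a hypothetical relation-aware layer (one weight matrix per relation type) for contrast.

```python
import numpy as np


def normalized_adjacency(n_nodes, edges):
    """Symmetric-normalized adjacency with self-loops, as in a basic GCN layer."""
    a = np.eye(n_nodes)
    for src, dst in edges:
        a[src, dst] = a[dst, src] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(triples, x, w):
    """Plain GCN propagation: the relation label of each triple is discarded."""
    edges = [(head, tail) for head, _, tail in triples]
    return np.tanh(normalized_adjacency(len(x), edges) @ x @ w)


def relational_layer(triples, x, w_by_relation, w_self):
    """Relation-aware variant (assumption): one weight matrix per relation type."""
    out = x @ w_self
    for head, rel, tail in triples:
        out[tail] += x[head] @ w_by_relation[rel]
    return np.tanh(out)


rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4))  # node 0: shelter-in-place, node 1: anxiety
w = rng.normal(scale=0.3, size=(4, 3))
w_rel = {
    "causes": rng.normal(scale=0.3, size=(4, 3)),
    "prevents": rng.normal(scale=0.3, size=(4, 3)),
}

causes = [(0, "causes", 1)]
prevents = [(0, "prevents", 1)]

# The plain layer sees the same edge set in both cases.
same = np.allclose(gcn_layer(causes, x, w), gcn_layer(prevents, x, w))
# The relation-aware layer uses a different weight matrix per label.
differ = not np.allclose(
    relational_layer(causes, x, w_rel, w),
    relational_layer(prevents, x, w_rel, w),
)
print(same, differ)
```

Because the plain layer only sees which nodes are connected, the two triples yield identical node representations; telling them apart requires the relation label to enter the computation in some form.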
ODKG: Opioid
Drug Knowledge
Graph
[Kamdar 2019]

Why Knowledge Graphs : NLP/NLU Challenges
● Natural Language Processing Challenges:
○ How do you learn quickly from a small amount of data?
○ How do you mine (varied) relationships from existing text?
○ How do you reliably classify entities into a known ontology?
○ Better contextualization of words
● Natural Language Understanding Challenges:
○ Query Interpretation or Understanding the user question
○ Answering the question with Trust and Transparency
○ How to measure “reasonability” and “meaningfulness” of the response to
a question?
○ How much context is needed to provide a precise response?
[Stanford Knowledge Graph Seminar 2020, Amit Prakash , Leilani Gilpin]

[Image from Talukdar]
KG in Conversational AI
● Get same/similar
answers
based on trusted
knowledge
● Personalization
● Contextualization

Personalization: taking into account
the contextual factors such as
user’s health history, physical
characteristics, environmental
factors, activity, and lifestyle.
Chatbot with contextualized (e.g., asthma) knowledge is
potentially more personalized and engaging.
Without
Contextualized Personalization
With
Contextualized Personalization
KG in Conversational AI

How do we use Knowledge Graphs?
Health
Knowledge
Graph
[Shah and Sheth
US patent 2015]


Semantic
Proximity
GBV Index
GBV estimation
for 14 days
GBV Lexicon from
Tweets on bullying,
abuse, domestic
violence, etc.
Mapping words to
categories for
expansion of lexicon
Generic Knowledge
Graph of Wikipedia
Aligning the lexicon
words and new
entities with respect
to DBpedia
Categories
Enriched Lexicon for
gathering abstract meaning
of GBV in tweets
Calculating cosine similarity
between two vectors (GBV
and Tweets) and setting
empirical threshold on
semantic proximity
Mental Health Tweets From
March 14-April 04
Analyzing Gender-based Violence (GBV) in Mental Health COVID-19 Twitter Conversation
Maximum A Posteriori
Estimation (MAP)
Purohit, Hemant, Tanvi Banerjee, Andrew Hampton, Valerie L. Shalin, Nayanesh Bhandutia, and Amit P. Sheth.
"Gender-based violence in 140 characters or fewer: A# BigData case study of Twitter." arXiv preprint arXiv:1503.02086
(2015).
Psychidemic
https://www.youtube.com/watch?v=XzYrn0PEzNk
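The semantic-proximity step of the GBV pipeline on this slide (cosine similarity between a GBV vector and a tweet vector, compared against an empirical threshold) can be sketched as follows. The vectors are made-up 3-d stand-ins for averaged word embeddings, and the threshold of 0.5 and the function names are assumptions for illustration; the slides do not state the value that was used.

```python
import numpy as np


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    norm = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / norm) if norm > 0 else 0.0


def is_gbv_related(tweet_vec, lexicon_vec, threshold=0.5):
    """Flag a tweet whose semantic proximity to the lexicon meets the threshold."""
    return cosine_similarity(tweet_vec, lexicon_vec) >= threshold


# Toy vectors (made-up numbers) standing in for averaged embeddings.
lexicon_vec = np.array([1.0, 1.0, 0.0])
close_tweet = np.array([2.0, 1.5, 0.1])
far_tweet = np.array([0.0, 0.1, 3.0])

print(is_gbv_related(close_tweet, lexicon_vec), is_gbv_related(far_tweet, lexicon_vec))
```

In practice the threshold would be tuned on held-out annotated tweets, which is what "empirical" suggests here.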

Assessing Mental Health Impact of COVID using News Articles
https://theconversation.com/were-measuring-online-conversation-to-track-the-social-and-mental-health-issues-surfacing-during-the-coronavirus-pandemic-135417
Multilingual KG
http://conceptnet.io/
GDelt Database
https://www.gdeltproject.org/

Understanding City Traffic Events: Role of KG in
analyzing multimodal data
Anantharam, Pramod, Payam Barnaghi, Krishnaprasad Thirunarayan, and Amit Sheth. "Extracting city traffic events from social
streams." ACM Transactions on Intelligent Systems and Technology (TIST) 6, no. 4 (2015): 1-27.

Drug Abuse Ontology in PREDOSE
[Figure: Drug Abuse Ontology class hierarchy rooted at owl:thing. Node labels: prescription_drug_brand_name, brandname_undeclared, brandname_composite, prescription_drug, monograph_ix_class, cpnum_group, prescription_drug_property, indication_property, formulary_property, non_drug_reactant, interaction_property, property, formulary, brandname_individual, interaction_with_prescription_drug, interaction, indication, generic_individual, prescription_drug_generic, generic_composite, interaction_with_non_drug_reactant, interaction_with_monograph_ix_class]

PREDOSE
Cameron, Delroy, Gary A. Smith, Raminta Daniulaityte, Amit P. Sheth, Drashti Dave, Lu Chen, Gaurish Anand, Robert Carlson, Kera Z. Watkins, and Russel Falck.
"PREDOSE: a semantic web platform for drug abuse epidemiology using social media." Journal of biomedical informatics 46, no. 6 (2013): 985-997.

Knowledge Graph in Education
Educational
Knowledge Base
Enable expert system to answer questions
like:
1. What to study when there is less
time?
2. How to set a good question paper?
3. How to cover-up learning gaps from
previous years?
4. How to connect 8th grade with 10th
grade science?
Mentor Intelligence
mimics teacher thinking
Intelligent Content Authoring
and Curation
1. Granularity
2. Personalization (student or institution)
3. Robustness
4. Interventional: diagnosis and remedy

Named Entity Recognition Relationship Extraction Entity Linking Implicit information extraction
Implicit Entity Linking using KG and Conditional Random Fields
Perera, Sujan, Pablo N. Mendes, Adarsh Alex, Amit P. Sheth, and Krishnaprasad Thirunarayan. "Implicit entity linking in tweets." In
European Semantic Web Conference, pp. 118-132. Springer, Cham, 2016.

Experiences or
Factual Knowledge
Abstract Knowledge
1. Continuum of Knowledge
2. relationship mapping:
NLU through and knowledge
transfer across domains
Analogical Generalization
Applicable to new situation via
analogy
[Forbus and Gentner 1997, Gentner
and Medina 1998]
Mapping between Enzyme Kinetics
and Musical Chairs [Ongoing]
Mapping between two Conceptual Frames in
similar domain (Physics)

ML and Knowledge Graphs: Pipeline
Knowledge
Extraction
Knowledge
Alignment
Knowledge
Cleaning
Knowledge Mining &
Knowledge-based QA
Data Extraction
(NLP, Web)
Wrapper Induction
(DB, DM-Data
Mining)
Web Tables (DB)
Text Mining (DM)
Entity and
Relationship Linking
[Perera 2016]
Schema Mapping
and Ontology
Mapping
[Jain 2010]
Universal Schema
[Sheth 1990]
Data Cleaning
[Jadhav 2016]
Anomaly Detection
[Anantharam 2012,
2016]
Knowledge Fusion
[Sheth 2020,
Kapanipathi 2020,
Gaur 2018,
Kursuncu 2020]
Graph Mining [Lalithsena
2016, 2017, 2018]
Knowledge Embedding
[Wickramarachchi 2020,
Gaur 2018]
Search [Sheth 2003,
Cheekula 2015, Kho
2019]
QA [Alambo 2019,
Shekarpour 2017]
[Stanford Knowledge Graph Seminar 2020, Luna Dong]

More Applications and Domains that use KG+DL
Pharmacy
[Futia 2020,
Gentile 2019]
Personalized
mHealth
[ Sheth 2017, 2018a,
2018b, 2019]
Public Health
[Yazdavar 2017,
Gaur 2018,
Daniulaityte 2016]
Question Answering/
Dialog System
[Alambo 2019,
Shekarpour 2017, 2015 ]
Hypothesis Generation to find
association between Stress
and Colorectal cancer
[ Cameron 2015]
Chatbot with contextualized (e.g.,
asthma) knowledge is potentially more
personalized and engaging.

Knowledge-infused
Learning Methods (Internet Computing’19, AAAI’20, CIKM’18, WWW’19, NAACL’18, ACL’17, ….)
Where are we
[Stanford Knowledge Graph Seminar 2020, Christopher Re]

What is Knowledge infusion? Why do we need it?
What is Knowledge-infused Learning?
What are the different types of Knowledge-infused Learning?
How can Knowledge-infused Learning provide solutions to
complex problems:
Unstructured Healthcare on Social Media
Radicalization on Social Media
Autonomous Driving Vehicles
Drug Trafficking in Cryptomarkets
Questions we address next

Vision: KG as Glue in Developing Hybrid AI Systems
STATISTICAL AI
CONNECTIONIST
“Unreasonable effectiveness of big data”
in machine processing &
powering bottom up processing
“Unreasonable effectiveness of small data”
in human decision making - can this be
emulated to power top down processing?
SYMBOLIC AI
FORMAL
KG will play an increasing role in developing hybrid neuro-symbolic systems (that is bottom-up
deep learning with top-down symbolic computing) as well as in building explainable AI systems
for which KGs will provide scaffolding for punctuating neural computing.
Cognitive Science Analogy: Combining Top Brain - Bottom Brain Processes.

Knowledge-infused Deep
Learning (KiDL)
Manas Gaur
mgaur@email.sc.edu
@manasgaur90

● Online healthcare communications are ambiguous, and discriminative features are difficult to
engineer.
● Domain-specific embedding models provide a shallow infusion of knowledge.
● Decrease the dependence on large datasets
● Reduce bias in the dataset (i.e., potentially avoid social discrimination and
unfair treatment)
● Provide information provenance, allowing explainability of a model
● Improve information coverage specific to a domain that would be missed
otherwise
● Reduce time and space complexity of the model's architecture
● Improve the model's sensitivity and specificity
● Explainability
Why Knowledge-infused Deep Learning?

Deep NLP Requires Background Knowledge
An excessive endogenous or exogenous
stimulation by estrogen induces adenomatous
hyperplasia of the endometrium
● adenomatous modifies hyperplasia
● An excessive endogenous or exogenous
stimulation modifies estrogen
● “adenomatous hyperplasia” and
“endometrium” occur as “adenomatous
hyperplasia of the endometrium”
MeSH
Terms in
PubMed
Articles
[Ramakrishnan 2008]
[Gaur 2018, 2019, Limsopatham 2016 ]

Gkotsis, George, Anika Oellrich, Tim Hubbard, Richard Dobson, Maria Liakata, Sumithra Velupillai, and Rina Dutta. "The language of mental health problems in social media." In
Proceedings of the Third Workshop on Computational Linguistics and Clinical Psychology, 2016.
NLP Requires Background Knowledge

How do you know that a training set has a good domain coverage?
How do you ensure consistency of labeling, especially when the label is not binary?
Do labels represent adequate semantics (e.g., number of alternatives)?
Do they have adequate domain knowledge?
How do you ensure consistency of labeling (interpretation)?
Questions

Weak → Distant → Knowledge-infused

Weak Supervision using SNORKEL
https://www.snorkel.org/
[Ratner 2017, Bach 2019]
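The core idea of weak supervision is that several noisy, hand-written labeling functions vote on each example. The sketch below is plain Python and deliberately does not use Snorkel's API; the labeling functions, keywords, and the majority-vote combiner are all assumptions made for illustration. Snorkel itself fits a label model that estimates how accurate each labeling function is, rather than taking a simple majority.

```python
from collections import Counter

ABSTAIN, NEGATIVE, POSITIVE = -1, 0, 1


# Hypothetical labeling functions for an "addiction-related" label.
def lf_mentions_relapse(text):
    return POSITIVE if "relaps" in text.lower() else ABSTAIN


def lf_mentions_alcohol(text):
    return POSITIVE if "alcohol" in text.lower() else ABSTAIN


def lf_short_text(text):
    return NEGATIVE if len(text.split()) < 4 else ABSTAIN


LABELING_FUNCTIONS = [lf_mentions_relapse, lf_mentions_alcohol, lf_short_text]


def weak_label(text, lfs=LABELING_FUNCTIONS):
    """Majority vote over non-abstaining functions; ABSTAIN on no votes or a tie."""
    votes = Counter(v for v in (lf(text) for lf in lfs) if v != ABSTAIN)
    if not votes:
        return ABSTAIN
    ranked = votes.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]


print(weak_label("I sometimes wonder how many alcoholics are relapsing under the lockdowns"))
```

Examples that end up labeled ABSTAIN are simply left out of the weakly labeled training set.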

SEMANTIC &
KNOWLEDGE GRAPH
Top-down symbolic
approach (concepts, rules)
in data reasoning,
inferencing, and deduction.
MACHINE/ DEEP
LEARNING
Bottom-up statistical
approach in searching,
analyzing and deriving
insights from Big Data.
In Abstract Sense

Theoretically Why KiDL: Probably Approximately Correct Learning
Valiant, Leslie G. "Robust logics." Artificial Intelligence 117.2 (2000): 231-253.

Theoretically Why KiDL: Probably Approximately Correct Learning
How do you know that a training set has a
good domain coverage?
Robust Classifier → Low Generalization Error
Consistent Classifier → Low Training Error
Confidence: More certainty (lower δ)
requires more samples.
Complexity: A more complicated
hypothesis space (larger |H|) requires more
samples.
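Both dependencies above appear in the standard PAC sample-complexity bound for a consistent learner over a finite hypothesis class, m ≥ (1/ε)(ln|H| + ln(1/δ)). The slides do not state which bound they have in mind, so using this textbook form is an assumption; the function name is illustrative.

```python
import math


def pac_sample_bound(hypothesis_count, epsilon, delta):
    """Samples sufficient for a consistent learner over a finite hypothesis class.

    Implements m >= (1 / epsilon) * (ln|H| + ln(1 / delta)), where epsilon is
    the tolerated generalization error and delta the failure probability.
    """
    if hypothesis_count < 1 or not (0 < epsilon < 1) or not (0 < delta < 1):
        raise ValueError("need |H| >= 1 and epsilon, delta in (0, 1)")
    return math.ceil((math.log(hypothesis_count) + math.log(1.0 / delta)) / epsilon)


baseline = pac_sample_bound(1000, epsilon=0.1, delta=0.05)
more_confident = pac_sample_bound(1000, epsilon=0.1, delta=0.01)
larger_space = pac_sample_bound(10**6, epsilon=0.1, delta=0.05)
print(baseline, more_confident, larger_space)
```

One way to read the case for knowledge infusion from this bound: domain knowledge that rules out hypotheses shrinks |H|, and so lowers the number of labeled samples required.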

K-IL: “The exploitation of domain knowledge and application semantics
to enhance existing deep learning methods by infusing relevant conceptual
information into a statistical, data-driven computational approach
(Neuro-Symbolic AI).”
A. Sheth, M. Gaur, U. Kursuncu and R. Wickramarachchi, "Shades of Knowledge-Infused Learning for Enhancing Deep Learning," in IEEE
Internet Computing, vol. 23, no. 6, pp. 54-63, 1 Nov.-Dec. 2019, doi: 10.1109/MIC.2019.2960071.

Taxonomy of Knowledge Infusion
SHALLOW Infusion: … of knowledge graphs to improve the semantic and conceptual processing of data.
SEMI-DEEP Infusion: Deeper and congruent incorporation or integration of the knowledge graphs in the learning techniques.
DEEP Infusion (Part of Future KG Strategy): combines statistical AI (bottom-up) and symbolic AI learning techniques (top-down) for hybrid and integrated intelligent systems.

Shallow external knowledge refers to information extracted from
text using heuristics, often designed for task-specific problems:
○ Bag of Words/Phrases from Corpus [Hagoort 2004, Zhang 2019, Sun 2019]
○ Bag of Words/Phrases from Semantic Lexicons [Faruqui 2014, Mrkšić 2016]
○ Count of Nouns, Pronouns, Verbs [Gkotsis 2017, 2016]
○ Sentiment and Emotions of the sentence [Gaur 2019, Vedula 2017, Kursuncu 2019]
○ Latent topics describing the documents [Jiang 2016, Li 2016, Meng 2020]
○ Label assignment to words or phrases in a sentence (Semantic Role Labeling):
Mary [Agent] sold [Predicate] the book [Theme] to John [Recipient]
Shallow Infusion

Examples of Shallow Infusion
Word2Vec
Knowledge: domain-specific large corpora
“Context is represented by a set of words for a given target word”
Retrofitting
Knowledge: pre-trained embeddings + semantic lexicons
“Learned embeddings are further enriched by using semantic lexicons”
BERT
Knowledge: domain-specific large corpora
“Uses language modeling objective to learn the contextual representations”

Chronological
arrangement of shallow
Infusion techniques
From NLP domain
Shallow Infusion of Knowledge in Deep Learning: In Brief

Shallow Infusion: Retrofitting Example
damage
Infrastructure
affected
population
damage
Infrastructure
affected
population
Vector representation of words in
Tweets before retrofitting
Vector representation of words in
Tweets after retrofitting
MOAC Ontology
Empathi ontology
Disaster Ontology
DBpedia
Gaur, Manas, et al. "empathi: An ontology for Emergency Managing and Planning about Hazard Crisis." 2019 IEEE 13th International Conference on Semantic
Computing (ICSC). IEEE, 2019.
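The before/after picture on this slide corresponds to the retrofitting update of Faruqui et al., in which each word vector is repeatedly replaced by a weighted average of its original embedding and the current vectors of its lexicon neighbours. The sketch below assumes NumPy, 2-d toy vectors, and hypothetical lexicon links between the four words shown on the slide; it is not the code used for the empathi work.

```python
import numpy as np


def retrofit(vectors, lexicon, iterations=10, alpha=1.0):
    """Pull each word vector toward its lexicon neighbours.

    vectors: dict word -> np.ndarray (original embeddings, left unmodified)
    lexicon: dict word -> list of related words from a semantic lexicon
    """
    new = {word: vec.copy() for word, vec in vectors.items()}
    for _ in range(iterations):
        for word, neighbours in lexicon.items():
            neighbours = [n for n in neighbours if n in new]
            if word not in new or not neighbours:
                continue
            beta = 1.0 / len(neighbours)  # equal weight per neighbour, summing to 1
            neighbour_sum = beta * sum(new[n] for n in neighbours)
            # Weighted average of the original vector and the current neighbours.
            new[word] = (alpha * vectors[word] + neighbour_sum) / (alpha + 1.0)
    return new


# Toy embeddings and hypothetical ontology links (assumptions for illustration).
vectors = {
    "damage": np.array([1.0, 0.0]),
    "infrastructure": np.array([0.0, 1.0]),
    "affected": np.array([-1.0, 0.0]),
    "population": np.array([0.0, -1.0]),
}
lexicon = {
    "damage": ["infrastructure"],
    "infrastructure": ["damage"],
    "affected": ["population"],
    "population": ["affected"],
}
retrofitted = retrofit(vectors, lexicon)
```

Words linked in the lexicon end up closer together, while each stays anchored to its original corpus-derived embedding through the `alpha` term.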

SuicideWatch Subreddit
(93K Users)
NYC CDRN EHR (123K patients) Data specific to
Mental Health
Medical Knowledge Bases
We identified self-harm, depressive feelings, and suicide ideations as
latent topics expressed in Reddit and EHR data.
Neither source provided evidence of mentions or expressions of
impulsivity, family violence, or drug abuse.
Shallow Infusion: Association between Social Media
and EHR in Suicide-related Communications
[Gaur, Psychiatry Under Review 2020]

Semi-Deep Infusion
In semi-deep infusion, external knowledge is
involved through an attention mechanism or
learnable knowledge constraints acting as a
sentinel to guide model learning.
➢ External Knowledge through Attention
➢ External Knowledge through Learnable Constraints

Tacit
Knowledge
Self-aware or
External Knowledge
Similarity
based
veriﬁcation
Semi-Deep Infusion
Dataset
Deep Learning
Model
Dataset
enrich
Deep Learning
Model
Tacit
Knowledge
Hypothesis testing
or similarity-based
veriﬁcation
Shallow Infusion
Self-aware or
External Knowledge
Comparing Semi-Deep Infusion with Shallow Infusion
Sheth, Amit, Manas Gaur, Ugur Kursuncu, and Ruwan Wickramarachchi. "Shades of Knowledge-Infused Learning for
Enhancing Deep Learning." IEEE Internet Computing 23, no. 6 (2019): 54-63.

A neural attention mechanism equips a neural network with the ability to focus on a subset of its
inputs (or features):
○ Hard Attention or Position-specific Attention: the locations of important entities and
relationships in the text are hard-coded in the model. This makes feature
engineering efficient; however, the model suffers from exposure bias.
○ Soft Attention: the model learns to attend to specific parts of the text while
generating the word describing that part (following distributional semantics).
○ Attention with Knowledge Base: background knowledge is integrated using an
attention mechanism, which decides whether to attend to background knowledge and
which information from KBs is useful.
External Knowledge through Attention
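The third variant above, attention over a knowledge base with the option of not attending to it, can be sketched with a sentinel vector that competes with the KB concept vectors for attention mass. This is a simplified illustration assuming NumPy and dot-product scoring; the function names and toy vectors are hypothetical, and published models compute the sentinel from the network state rather than fixing it.

```python
import numpy as np


def softmax(scores):
    """Numerically stable softmax over a 1-d score vector."""
    shifted = scores - scores.max()
    weights = np.exp(shifted)
    return weights / weights.sum()


def knowledge_attention(hidden, concept_vectors, sentinel):
    """Attend over KB concept vectors plus a sentinel.

    The last attention weight belongs to the sentinel; a large value there
    means the background knowledge is largely ignored for this input.
    """
    candidates = np.vstack([concept_vectors, sentinel])
    weights = softmax(candidates @ hidden)
    knowledge_state = weights @ candidates
    return knowledge_state, weights


concepts = np.array([[2.0, 0.0, 0.0], [0.0, 2.0, 0.0]])  # toy KB concept embeddings
sentinel = np.array([0.0, 0.0, 2.0])                     # toy sentinel vector

_, w_kb = knowledge_attention(np.array([1.0, 0.0, 0.0]), concepts, sentinel)
_, w_skip = knowledge_attention(np.array([0.0, 0.0, 1.0]), concepts, sentinel)
print(w_kb.argmax(), w_skip.argmax())
```

The returned `knowledge_state` would then be combined with the model's hidden state before prediction.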

● Learnable constraints are empirical thresholds (probabilistic values) learnt by the
model, which allow it to learn adaptively.
● This can be done in the following ways:
○ Learning based on pre-structured axiomatic rules - axiomatic knowledge
○ Learning based on difference in content similarity - KL Divergence,
Cross-entropy loss
○ Learning based on commonsense knowledge - ConceptNet
○ Learning over different permutations of text generated through synonyms,
antonyms, and homonyms.
External Knowledge through Learnable Constraints
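For the content-similarity option above, a KL-divergence term can be added to the task loss so that the model is penalized when its output distribution drifts from a knowledge-derived one. The sketch assumes NumPy and discrete distributions; the function names and the fixed `weight` are illustrative assumptions, and in a learnable-constraint setting that weight (or threshold) would itself be learned.

```python
import numpy as np


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions given as non-negative vectors."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))


def constrained_loss(task_loss, model_dist, knowledge_dist, weight=0.1):
    """Task loss plus a penalty for drifting from the knowledge-derived distribution."""
    return task_loss + weight * kl_divergence(knowledge_dist, model_dist)


knowledge = [0.9, 0.1]   # toy distribution derived from a knowledge source
agreeing = [0.9, 0.1]    # model output that matches it
disagreeing = [0.5, 0.5]  # model output that does not

print(constrained_loss(1.0, agreeing, knowledge) < constrained_loss(1.0, disagreeing, knowledge))
```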

Methods for Semi-Deep Infusion

_______ meant to ______ not to ______
Template: ﬁll in the blanks
It was meant to dazzle not to make sense
Target:
Generative
Model
It was meant to dazzle not to make it
Infilling Content
Matching through
averaged KL
Divergence
Learnable knowledge
constraint module
Learnable Constraints
Hu, Zhiting, Zichao Yang, Russ R. Salakhutdinov, Lianhui Qin, Xiaodan Liang, Haoye Dong, and Eric P. Xing. "Deep generative
models with learnable knowledge constraints." In Advances in Neural Information Processing Systems, pp. 10501-10512. 2018.
Replace the sentence
with KG or Resource

Semi-Deep Infusion : KG GANs
Generative Adversarial Network*
*Chang, Che-Han, Chun-Hsien Yu, Szu-Ying Chen, and Edward Y. Chang. "KG-GAN: Knowledge-Guided Generative Adversarial Networks."
arXiv preprint arXiv:1905.12261 (2019).
[Figure: KG-GAN architecture. Labeled components: seen-category data, unseen-category data, noise inputs Z1 and Z2, generators G1 and G2 (parameter sharing), real data, fake data from G1 and from G2, discriminator D (real or fake), embedding regression network, semantic embedding of unseen category, predictions for G1 and G2 (≅), losses for G1 and G2, objective function.]

Variants:
1. Knowledge base at each LSTM cell [1].
2. K-IL layer [2]:
a. 1D Convolutional Neural Network for mixing
b. Graph Convolutional Neural Network -- When
the hierarchical structure of the KG is important and
needs to be preserved in the representation.
c. Simple Multi-layer Perceptron.
[1] Yang, Bishan, and Tom Mitchell. Leveraging knowledge bases in lstms for improving machine reading. arXiv preprint arXiv:1902.09091 (2019).
[2] Ugur Kursuncu, Manas Gaur, and Amit Sheth. Knowledge Infused Learning (K-IL): Towards Deep Incorporation of Knowledge in Deep Learning. Proceedings of the AAAI 2020 Spring Symposium on
Combining Machine Learning and Knowledge Engineering in Practice (AAAI-MAKE 2020).
Semi-Deep Infusion : LSTMs

Deep Infusion (Vision)
Ugur Kursuncu, Manas Gaur, and Amit Sheth. Knowledge Infused Learning (K-IL): Towards Deep Incorporation of Knowledge in Deep Learning. Proceedings of the AAAI 2020 Spring
Symposium on Combining Machine Learning and Knowledge Engineering in Practice (AAAI-MAKE 2020).

K-IL : Objective Functions and Evaluation
Kullback Leibler Divergence
● Measures the Information loss during
the learning phase between
Latent/hidden states and KGs
● KG Embeddings: TransE, HolE, etc.
● Models: Variational Autoencoders,
LSTMs, GANs, Siamese Neural
Networks
● Frameworks: Zero Shot Learning ,
One Shot Learning, Transfer Learning,
Parameter Sharing
● Other Variants: Jensen Divergence,
Regularization, Integer Linear
Programming
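Among the KG embeddings listed above, TransE models a relation as a translation in vector space, so a triple (head, relation, tail) is plausible when head + relation lands near tail. The sketch below assumes NumPy and hand-picked 2-d toy vectors; real embeddings are learned from the graph, and the function name is illustrative.

```python
import numpy as np


def transe_score(head, relation, tail):
    """TransE plausibility: closer to 0 (higher) when head + relation is near tail."""
    return -float(np.linalg.norm(head + relation - tail))


# Hand-picked toy embeddings (assumptions, not learned values).
shelter_in_place = np.array([1.0, 0.0])
anxiety = np.array([1.0, 1.0])
causes = np.array([0.0, 1.0])
prevents = np.array([0.0, -1.0])

print(transe_score(shelter_in_place, causes, anxiety))
print(transe_score(shelter_in_place, prevents, anxiety))
```

Unlike an adjacency-only representation, the relation vector enters the score directly, so the two relation types are scored differently for the same pair of entities.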
Kosheleva, Olga, and Vladik Kreinovich. "Why deep learning methods use KL divergence instead of least squares: a possible pedagogical explanation." Математические структуры и
моделирование 2 (46) (2018).
Evaluation: Before and After
Knowledge-infusion
Methods (Apart from Precision, Recall, F1-score):
● Frechet Inception Distance : measure of similarity
between two datasets (KG & Training Data)
● Statistical Signiﬁcance Hypothesis Testing
● Word and Concept Features
● T-SNE Visualization of Clusters
● Area under perturbation curve:
Feature Ranking
● Human-centric evaluation: Crowdsourcing,
User Satisfaction, Mental Model, Trust
Assessment, Correctability
http://www-sop.inria.fr/members/Freddy.Lecue/presentation/ISWC2019-FreddyLecue-Thales-OnTheRoleOfKnowledgeGraphsInExplainableAI.pdf

Knowledge Infusion: Abstractive Summarization
of Clinical Diagnostic Interviews
Problem
Statement

Knowledge Infusion: Abstractive Summarization
of Clinical Diagnostic Interviews (Approach)

Sentence
length
Trigram language modeling
Informativeness
Find best path for
an interview slice

BERT: Statistical
Abstractive Summarization using Integer Linear Programming (ILP): Statistical + Constraints
Abstractive Summarization using ILP and PHQ-9: Statistical + Constraints + Knowledge
Manas G, Aribandi V, Kursuncu U, Alambo A, Shalin VL, Thirunarayan K, Beich J, Narasimhan M, Sheth A Knowledge-infused Abstractive Summarization of Clinical Diagnostic
Interviews , JMIR Preprints. 30/05/2020:20865 DOI: 10.2196/preprints.20865 URL: https://preprints.jmir.org/preprint/20865

Really struggling with my bisexuality which is causing chaos in my relationship with a girl. Being
a fan of LGBTQ community, I am equal to worthless for her. I’m now starting to get drunk
because I can’t cope with the obsessive, intrusive thoughts, and need to get out of my head.
BPD
DICD PND SAD SBI OCD
Don’t want to live anymore. Sexually assault, ignorant family members and my never
ending loneliness brights up my path to death.
SCW
PND SBI SAD DPR DICD
DPR
I do have a potential to live a decent life but not with people who abandon me.
Hopelessness and feelings of betrayal have turned my nights to days. I am developing
insomnia because of my restlessness.
SBI DPR DICD
BPD
I just can’t take it anymore. Been abandoned yet again by someone I cared about. I've been
diagnosed with borderline for a while, and I’m just going to isolate myself and sleep forever.
SBI PND
Linking Reddit to DSM-5 : Web-based Intervention
Reddit DSM-5 [Gaur 2018]

Mapping to SNOMED Concept Illustration
Really struggling with my bisexuality which is
causing chaos in my relationship with a girl.
Being a fan of LGBTQ community, I am equal to
worthless for her. I’m now starting to get drunk
because I can’t cope with the obsessive,
intrusive thoughts, and need to get out of my
head.
288291000119102: High risk bisexual behavior
365949003: Health-related behavior ﬁnding 365949003: Health-related behavior ﬁnding
307077003: Feeling hopeless
365107007: level of mood
225445003: Intrusive thoughts
55956009: Disturbance in content of thought
26628009: Disturbance in thinking
1376001: Obsessive compulsive personality disorder

Mapping Reddit to DSM-5
Medical Knowledge Bases
N-grams
(n=1, 2, 3)
LDA
LDA over
Bi-grams
Normalized
Hit
Score
DSM-5
Lexicon
<Reddit Post>
<Subreddit Label>
Input
<Reddit Post>
<DSM-5 Label>
Output
DAO
Drug Abuse
Ontology

Mapping Reddit to DSM-5
http://www.papersfromsidcup.com/graham-daveys-blog/changes-in-dsm-5

Reddit to DSM-5
Task
I know you want me to say no and that it is a
part of me blah blah blah. But I can't. Honestly,
not having bipolar disorder would be a huge
blessing. I would be so much happier and
could control my life better. I wouldn't have
frantic, scattered thoughts and depression. I
would be normal, happy, and less dramatic.
Bipolar Subreddit
DSM-5: Depressive Disorder
BiPolar
Depression
Disorder
Subreddits DSM-5
Chapter
BiPolarReddit
BiPolarSOS
Depression
Addiction
Substance use &
Addictive Disorder
Crippling Alcoholism
Opiates Recovery
Opiates
Self-Harm
Stop Self-Harm

Semantic Encoding and Decoding Optimization
12808
Words
300 dimension embedding 300 dimension embedding
20 DSM-5
Categories
R
D
Reddit Word
Embedding Model
DSM-5 -DAO
Lexicon
W
Solvable Sylvester Equation

Semantic Encoding and Decoding Optimization
Encoding DSM-5 to Reddit embedding space
Decoding Reddit to DSM-5 embedding space
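The encoding/decoding slides describe a weight matrix W linking the Reddit embedding matrix R and the DSM-5 lexicon embedding matrix D, obtained from a solvable Sylvester equation. The exact SEDO objective is not shown on the slides, so the sketch below assumes a semantic-autoencoder-style objective, ||R - WᵀD||² + λ||WR - D||², whose zero-gradient condition is the Sylvester equation (DDᵀ)W + W(λRRᵀ) = (1+λ)DRᵀ. It uses `scipy.linalg.solve_sylvester` on small random stand-ins for the 12808-word and 20-category matrices.

```python
import numpy as np
from scipy.linalg import solve_sylvester


def solve_mapping(source, target, lam=1.0):
    """Closed-form W for the assumed encode/decode objective.

    Minimizes ||source - W.T @ target||^2 + lam * ||W @ source - target||^2.
    Setting the gradient to zero gives the Sylvester equation
        (target @ target.T) W + W (lam * source @ source.T)
            = (1 + lam) * target @ source.T
    """
    a = target @ target.T
    b = lam * (source @ source.T)
    q = (1.0 + lam) * (target @ source.T)
    return solve_sylvester(a, b, q)


rng = np.random.default_rng(0)
R = rng.normal(size=(6, 5))  # toy stand-in: 6 words x 5 embedding dimensions
D = rng.normal(size=(3, 5))  # toy stand-in: 3 categories x 5 embedding dimensions
W = solve_mapping(R, D)      # shape (3, 6): categories x words

encoded = W @ R    # word space -> category space
decoded = W.T @ D  # category space -> word space
print(W.shape, encoded.shape, decoded.shape)
```

Under this assumed objective, a single W serves both directions: W maps word-level representations toward the category space and its transpose maps back, which matches the encoding and decoding roles named on the slide.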

Outcome
Domain-specific
Knowledge lowers
False Alarm Rates.
2005-2016
550K Users
8 Million
Conversations
15 Mental Health
Subreddits
[Gkotsis 2017][Saravia 2016]
[Park 2018]

Method (with HLF, VLF, and FGF) | Precision | Recall | F1-Score
BRF - Contextual Features (CF) | 0.60 | 0.54 | 0.57
BRF - CF (SEDO Weights generated from DSM-5 Lexicon without DAO) | 0.87 | 0.77 | 0.82
BRF - CF (SEDO Weights generated from DSM-5 Lexicon with DAO without Slang Terms) | 0.87 | 0.80 | 0.83
BRF - CF (SEDO Weights generated from DSM-5 Lexicon without DAO with Slang Terms) | 0.85 | 0.82 | 0.83
BRF - CF (SEDO Weights generated from DSM-5 Lexicon with DAO and Slang Terms) | 0.88 | 0.83 | 0.85
Outcome
Model and Annotator Agreement: 84%

Mapping Social Media to EHR using KG
TwADR
AskaPatient
Drug Abuse
Ontology
DSM-5 Lexicon
Suicide Risk
Severity Lexicon
Treatment Information
Observation and
Drug-related
Information
Mental Health Condition
Suicide Risk Levels
Ideation
Behavior
Attempt

Mental Healthcare KB for Social Media

Resources
TwADR and AskaPatient Lexicon: https://zenodo.org/record/55013#.XsYEH8YpBQI
Ref: Limsopatham, Nut, and Nigel Collier. "Normalising medical concepts in social media texts by learning semantic representation." Association for Computational Linguistics, 2016.
Suicide-Risk Severity Lexicon: https://bit.ly/SRS_lexicon
Ref: Gaur, Manas, Amanuel Alambo, Joy Prakash Sain, Ugur Kurşuncu, Krishnaprasad Thirunarayan, Ramakanth Kavuluru, Amit Sheth, Randy Welton, and Jyotishman Pathak. "Knowledge-aware assessment of severity of suicide risk for early intervention." In The World Wide Web Conference, 2019.
DSM-5 and Drug Abuse Ontology Lexicon: https://bit.ly/DSM5_DAO
Ref: Gaur, Manas, Ugur Kurşuncu, Amanuel Alambo, Amit Sheth, Raminta Daniulaityte, Krishnaprasad Thirunarayan, and Jyotishman Pathak. "'Let Me Tell You About Your Mental Health!' Contextualized Classification of Reddit Posts to DSM-5 for Web-based Intervention." In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, 2018.
Suicide Risk Severity Dataset (Reddit): https://zenodo.org/record/2667859#.XsYH7MYpBQI
Ref: Gaur, Manas, Amanuel Alambo, Joy Prakash Sain, Ugur Kurşuncu, Krishnaprasad Thirunarayan, Ramakanth Kavuluru, Amit Sheth, Randy Welton, and Jyotishman Pathak. "Knowledge-aware assessment of severity of suicide risk for early intervention." In The World Wide Web Conference, 2019.

Other Works: Not Covered
80
Manas Gaur, Amanuel Alambo, Joy Prakash Sain, Ugur Kursuncu, Krishnaprasad Thirunarayan, Ramakanth
Kavuluru, Amit Sheth, Randy Welton, and Jyotishman Pathak.
Knowledge-aware assessment of severity of suicide risk for early intervention. In WWW 2019
Manas Gaur, Vamsi Aribandi, Amanuel Alambo, Ugur Kursuncu, Krishnaprasad Thirunarayan, Jonathan Beich,
Jyotishman Pathak, and Amit Sheth
Characterization of Time-variant and Time-invariant Assessment of Suicidality on Reddit using C-SSRS
Under Review in Nature Scientiﬁc Reports
Manas Gaur, Aditya Sharma, Ugur Kursuncu, Valerie L. Shalin and Amit Sheth
Knowledge-Guided Convolutional Autoencoder Clustering for Associating Support Seeker and Support
Providers in Online Mental Health Communities
Amanuel Alambo and Krishnaprasad Thirunarayan
Depressive, Drug Abusive, or Informative: Knowledge-aware Study of News Exposure during COVID-19
Outbreak. In ACM KDD KiML Workshop 2020

References
● Manas Gaur, Ugur Kursuncu, Amanuel Alambo, Amit Sheth, Raminta Daniulaityte, Krishnaprasad Thirunarayan, and Jyotishman Pathak. "'Let Me Tell
You About Your Mental Health!' Contextualized Classification of Reddit Posts to DSM-5 for Web-based Intervention." CIKM, 2018.
● Manas Gaur*, Chidubem Arachie*, Sam Anzaroot, William Groves, Ke Zhang, and Alejandro Jaimes. "Unsupervised Detection of Sub-Events in Large
Scale Disasters." AAAI 2020.
● Manas Gaur, Amanuel Alambo, Joy Prakash Sain, Ugur Kursuncu, Krishnaprasad Thirunarayan, Ramakanth Kavuluru, Amit Sheth, Randy Welton, and
Jyotishman Pathak. "Knowledge-aware assessment of severity of suicide risk for early intervention." WWW 2019.
● Manas Gaur, Saeedeh Shekarpour, Amelie Gyrard, and Amit Sheth. "empathi: An ontology for emergency managing and planning about hazard crisis."
ICSC, 2019.
● Gyrard, Amelie, Manas Gaur, Saeedeh Shekarpour, Krishnaprasad Thirunarayan, and Amit Sheth. "Personalized health knowledge graph." ISWC 2018.
● Amit Sheth, Manas Gaur, Ugur Kursuncu, and Ruwan Wickramarachchi. "Shades of knowledge-infused learning for enhancing deep learning." IEEE
Internet Computing 2019.
● Shreyansh Bhatt, Manas Gaur, Beth Bullemer, Valerie Shalin, Amit Sheth, and Brandon Minnery. "Enhancing crowd wisdom using explainable diversity
inferred from social media." Web Intelligence 2018.
● Kursuncu, Ugur, Manas Gaur, and Amit Sheth. "Knowledge Infused Learning (K-IL): Towards Deep Incorporation of Knowledge in Deep Learning.", AAAI
Spring Symposium 2020.
● Williams, Ronald J., and David Zipser. "A learning algorithm for continually running fully recurrent neural networks." Neural computation, 1989.
● Lamb, Alex M., Anirudh Goyal Alias Parth Goyal, Ying Zhang, Saizheng Zhang, Aaron C. Courville, and Yoshua Bengio. "Professor forcing: A new
algorithm for training recurrent networks." NIPS 2016.
● Yang, Bishan, and Tom Mitchell. "Leveraging Knowledge Bases in LSTMs for Improving Machine Reading." ACL 2017.
● Hu, Zhiting, Zichao Yang, Russ R. Salakhutdinov, Lianhui Qin, Xiaodan Liang, Haoye Dong, and Eric P. Xing. "Deep generative models with
learnable knowledge constraints." NIPS 2018.
81

Cyber Social Threats
Institute
kursuncu@mailbox.sc.edu
@UgurKursuncu

Critical Points on Cyber Social Threats
● Context in social media
conversations is ﬂuid and shades of
gray.
● False alarms in the models
developed and deployed.
● Ethical considerations and
consequences. Bias and
transparency. Implications on the
mass population.
● The role of knowledge in improving
models on these critical points.
84
Photo: @budhelisson Unsplash.com

Online Extremism - Ongoing Open Problem
85
● Efforts by online platforms are
inadequate.
● Governments insist that the
industry has a ‘social
responsibility’ to do more to
remove harmful content.
● If unsolved, social media
platforms will continue to
negatively impact society.

Online Extremism - Covid 19
86

Radicalization
Challenges & Potential Solutions
● Modeling users (e.g., recruiter, follower) with respect to different stages of radicalization.
● Persuasive content and psychological process over time.
● Domain Knowledge relevant to Islamist extremism.
● Multidimensionality of the context (“jihad” has different meanings in different contexts).

88
Radicalization Scale (Dilshod Achilov et al.)
0 (None): Mainstream religious views and orientations. Indicator: Islam; Allah; jihad (self struggle); halal; democracy, islam, salah, fatwa, hajj.
1 (Low): Attitudinal support for politically moderate Islamism. Indicator: Hadith; Caliphate (Khilafah) justified; Sharia better (than secular law); Hypocrisy west.
2 (Elevated): Emergent support for exclusive rule of the Shari’a law. Indicator: Shariah best; revenge (justified); jihad (against West); justify Daesh (ISIS)
3 (High): Support for extremist networks and travel to “Darul Islam”. Indicator: Kafir; infidel; hijrah to Darul-Islam; (supporting) fatwa Al-Awlaki; mushrikeen.
4 (Severe): Call for action to join the fight and the use of violence. Indicator: apostate; sahwat; taghut; kill; kafir; kuffar; murtadd; tawaghit; al_baghdadi; martyrdom khilafah

89
Analysis of content in context can provide deeper understanding of the factors characterizing the radicalization process.
Radicalization Process over time: 0 (None) → 1 (Low) → 2 (Elevated) → 3 (High) → 4 (Severe), from a non-extremist ordinary individual to a radicalized extremist individual.

Cautionary Note
90
Local and global security implications: there is a need for reliable and fair methods of predicting online terrorist activities.
False alarms might potentially impact millions of innocent people; specifically, the unfair classification of non-extremist individuals as extremist.

Dataset
● Accounts verified and suspended by Twitter.
● Time frame: Oct 2010 – Aug 2017
● Includes 538 extremist users, from two resources (Fernandez, 2018) (Ferrara, 2016):
○ Users verified by Twitter's anti-abuse team.
○ Lucky Troll Club
● 538 non-extremist users were selected from an annotated religious dataset that contains Muslim users (Chen, 2014).
- Miriam Fernandez, Moizzah Asif, and Harith Alani. 2018. Understanding the roots of radicalisation on twitter. In Proceedings of the 10th ACM Conference on Web Science.
- Emilio Ferrara, Wen-Qiang Wang, Onur Varol, Alessandro Flammini, and Aram Galstyan. 2016. Predicting online extremism, content adopters, and interaction reciprocity. In International conference on social informatics.
- Chen, L., Weber, I., & Okulicz-Kozaryn, A. (2014, November). US religious landscape on Twitter. In International Conference on Social Informatics (pp. 544-560). Springer, Cham.

Extremist Content
92
Prevalent Key Phrases Prevalent Topics
isis, syria, kill, iraq, muslim, allah, attack, break, aleppo, assad,
islamicstate, army, soldier, cynthiastruth, islam, support, mosul,
libya, rebel, destroy, airstrike
Caliphate_news, islamic_state, iraq_army, soldier_kill,
iraqi_army, syria_isis, syria_iraq, assad_army, terror_group,
shia_militia, isis_attack, aleppo_syria, martyrdom_operation,
ahrar_sham, assad_regime, follow_support, lead_coalition,
turkey_army, isis_claim, kill_isis
Imam_anwar_awlaki, video_message_islamicstate,
fight_islamic_state, isisclaim_responsibility_attack,
muwahideen_powerful_middleeast, isis_tikrit_tikritop,
amaqagency_islamicstate_fighter, sinai_explosion_target,
alone_state_fighter, intelligence_reportedly_kill,
khilafahnew_islamic_state, yemanqaida_commander_kill,
isis_militant_hasakah, breakingnew_assad_army,
isis_explode_middle, hater_trier_haleemah, trust_isis_tighten,
qamishlus_isis_fighting, defeat_enemy_allah,
kill_terrorist_baby, ahrar_sham_leader
islamic state, syria, isis, kill, allah, video, minute
propaganda video scenes, jaish islam release,
restock missile, kaffir, join isis, aftermath, mercy,
martyrdom operation syrian opposition, punish libya
isis, syria assad, islam sunni, swat, lose head,
wilayatalfurat, somali, child kill, takfir, jaish fateh,
baghdad, iraq, kashmir muslim, capture, damascus,
report rebel, british, qala moon, jannat, isis capture,
border cross, aleppo, iranian soldier, tikrit tikrittop,
lead shia military kill, saleh abdeslam refuse
cooperate
Green: Religion
Blue: Ideology
Red: Hate
Corpus: 538 Twitter verified extremists, 48K tweets

93
● Dimensions to deﬁne the context:
○ Based on literature and our empirical study of the data,
three contextual dimensions are identiﬁed:
Religion, Ideology, Hate
● The distribution of prevalent terms (i.e., words, phrases,
concepts) in each dimension is different.
● Different dimensions are needed to contextualize and
disambiguate common ‘diagnostic’ terms (e.g., jihad).
Multidimensionality of Extremist Content

94
Example Tweets with “Jihad”
“Reportedly, a number of apostates were killed in the process. Just because they like it I guess.. #SpringJihad #CountrysideCleanup”
“Kindness is a language which the blind can see and the deaf can hear #MyJihad be kind always”
“By the Lord of Muhammad (blessings and peace be upon him) The nation of Jihad and martyrdom can never be defeated”
“Jihad” can appear in tweets with different meanings in different dimensions of the context.
Dimension labels in the figure: H (Hate), I (Ideology), R (Religion)

95
Ambiguity of Diagnostic terms/phrases
● The same term can have different meanings in each dimension.
● Example: the meaning of “jihad” is different for extremists and non-extremists.
○ For extremists, the meaning is closer to “awlaki”, “islamic state”, “aqeedah”
○ For non-extremists, it is closer to “muslims”, “quran”, “imams”
[Figure panels: Extremists, Non-Extremists]

Contextual Dimension Modelling
96
● Different Contextual Dimensions incorporating:
○ Knowledge Graphs
○ Dimension Corpora
● Machine/deep learning models are used to generate knowledge-enhanced representations.
● Resources for Dimensions:
○ Religion: Qur’an, Hadith
○ Ideology: Books, lectures of ideologues
○ Hate: Hate Speech Corpus (Davidson et al. 2017)
● Can be applied to many social problems.
[Figure: Dimension Modeling Process. Dimensions 1, 2, and 3 are each modeled separately, yielding a dimension-based, knowledge-enhanced representation.]

Using a Knowledge Graph
97
“You shall know a word by the company it keeps” - J.R. Firth (1957:11)
Capturing similarity:
● Learning word similarities from a substantial knowledge graph.
● A solution via distance between concepts in the knowledge graph.
[Figure label: Modeling (Hate)]
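The idea of measuring similarity by distance between concepts in a knowledge graph can be sketched with a shortest-path search. The graph below is a toy, hypothetical example (not one of the tutorial's knowledge graphs), and the 1 / (1 + hops) similarity is one simple choice among many; the slides do not specify the measure used.

```python
from collections import deque

# Toy, hypothetical concept graph (undirected edges), for illustration only.
EDGES = [
    ("hate", "hostility"),
    ("hostility", "aggression"),
    ("aggression", "violence"),
    ("hate", "prejudice"),
    ("kindness", "compassion"),
    ("compassion", "emotion"),
    ("hostility", "emotion"),
]


def build_graph(edges):
    """Adjacency sets for an undirected graph."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path_length(graph, source, target):
    """Breadth-first search; returns the hop count, or None if unreachable."""
    if source not in graph or target not in graph:
        return None
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph[node]:
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def path_similarity(graph, a, b):
    """Similarity that decays with graph distance: 1 / (1 + hops)."""
    hops = shortest_path_length(graph, a, b)
    return 0.0 if hops is None else 1.0 / (1.0 + hops)


graph = build_graph(EDGES)
print(path_similarity(graph, "hate", "hostility"))
print(path_similarity(graph, "hate", "kindness"))
```

In this toy graph, "hate" sits one hop from "hostility" but four hops from "kindness", so the first pair scores higher.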

Using a Corpus
98
“You shall know a word by the company it keeps” - J.R. Firth (1957:11)
Capturing similarity (and resolving ambiguity):
● Learning word similarities from a large corpus.
● A solution via distributional similarity-based representations.
[Figure label: Modeling (Hate)]
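A minimal sketch of distributional similarity, assuming simple co-occurrence counts within a context window in place of the trained embedding models a real system would use. The four-sentence corpus is made up for illustration.

```python
import math
from collections import Counter, defaultdict

# Tiny made-up corpus; a real dimension corpus would be far larger.
CORPUS = [
    "be kind and patient with people",
    "be gentle and patient with people",
    "they attack and destroy the city",
    "they attack and burn the city",
]


def cooccurrence_vectors(sentences, window=2):
    """Represent each word by counts of the words seen within `window` positions."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            start, stop = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


vectors = cooccurrence_vectors(CORPUS)
sim_kind_gentle = cosine(vectors["kind"], vectors["gentle"])
sim_kind_destroy = cosine(vectors["kind"], vectors["destroy"])
print(sim_kind_gentle, sim_kind_destroy)
```

"kind" and "gentle" appear in identical contexts here, so they score higher than "kind" and "destroy", which share only the context word "and".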

● For religion:
Extremist and non-extremist users are signiﬁcantly similar to each other.
● For hate:
Extremist and non-extremist users do not show much similarity.
Religion Ideology
NonExtremists
Extremists
99
Religion Ideology Hate
User Similarity

● For religion and hate, among extremists:
There seems to be a number of users that are signiﬁcantly different from
each other.
● Possibility of outliers.
Extremists
Extremists
100
Religion Ideology Hate
User Similarity

● A group of extremist users forms a cluster farther from other users for
Religion and Hate.
● This suggests there might be outliers in the dataset.
101
User Visualization for Dimensions

● Randomly selected 10 users and visualized them for each dimension.
● Repeated this selection many times; every time, the same users formed a
separate cluster. In the case below, these users are D and A.
102
Random 10 Users
User Visualization for Dimensions

Outlier Detection
● Identified 99 (18%), 48 (9%) and 141 (26%) users in the extremist dataset, clustered as likely outliers for religion, ideology and hate, respectively.
● A random sample of 76 users (15%) from the extremist dataset was used to validate the identified likely outliers.
● Our domain expert annotated these users as likely extremist, likely non-extremist, and unclear.
Kappa Score = 82%
Separation of users within the extremist dataset through clustering
Mann-Whitney U-test
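The Mann-Whitney U statistic named above compares two groups by counting, over all cross-group pairs, how often a value from one group exceeds a value from the other. The sketch below computes only the statistic (no p-value) on hypothetical per-user scores; how the test was applied to the clusters in the study is not detailed on the slide.

```python
def mann_whitney_u(sample_a, sample_b):
    """U statistic for sample_a: pairs (a, b) with a > b count 1, ties count 0.5."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical per-user similarity scores for two clusters (not study data).
cluster_main = [0.81, 0.77, 0.85, 0.79, 0.90]
cluster_outlier = [0.42, 0.55, 0.38, 0.61]

u_main = mann_whitney_u(cluster_main, cluster_outlier)
u_outlier = mann_whitney_u(cluster_outlier, cluster_main)
print(u_main, u_outlier)
```

The two U values always sum to the number of cross-group pairs; a U near 0 or near that maximum indicates well-separated groups. For a p-value, `scipy.stats.mannwhitneyu` can be used where SciPy is available.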

● Obtained the set of 49 outlier users in
the extremist dataset. The rest are labeled
as likely extremists.
● Content of the outlier users contains
the following prevalent concepts:
marriage, Allah, bonded, silence, Islam
leaders, Berjaya hilarious, cake, miss
mit, kemaren, Quran, Khuda, prophet,
Muhammad, Ahmad.
Separation of users within the extremist dataset
through clustering
Outliers

Results
105
● Tri-dimension model
performs best.
● Precision is used as the metric, to
emphasize the reduction of
misclassification of
non-extremist content.
● Implications in a large scale
application.

106
Key Insights
● Domain-specific knowledge plays a critical role, as does ground truth, for such complex problems.
● False alarms are significantly reduced via the incorporation of three domain-specific dimensions. This further reduces the likelihood of unfair treatment of non-extremist individuals in a potential real-world deployment.
● Misclassification of non-extremist users can have significant implications in a large-scale application where non-extremists vastly outnumber extremists.
● Higher precision reduces potential social discrimination.
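The point about scale can be made concrete with a small base-rate calculation. All numbers below (population size, prevalence, recall, false positive rate) are hypothetical and chosen only to illustrate the arithmetic; none come from the study.

```python
def expected_counts(population, prevalence, recall, false_positive_rate):
    """Expected true positives, false positives, and precision for a classifier."""
    positives = population * prevalence
    negatives = population - positives
    true_pos = positives * recall
    false_pos = negatives * false_positive_rate
    flagged = true_pos + false_pos
    precision = true_pos / flagged if flagged else 0.0
    return true_pos, false_pos, precision


# Hypothetical scenario: 1M users, 0.1% extremist, 90% recall, 1% false positive rate.
tp, fp, precision = expected_counts(1_000_000, 0.001, 0.90, 0.01)
print(f"true positives: {tp:.0f}, false alarms: {fp:.0f}, precision: {precision:.3f}")
```

Under these assumed numbers, roughly 9,990 non-extremist users would be flagged against about 900 extremists, so fewer than one in ten flagged accounts would be a true positive even though the false positive rate is only 1%.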

● Extremist users employ religion along with hate,
suggesting they employ different hate tactics for their
targets.
● Each dimension plays different roles in different levels of
radicalization, capturing nuances as well as linguistic and
semantic cues better throughout the radicalization
process.
107
Key Insights

Our Highly Multidisciplinary Approach
108
Public/
Society
Social
Interactions
Cognitive
Neuro
Cognitive
Process
● Human brain processes information from extremist
narratives on social media, that includes different
contexts, emotions, sentiment, etc.
● Individuals change behavior, make choices in
consuming/sharing content with an intent.
● Coordination, information ﬂow and diffusion on
social networks.
● Outcomes/impact on society through events and
collective actions (eg, civil war or result of an
election).
Neural

References
● Ugur Kursuncu. “Modeling the Persona in Persuasive Discourse on Social Media Using Context-aware and
Knowledge-driven Learning.” University of Georgia. 2018.
● Ugur Kursuncu, Manas Gaur, Carlos Castillo, Amanuel Alambo, Krishnaprasad Thirunarayan, Valerie Shalin,
Dilshod Achilov, I. Budak Arpinar, and Amit Sheth. "Modeling Islamist extremist communications on social
media using contextual dimensions: Religion, ideology, and hate." CSCW 2019.
● Ugur Kursuncu, Manas Gaur, and Amit Sheth. "Knowledge infused learning (K-IL): Towards deep
incorporation of knowledge in deep learning." Proceedings of the AAAI 2020 Spring Symposium on
Combining Machine Learning and Knowledge Engineering in Practice (AAAI-MAKE 2020).
● Ugur Kursuncu, Manas Gaur, Usha Lokala, Krishnaprasad Thirunarayan, Amit Sheth, and I. Budak Arpinar.
"Predictive analysis on Twitter: Techniques and applications." Springer Nature 2019.
● Ugur Kursuncu, Manas Gaur, Usha Lokala, Anurag Illendula, Krishnaprasad Thirunarayan, Raminta
Daniulaityte, Amit Sheth, and I. Budak Arpinar. "What's ur Type? Contextualized Classiﬁcation of User Types
in Marijuana-Related Communications Using Compositional Multiview Embedding." Web Intelligence 2018.
● Ugur Kursuncu, Manas Gaur, Krishnaprasad Thirunarayan, Amit Sheth, “Explainability of medical ai through
domain knowledge”, Ontology Summit Communications 2019.
109

Knowledge Graphs for Autonomous
Driving
Institute
ruwan@email.sc.edu
@ruwantw

Overview
111
Context
Understanding

Approach
112
Evaluation of Knowledge Graph Embeddings (KGEs) for the Automotive Driving Domain

Building the Knowledge Graph
113

Building the Knowledge Graph
114

Translating the KG into a KG Embedding
115

Translating the KG into a KG Embedding
116

Intrinsic Evaluation: Overview
117

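Intrinsic Evaluation: KGs w/ various levels of information
118

Intrinsic Evaluation: KGs w/ various levels of information
119

Intrinsic Evaluation: KGs w/ various levels of information
120

Intrinsic Evaluation: KGs w/ various levels of information
121
Intrinsic Evaluation - Results of Lyft

Intrinsic Evaluation: KGs w/ various levels of information
122
Intrinsic Evaluation - Results of NuScenes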

References
● Ruwan Wickramarachchi, Cory Henson, and Amit Sheth. "An evaluation of knowledge graph embeddings for
autonomous driving data: Experience and practice." AAAI Spring Symposium 2020.
● Oltramari, Alessandro, Jonathan Francis, Cory Henson, Kaixin Ma, and Ruwan Wickramarachchi. "Neuro-symbolic
Architectures for Context Understanding." Knowledge Graphs for eXplainable AI, IOS Press 2020.
● Alessandro Oltramari, Cory Henson, Ruwan Wickramarachchi, Don Brutzman and Richard Markeloff. “Hybrid AI for
Context Understanding” 3rd U.S. Semantic Technologies Symposium, Raleigh, NC 2020
https://us2ts.org/2020/program-hybrid-ai
● Cory Henson, Stefan Schmid, Anh Tuan Tran, and Antonios Karatzoglou. "Using a Knowledge Graph of Scenes to
Enable Search of Autonomous Driving Data." In ISWC Satellites, pp. 313-314. 2019.
● Holger Caesar, Varun Bankiti, Alex H. Lang, Sourabh Vora, Venice Erin Liong, Qiang Xu, Anush Krishnan, Yu Pan,
Giancarlo Baldan, and Oscar Beijbom. "nuscenes: A multimodal dataset for autonomous driving." In Proceedings of
the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11621-11631. 2020.
● Kesten, R., M. Usman, J. Houston, T. Pandya, K. Nadhamuni, A. Ferreira, M. Yuan et al. "Lyft level 5 av dataset 2019."
● Alshargi, Faisal, Saeedeh Shekarpour, Tommaso Soru, and Amit Sheth. “Metrics for evaluating quality of embeddings
for ontological concepts”. AAAI Spring Symposium 2019.
125

Knowledge-Infused NLP
for Understanding Content on DarkNet
Institute
shweta@knoesis.org
@shweta_yadav_3

Research Question
128
Does semantically enriching the natural language processing
algorithm with domain-specific knowledge increase the
coverage in text understanding?

Darknet and Anonymity
129
Access
Cryptomarkets provide anonymity
to both buyers and sellers:
• Location on the Dark Web, which
requires speciﬁc software to
access (e.g., Tor, I2P);
• Use of untraceable
cryptocurrencies (e.g., bitcoin);
• Privacy and anonymity;
Approximately two thirds of the
goods sold on cryptomarkets are
drugs (EMCDDA, 2018).

Transaction in Cryptomarket
130
Cryptocurrencies
• Based on decentralized blockchain technologies
• Identiﬁed by encrypted code
• Approximately 1800 cryptocurrencies
• Most commonly used on cryptomarkets: Bitcoin,
Litecoin, Monero.
Image Source: 1. Zhang, Yiming, et al. "Your style your identity: Leveraging writing and photography styles for drug trafficker identification in darknet markets over attributed heterogeneous
information network." The World Wide Web Conference. 2019.
Image Source 2. https://www.investopedia.com/terms/b/blockchain.asp

Motivation
131
◉ Darknet markets have grown substantially even with government
interventions from 2013-2016 [1]
[1] Kristy Kruithof. 2016. Internet-facilitated drugs trade: An analysis of the size, scope and the role of the Netherlands. RAND.
Feature Growth
Total revenue 2x
Total number of transactions 3x
Total number of listings 5.5x
Total number of listings per vendor 2x
Incremental growth of the Darknet Market [1]

Motivation
132
◉ Drug Traﬃckers may maintain multiple accounts across different
markets or in the same market
◉ Linking different accounts to the same individuals is essential to track
their status and better understand the online drug traﬃcking ecosystem
◉ Illegal trading of drugs in these markets has turned into a serious global
concern because of its severe consequences on society (e.g., violent
crimes) and public health at regional, national and international levels

Snapshot of Darknet Market
133

Problem Statement
134
◉ The task involves the detection of similarity between two vendors on online forums, i.e., Darknet, Reddit, and Twitter (identification of sybil accounts).
◉ Formally, given any two vendors $v_a$ and $v_b$ associated with the respective sites $s_i$ and $s_j$, our goal is to develop a similarity measure $sim(v_a^{s_i}, v_b^{s_j})$ between the two vendors using various characteristics/patterns.

Dataset Creation
135
◉ Data extracted using eDarkTrends platform [5] with 1992 unique vendors
collected over 3 different sites.
[5] Usha Lokala, Francois R Lamy, Raminta Daniulaityte, Amit Sheth, Ramzi W Nahhas, Jason I Roden, Shweta Yadav, and Robert G Carlson. 2019. Global trends, local harms:
availability of fentanyl-type drugs on the dark web and accidental overdoses in Ohio. Computational and Mathematical Organization Theory 25, 1 (2019), 48–59.
Dark Web Sites | Dream Market | Tochka | Wall Street | All
Unique # Vendor names | 1448 | 408 | 466 | 1992
Unique # Substance | 852 | 313 | 290 | 1148
Unique # Location | 356 | 44 | 29 | 389
Unique # Descriptions | 16800 | 1829 | 1723 | 18472

Methodology: Modelling Multi-view Learning
137
◉ Multi-view learning is an ideal learning
mechanism for the data where examples are
characterized by distinct (often orthogonal)
feature sets.
◉ Generalize and improve the performance by
exploiting the diverse views from multiple rich
sources such as textual, stylometric, and location
representation.
Image Source: 1. Zhang, Yiming, et al. "Your style your identity: Leveraging writing and photography styles for drug trafficker identification in darknet markets
over attributed heterogeneous information network." The World Wide Web Conference. 2019.

Summary of Approach: eDarkFind
138

Knowledge Infusion: Drug Abuse Ontology
139
◉ The Drug Abuse Ontology (DAO) is a formal
representation of concepts and relationships
between them for the prescription drug abuse
domain.
◉ The current DAO contains 241 classes and 37
properties.
◉ DAO identifies all variants of a concept in data
(e.g., generic names, slang terms, scientific
names).
◉ DAO contains names of psychoactive
substances (e.g., heroin, fentanyl), including
synthetic substances (e.g., U-47,700, MT-45),
brand and generic names of pharmaceutical
drugs (e.g., Duragesic, fentanyl transdermal
system) and slang terms (e.g., roxy, fent).

140
Augmentation with drug slang terms enables understanding of drug abuse-related
textual descriptions, which had not been well explored.
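A minimal sketch of how slang augmentation might be applied to text, assuming a flat variant-to-canonical lookup. The three-entry lexicon is a toy stand-in built from examples mentioned in these slides; it is not the DAO itself, and the function name is hypothetical.

```python
import re

# Toy lexicon (variant -> canonical name); a hypothetical stand-in for DAO lookups.
SLANG_TO_CANONICAL = {
    "fent": "fentanyl",
    "horse": "heroin",
    "duragesic": "fentanyl transdermal system",
}


def normalize_description(text, lexicon=SLANG_TO_CANONICAL):
    """Replace known slang or brand tokens with their canonical names."""

    def replace(match):
        token = match.group(0)
        return lexicon.get(token.lower(), token)

    # Match alphabetic tokens (allowing digits and hyphens after the first letter).
    return re.sub(r"[A-Za-z][A-Za-z0-9\-]*", replace, text)


normalized = normalize_description("Listing mentions horse and fent")
print(normalized)
```

Normalizing variants before encoding means two listings that use different names for the same substance end up with matching features. A real pipeline would also need to disambiguate ordinary uses of words such as "horse", which this sketch does not attempt.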

141
◉ DAO contains information regarding the
route of administration (e.g., oral, IV), unit
of dosage (e.g., gr, gram, pint, tablets),
physiological effects (e.g., dysphoria,
vomiting) and substance form (e.g.,
powder, liquid, hcl)
◉ The DAO is also enriched with links to
concepts in external ontologies, through a
very careful manually supervised process.
Among the 43 DAO classes, 11 classes
have been mapped to URIs in DrugBank,
Freebase, DBpedia and the Cyc ontologies,
using the sameAs property.

142

Location and Substance View Encoding
144
◉ Utilize simple binary encoding to obtain the view representation:
◉ Add a self information weight or information content, for all features
Information content
USA CAN ESP IND CHN BEL NOR NZL SAU UKR
1 1 0 0 0 0 0 0 0 0
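The binary encoding and information-content weighting described above can be sketched as follows. The country vocabulary matches the example row on the slide; the vendor data and the choice to weight each feature by log2(N / count) (its self-information under the empirical frequency) are assumptions for illustration.

```python
import math

COUNTRIES = ["USA", "CAN", "ESP", "IND", "CHN", "BEL", "NOR", "NZL", "SAU", "UKR"]


def binary_encode(locations, vocabulary=COUNTRIES):
    """1 if the vendor lists the location, else 0, in vocabulary order."""
    present = set(locations)
    return [1 if country in present else 0 for country in vocabulary]


def information_content(all_vendor_locations, vocabulary=COUNTRIES):
    """Self-information log2(N / count) per feature; unseen features get 0.0."""
    n = len(all_vendor_locations)
    weights = []
    for country in vocabulary:
        count = sum(1 for locs in all_vendor_locations if country in locs)
        weights.append(math.log2(n / count) if count else 0.0)
    return weights


# Hypothetical vendors and their shipping locations.
vendors = {
    "vendor_1": ["USA", "CAN"],
    "vendor_2": ["USA"],
    "vendor_3": ["USA", "ESP"],
    "vendor_4": ["USA", "CAN"],
}

weights = information_content(list(vendors.values()))
encoded = binary_encode(vendors["vendor_1"])
weighted = [bit * w for bit, w in zip(encoded, weights)]
print(encoded)
print(weighted)
```

A location shared by every vendor carries no information and gets weight 0, while rarer locations are weighted more heavily, so matching on an unusual location counts for more than matching on a common one.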

Multi-view Fusion-Canonical Correlation
Analysis
145
◉ Cannot simply concatenate since each vector
may correspond to different modalities (image vs
text) or very different distributional properties
◉ These views are fused using CCA [9] to obtain a
single representation, which we call Vendor
embedding
◉ Allows us to infer information from cross-covariance
matrices
◉ Employ an extension called weighted generalized
CCA.
[9] Harold Hotelling. 1992. Relations between two sets of variates. In Breakthroughs in statistics. Springer, 162–190.
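For reference, the standard two-view CCA objective is shown below; the symbols are generic (X and Y are two views, the Σ terms are their covariance and cross-covariance matrices) and are not taken from the slides. The weighted generalized CCA extension used for more than two views is not reproduced here.

```latex
(a^{*}, b^{*}) \;=\; \arg\max_{a,\, b}\;
\frac{a^{\top} \Sigma_{XY}\, b}
     {\sqrt{a^{\top} \Sigma_{XX}\, a}\;\sqrt{b^{\top} \Sigma_{YY}\, b}}
```

The projections of X onto a and of Y onto b are maximally correlated, which is why views with different modalities or scales can be fused in the projected space instead of being concatenated directly.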

Results
146
Performance metric of our model on different datasets
Highest average
accuracy across all
datasets

Results: Ablation Study
147
Performance metric of various models on all sites combined.
Best performance

Domain Speciﬁc Analysis
148
◉ Usage of multilingual and code-mixed text
◉ Use of slang terms across listings is captured by our model (e.g., horse for heroin)
◉ Lack of uniform features across websites adds noise to our model (product description
and rating data)
◉ Some vendors may operate from different locations or may even be selling
different drugs
◉ Branding (posting favorable reviews) is common in these markets

Use Case Examples
149
Case Studies @Vendor 1 @Vendor 2
Branding 5//02/14 09:49 am,5/Thanks alles
schick/11/10 01:46 pm, <END>Tilidin
50MG/4MG Original Apothekenware
5//02/14 09:49 am,5/Thanks alles schick/11/10
01:46 pm, <END>Tilidin 50 MG/4MG Original
Apothekenware <END> 5/Thanks alles
schick/11/10 01:46 pm,
Comparing product
Description and rating
since the vendor did not
enter product description
in other site.
Percocet Oxycodone 5/325 mg 200 Tablets
Finalize Early and get 20 Free bonus sent for a
total of 220!US Made Mallinckrodt 5mg/325
(made in St. Louis, Miss. USA) ...
5//02/07 01:03 pm,5/Thanks Again. A++/01/21
11:49 pm,5/Trustworthy/01/16 12:22
pm,4.33//01/07 08:50 am,5/Great
communication, trustworthy, and over
delivered./12/31 11:09 pm,5//11/29 03:25
pm,5/FAST A+++ Best Stealth I’ve seen yet.
Similar stylometric
Features captured by
the use of special
characters or emojis.
—————————————
—————————————
****** NEWS 25.12.2018 NEWS ******
—————————————
—————————————
We ship all new ...
—————————
—————————
PRODUCTS
—————————
—————————
AFGHAN HEROIN
A+++COCAINE #3 ...

Conclusion
150
◉ 98% ACCURACY: Developed Multi-view learning Sybil account detection on
the real-life Darknet market dataset achieving an accuracy of 98%.
◉ UNSUPERVISED LEARNING: Utilizing unlabelled data to train the network.
◉ DOMAIN ADAPTATION: Performed cross-domain analysis to justify uniform
result.
◉ KNOWLEDGE-INFUSED NLP: Proved the effectiveness of utilizing domain
speciﬁc knowledge graph of drug (DAO) in textual content understanding on
DarkNet.

References
● Ramnath Kumar, Shweta Yadav, Raminta Daniulaityte, Francois Lamy, Krishnaprasad Thirunarayan, Usha Lokala, and Amit
Sheth. "eDarkFind: Unsupervised Multi-view Learning for Sybil Account Detection." The Web Conference. 2020.
● Zhang, Yiming, et al. "Your style your identity: Leveraging writing and photography styles for drug traﬃcker identiﬁcation in
darknet markets over attributed heterogeneous information network." The World Wide Web Conference. 2019.
● Delroy Cameron, Gary A Smith, Raminta Daniulaityte, Amit P Sheth, Drashti Dave, Lu Chen, Gaurish Anand, Robert Carlson, Kera
Z Watkins, and Russel Falck. 2013. PREDOSE: a semantic web platform for drug abuse epidemiology using social media.
Journal of biomedical informatics 46, 6 (2013), 985–997
● Usha Lokala, Francois R Lamy, Raminta Daniulaityte, Amit Sheth, Ramzi W Nahhas, Jason I Roden, Shweta Yadav, and Robert G
Carlson. 2019. Global trends, local harms: availability of fentanyl-type drugs on the dark web and accidental overdoses in Ohio.
Computational and Mathematical Organization Theory 25, 1 (2019), 48–59.
● Xiangwen Wang, Peng Peng, Chun Wang, and Gang Wang. 2018. You are your photographs: Detecting multiple identities of
vendors in the darknet marketplaces. In Proceedings of the 2018 on Asia Conference on Computer and Communications
Security. ACM, 431–442.
151

Ongoing Research at #AIISC
152
Detection of Early Onset of Colorectal Cancer using
Digestive Inﬂammation Index
Conversational Systems for Nutrition Monitoring of High
School Children
Cyber Social Threats
Conversational Systems for pediatric patients with Neutropenia,
asthma in children, and obesity and hypertension in adults.
Development of an Instrumented, Intelligent Infant Interaction
Laboratory for the Prediction of Autism Spectrum Disorder
Current Collaboration across UofSC:
● College of Medicine (>5)
● College of Nursing (2)
● College of Arts & Science (2)
● College of Pharmacy (2)
● College of Information &
Communication
● College of Engineering &
Computing
● College of Education

AIISC and Collaborators
153
5 faculty, >12 PhDs, few Masters, >5
undergrads, 2 Post-Docs, >10 Research
Interns
Alumni in/as
Industry: IBM T.J. Watson, Almaden, Amazon, Samsung
America, LinkedIn, Facebook, Bosch
Start-ups: AppZen, AnalyticsFox, Cognovi Labs
Faculty: George Mason, University of Kentucky, Case
Western Reserve, North Carolina State University,
University of Dayton

http://aiisc.ai/
We acknowledge partial support from the National Science Foundation (NSF) award
CNS-1513721: “Context-Aware Harassment Detection on Social Media", National
Institutes of Health (NIH) award: MH105384-01A1: “Modeling Social Behavior for
Healthcare Utilization in Depression", and National Institute on Drug Abuse (NIDA)
Grant No. 5R01DA039454-02 “Trending: Social media analysis to monitor cannabis
and synthetic cannabinoid use”. Any opinions, conclusions or recommendations
expressed in this material are those of the authors and do not necessarily reﬂect the
views of the NSF, NIH, or NIDA.
