ACM Hypertext and Social Media Conference Tutorial on Knowledge-infused Deep Learning

We hosted a tutorial on Knowledge-infused Deep Learning at the 31st ACM Hypertext Conference on July 14. Broadly, the tutorial covered applications of broad-based and community-based Knowledge Graphs in education, clinical and social-media healthcare, the pandemic, and cryptomarkets.
We theorized the concept of knowledge infusion and showed its importance for gaining explainability and performance. We extended the idea of "Knowledge-infused Deep Learning" to autonomous driving, cyber social harms, and the dark web.
The tutorial presentation, with relevant resources and references, is available online at http://kidl2020.aiisc.ai.


  1. 1. Knowledge-infused Deep Learning Artificial Intelligence Institute Tutorial Amit Sheth, Manas Gaur, Ugur Kursuncu, Ruwan Wickramarachchi, Shweta Yadav Artificial Intelligence Institute, University of South Carolina, USA Check Tutorial site for latest slides: http://kidl2020.aiisc.ai/
  2. 2. Tutorial Thesis 2 Broad Vision How do you make a system more intelligent? Without Domain Knowledge With Domain Knowledge Motivational Interviewing
  3. 3. 3 How to gain deep understanding of the content? Tutorial Thesis 3 agitation nervous panicky >Millions Social Media Deep Clustering Neural Parsing Repeated panic attacks agitation nervous panicky Repeated panic attacks anxiety KG Sleep Disorder Circadian Rhythm Disorder Context understanding Shallow Semantics Deeper Semantics [Lin 2020, Kitaev 2018] https://github.com/facebookresearch/deepcluster [Gaur 2020]
  4. 4. Knowledge in computing. The role of knowledge in computing has long been recognized, at least since Vannevar Bush's seminal 1945 piece, "As We May Think". Knowledge can: enhance (semantic) applications such as search, browsing, personalization, recommendation, advertisement, and summarization; improve integration of data, including data of diverse modalities and from diverse sources; empower and enhance ML and NLP techniques; serve as a knowledge transfer mechanism across domains and between humans and machines; improve automation and support intelligent human-like behavior and activities that may involve conversations, question answering, and robots. Timeline: ~2000 to ~2025. Focus: from small data to big data. Data alone is not enough [Domingos 2012]. Knowledge will propel machine understanding of content [Sheth, et al. 2017].
  5. 5. Tutorial Thesis 5 Interpretability + Traceability → Explainability Ethics, Bias, and False Alarms Deeper Understanding of Content including Context Understanding What is the right knowledge graph to use? [Semantic, Cognitive, Perceptual Computing, Sheth 2015] Structured and Unstructured Data Models Knowledge Graph/Base Compute Application/ Workflow David Cox Talk: Neurosymbolic AI F. Lécué: On the Role of Knowledge Graphs in Explainable AI A Machine Learning Perspective
  6. 6. About the Tutorial: All About the Knowledge Graphs; Knowledge-infused Deep Learning; Knowledge-infusion: Cyber Social Threats; Knowledge-infusion: Autonomous Driving; Knowledge-infusion: DarkNet. HT-2020 Tutorial: A. Sheth, M. Gaur, U. Kursuncu, R. Wickramarachchi, & S. Yadav, Knowledge-infused Deep Learning
  7. 7. All About Knowledge Graphs Artificial Intelligence Institute Amit Sheth amit@sc.edu @amit_p
  8. 8. Why now? 8
  9. 9. Definition: A Knowledge Graph (KG) is structured knowledge in a graph representation (in many cases a labeled property graph, RDF, or one of its variants). We cannot escape the classic expressivity-computability trade-off. The community is still debating the exact definition. Key differentiator: relationships ("relationships at the heart of semantics"). Different/related forms: ● Ontology: knowledge graph after human curation of entities and relations; "ontological commitment", richer KR ● Knowledge Base: flattened graph ● Lexicons: small application-specific flattened graph ● Knowledge Networks (KN): integrate and combine knowledge (usually captured as KGs) to serve a network (community); could draw from and serve multiple domains. Knowledge Graphs and Knowledge Networks: The Story in Brief
  10. 10. Expressiveness Range: Knowledge Representation and Ontologies Catalog/ID General Logical constraints Terms/ glossary Thesauri “narrower term” relation Formal is-a Frames (properties) Informal is-a Formal instance Value Restriction Disjointness, Inverse, part of… Simple Taxonomies Expressive Ontologies Wordnet CYC RDF DAML OO DB Schema RDFS IEEE SUOOWL UMLS GO GlycOSWETO Pharma Ontology Dimensions After McGuinness and Finin KEGG TAMBIS BioPAX EcoCyc
  11. 11. Ontology Examples Commonsense Reasoning Graph Drug Abuse Ontology Event Ontology Crisis Ontology
  12. 12. Creation & Use of Knowledge ~2000 First commercial semantic search/browsing/… on the Web and for the content on the Web using KG. Term used for KR: WorldModel, Ontology
  13. 13. Proliferation Broad-based & Domain-Specific KGs 13 Examples of General Purpose Knowledge Graphs 1. DBpedia [Auer 2007, Lehmann 2015] 2. Yago [Rebele 2016] 3. Freebase [Bollacker 2008] 4. ConceptNet [Speer 2017] 5. Knowledge Vault [Dong 2014] 6. NELL [Mitchell 2018] 7. Wikidata [Vrandečić 2014] Example of Healthcare-specific Knowledge Graphs 1. SNOMED-CT [ACL Chang 2020] 2. Unified Medical Language System (UMLS) [Yip 2019] 3. DataMed [JAMIA Chen 2018] 4. International Classification of Diseases (ICD-10) [JAMIA Choi 2016] 5. DrugBank, Rx-NORM and MedDRA [ BMC Celebi 2019] 6. Drug Abuse Ontology [BMI Cameron 2013] Many are also community-developed.
  14. 14. Variety of Sources for Large-scale KG and in Different Representation: Linked Open Data (LOD); Schema.org (schema.org); Data Commons Knowledge Graph (DCKG) (datacommons.org); Wikidata (wikidata.org). https://github.com/datacommonsorg/api-python https://dumps.wikimedia.org/wikidatawiki/entities/ https://lov4iot.appspot.com/ https://github.com/schemaorg/schemaorg
  15. 15. Domain-specific knowledge extraction from LOD Linked Open Data (LOD) Book related information? Filter relevant datasets Extract relevant portion of a data set Project Gutenberg DBpedia DBTropes Books, Countries, Drugs Books, movie, games Books Book specific DBpedia Book specific DBTropes Lalithsena, Sarasi, et al. "Automatic domain identification for linked open data." Web Intelligence (WI) and Intelligent Agent Technologies (IAT), 2013 IEEE/WIC/ACM International Joint Conferences on. Vol. 1. IEEE, 2013. Sarasi Lalithsena, Pavan Kapanipathi and Amit Sheth "Harnessing relationships for domain-specific subgraph extraction: A recommendation use case." Big Data (Big Data), 2016 IEEE International Conference on. IEEE, 2016.
  16. 16. Enterprise Knowledge Graphs are also very popular 16 KG enabled Web and Enterprise Applications: Google, Amazon, Microsoft, Siemens, LinkedIn, Airbnb, eBay, and Apple, as well as smaller companies (e.g. ezDI, Franz, Metaphactory/ Metaphacts, Semantic Web Company, Mondeca, Stardog, Diffbot, Siren). Enterprise KG development service is also available. (Maana). Industry-Scale Knowledge Graphs: Lessons and Challenges (Communications of the ACM, August 2019)
  17. 17. Why Knowledge Graphs: Shortcomings of Deep Learning (DL). Trivial case for classification. Text: "I sometimes wonder how many alcoholics are relapsing under the lockdowns (former alcoholic)." Question: Does the person have an addiction? Answer: Yes. Non-trivial case. Text: "Then others that insisted that what I have is depression even though manic episodes aren't characteristic to depression. I dread having to retread all this again because the clinic where I get my mental health addressed is closing down due to loss in business caused by the pandemic." Question: Does the person suffer from depression? Model answer: Yes. Correct answer: No. Disjunctive questions. Question: Are you feeling nervous or anxious or on edge? Question: Is the feeling of restlessness due to stress or anxiety? Question: Does an employee own a company or work for a company? Research in this direction: Query2Box and multi-hop reasoning [Ren ICLR 2020, Lin EMNLP 2018]. Labels on the examples: Covid context, generic context. Bottom line: most state-of-the-art deep learning approaches are not integrated with prior knowledge. This tutorial is about strategies for doing so. [David Cox] [Marcus 2018] https://www.digitaltrends.com/cool-tech/neuro-symbolic-ai-the-future/
  18. 18. Why Knowledge Graphs: Shortcomings of DL ● Graph Convolutional Neural Networks (GCN) are blind to relation types. For example: <shelter-in-place causes anxiety> and <shelter-in-place prevents anxiety> have similar representations in GCN. ● Deep Clustering over unlabeled data exploits the inherent latent semantics to generate diverse and cohesive clusters. But, interpretability of the clusters requires Knowledge Graphs. 18 ODKG: Opioid Drug Knowledge Graph [Kamdar 2019]
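The first point on this slide can be illustrated with a toy example. The sketch below assumes a single mean-aggregation layer with no learned weights as a stand-in for a plain GCN, and a hypothetical per-relation scaling (loosely in the spirit of relation-aware graph networks) for contrast. The feature vectors and relation weights are made-up values, not taken from the tutorial.

```python
# Toy illustration: a plain GCN-style layer aggregates neighbours without
# looking at edge labels, so "causes" and "prevents" edges give the same
# node representation. All numbers below are made-up illustrative values.

FEATURES = {
    "shelter-in-place": [1.0, 0.0],
    "anxiety": [0.0, 1.0],
}

# Hypothetical per-relation scaling, used only by the relation-aware variant.
RELATION_WEIGHT = {"causes": 1.0, "prevents": -1.0}


def gcn_layer(node, edges):
    """Mean of the node's own features and its neighbours' (edge labels ignored)."""
    vectors = [FEATURES[node]]
    vectors += [FEATURES[dst] for src, _rel, dst in edges if src == node]
    dim = len(FEATURES[node])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def relation_aware_layer(node, edges):
    """Same aggregation, but each neighbour is scaled by its relation's weight."""
    vectors = [FEATURES[node]]
    for src, rel, dst in edges:
        if src == node:
            vectors.append([RELATION_WEIGHT[rel] * x for x in FEATURES[dst]])
    dim = len(FEATURES[node])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


causes = [("shelter-in-place", "causes", "anxiety")]
prevents = [("shelter-in-place", "prevents", "anxiety")]

# The plain layer cannot tell the two triples apart ...
same = gcn_layer("shelter-in-place", causes) == gcn_layer("shelter-in-place", prevents)
# ... while the relation-aware layer produces different representations.
differs = relation_aware_layer("shelter-in-place", causes) != relation_aware_layer(
    "shelter-in-place", prevents
)
```

In this simplified setting the two representations are exactly equal rather than merely similar, because nothing else in the toy graph distinguishes the two triples.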
  19. 19. Why Knowledge Graphs : NLP/NLU Challenges 19 ● Natural Language Processing Challenges: ○ How do you learn quickly from small amount of data? ○ How do you mine (varied) relationships from existing text? ○ How do you reliably classify entities into known ontology? ○ Better contextualization of words ● Natural Language Understanding Challenges: ○ Query Interpretation or Understanding the user question ○ Answering the question with Trust and Transparency ○ How to measure “reasonability” and “meaningfulness” of the response to a question? ○ How much context is needed to provide a precise response? [Stanford Knowledge Graph Seminar 2020, Amit Prakash , Leilani Gilpin]
  20. 20. 20 [Image from Talukdar] KG in Conversational AI ● Get same/similar answers based on trusted knowledge ● Personalization ● Contextualization
  21. 21. Personalization: taking into account the contextual factors such as user’s health history, physical characteristics, environmental factors, activity, and lifestyle. Chatbot with contextualized (e.g asthma) knowledge is potentially more personalized and engaging. Without Contextualized Personalization With Contextualized Personalization KG in Conversational AI
  22. 22. 22 How do we use Knowledge Graphs? Health Knowledge Graph [Shah and Sheth US patent 2015]
  23. 23. 23 How do we use Knowledge Graphs?
  24. 24. How do we use Knowledge Graphs? Analyzing Gender-based Violence (GBV) in Mental Health COVID-19 Twitter Conversation. Pipeline elements: mental health tweets from March 14-April 04; GBV lexicon from tweets on bullying, abuse, domestic violence, etc.; mapping words to categories for expansion of the lexicon; generic Knowledge Graph of Wikipedia; aligning the lexicon words and new entities with respect to DBpedia categories; enriched lexicon for gathering the abstract meaning of GBV in tweets; calculating cosine similarity between two vectors (GBV and tweets) and setting an empirical threshold on semantic proximity; Maximum A Posteriori (MAP) estimation; GBV estimation for 14 days; semantic proximity GBV index. Purohit, Hemant, Tanvi Banerjee, Andrew Hampton, Valerie L. Shalin, Nayanesh Bhandutia, and Amit P. Sheth. "Gender-based violence in 140 characters or fewer: A #BigData case study of Twitter." arXiv preprint arXiv:1503.02086 (2015). Psychidemic: https://www.youtube.com/watch?v=XzYrn0PEzNk
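The cosine-similarity-with-threshold step on this slide can be sketched as follows. This is a minimal illustration under stated assumptions: bag-of-words counts stand in for whatever vectors the authors used, and the lexicon terms, the threshold value, and the function names are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical stand-ins: the real lexicon and threshold are not given in the slides.
GBV_LEXICON = "abuse bullying violence harassment"
THRESHOLD = 0.2


def bag_of_words(text):
    """Lower-cased whitespace tokens with their counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_gbv_related(tweet):
    """Flag a tweet whose semantic proximity to the lexicon meets the threshold."""
    score = cosine_similarity(bag_of_words(GBV_LEXICON), bag_of_words(tweet))
    return score >= THRESHOLD


flagged = is_gbv_related("lockdown has increased domestic violence and abuse")
ignored = is_gbv_related("nice weather today")
```

With dense embeddings in place of counts, only `bag_of_words` would change; the similarity and thresholding logic stays the same.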
  25. 25. Assessing Mental Health Impact of COVID using News Articles How do we use Knowledge Graphs? https://theconversation.com/were-measuring-online-conversation-to-track-the-social-and-mental-health-issues-surfacing-during-the-coronavirus-pandemic-135417 Multilingual KG http://conceptnet.io/ GDelt Database https://www.gdeltproject.org/
  26. 26. 26 Understanding City Traffic Events: Role of KG in analyzing multimodal data Anantharam, Pramod, Payam Barnaghi, Krishnaprasad Thirunarayan, and Amit Sheth. "Extracting city traffic events from social streams." ACM Transactions on Intelligent Systems and Technology (TIST) 6, no. 4 (2015): 1-27.
  27. 27. 27 Drug Abuse Ontology in PREDOSE owl:thing prescription _drug_ brand_name brandname_ undeclared brandname_ composite prescription _drug monograph_ ix_class cpnum_ group prescription _drug_ property indication_ property formulary_ property non_drug_ reactant interaction_ property property formulary brandname_i ndividual interaction_ with_prescri ption_drug interaction indication generic_ individual prescription _drug_ generic generic_ composite interaction_ with_non_ drug_reactant interaction_with_mo nograph_ix_class
  28. 28. 28 PREDOSE Cameron, Delroy, Gary A. Smith, Raminta Daniulaityte, Amit P. Sheth, Drashti Dave, Lu Chen, Gaurish Anand, Robert Carlson, Kera Z. Watkins, and Russel Falck. "PREDOSE: a semantic web platform for drug abuse epidemiology using social media." Journal of biomedical informatics 46, no. 6 (2013): 985-997.
  29. 29. 29 Knowledge Graph in Education Educational Knowledge Base Enable expert system to answer questions like: 1. What to study when there is less time? 2. How to set a good question paper? 3. How to cover-up learning gaps from previous years? 4. How to connect 8th grade with 10th grade science? Mentor Intelligence mimics teacher thinking Intelligent Content Authoring and Curation 1. Granularity 2. Personalization (student or institution) 3. Robustness 4. Interventional: diagnosis and remedy
  30. 30. 30 Named Entity Recognition Relationship Extraction Entity Linking Implicit information extraction Implicit Entity Linking using KG and Conditional Random Fields Perera, Sujan, Pablo N. Mendes, Adarsh Alex, Amit P. Sheth, and Krishnaprasad Thirunarayan. "Implicit entity linking in tweets." In European Semantic Web Conference, pp. 118-132. Springer, Cham, 2016.
  31. 31. 31 Experiences or Factual Knowledge Abstract Knowledge 1. Continuum of Knowledge 2. relationship mapping: NLU through and knowledge transfer across domains Analogical Generalization Applicable to new situation via analogy [Forbus and Gentner 1997, Gentner and Medina 1998] Mapping between Enzyme Kinetics and Musical Chairs [Ongoing] Mapping between two Conceptual Frames in similar domain (Physics)
  32. 32. ML and Knowledge Graphs: Pipeline 32 Knowledge Extraction Knowledge Alignment Knowledge Cleaning Knowledge Mining & Knowledge-based QA Data Extraction (NLP, Web) Wrapper Induction (DB, DM-Data Mining) Web Tables (DB) Text Mining (DM) Entity and Relationship Linking [Perera 2016] Schema Mapping and Ontology Mapping [Jain 2010] Universal Schema [Sheth 1990] Data Cleaning [Jadhav 2016] Anomaly Detection [Anantharam 2012, 2016] Knowledge Fusion [Sheth 2020, Kapanipathi 2020, Gaur 2018, Kursuncu 2020] Graph Mining [Lalithsena 2016, 2017, 2018] Knowledge Embedding [Wickramarachchi 2020, Gaur 2018] Search [Sheth 2003, Cheekula 2015, Kho 2019] QA [Alambo 2019, Shekarpour 2017] [Stanford Knowledge Graph Seminar 2020, Luna Dong]
  33. 33. 33 More Applications and Domains that use KG+DL Pharmacy [Futia 2020, Gentile 2019] Personalized mHealth [ Sheth 2017, 2018a, 2018b, 2019] Public Health [Yazdavar 2017, Gaur 2018, Daniulaityte 2016] Question Answering/ Dialog System [Alambo 2019, Shekarpour 2017, 2015 ] Hypothesis Generation to find association between Stress and Colorectal cancer [ Cameron 2015] Chatbot with contextualized (e.g asthma) knowledge is potentially more personalized and engaging.
  34. 34. 34 Knowledge-infused Learning Methods (Internet Computing’19, AAAI’20, CIKM’18, WWW’19, NAACL’18, ACL’17, ….) Where are we [Stanford Knowledge Graph Seminar 2020, Christopher Re]
  35. 35. 35 What is Knowledge infusion? Why do we need it? What is Knowledge-infused Learning? What are the different types of Knowledge-infused Learning? How can Knowledge-infused Learning provide solutions to complex problems: Unstructured Healthcare on Social Media Radicalization on Social Media Autonomous Driving Vehicles Drug Trafficking in Cryptomarkets Questions we address next
  36. 36. 36 Vision: KG as Glue in Developing Hybrid AI Systems STATISTICAL AI CONNECTIONIST “Unreasonable effectiveness of big data” in machine processing & powering bottom up processing “Unreasonable effectiveness of small data” in human decision making - can this be emulated to power top down processing? SYMBOLIC AI FORMAL KG will play an increasing role in developing hybrid neuro-symbolic systems (that is bottom-up deep learning with top-down symbolic computing) as well as in building explainable AI systems for which KGs will provide scaffolding for punctuating neural computing. Cognitive Science Analogy: Combining Top Brain - Bottom Brain Processes.
  37. 37. Knowledge-infused Deep Learning (KiDL) Artificial Intelligence Institute Manas Gaur mgaur@email.sc.edu @manasgaur90
  38. 38. Why Knowledge-infused Deep Learning? ● Online healthcare communications are ambiguous, and discriminative features are difficult to engineer. ● Domain-specific embedding models provide only a shallow infusion of knowledge. ● Decrease the dependence on large datasets ● Reduce bias in the dataset (i.e., potentially avoid social discrimination and unfair treatment) ● Provide information provenance, allowing explainability of a model ● Improve information coverage specific to a domain that would be missed otherwise ● Reduce time and space complexity of the model's architecture ● Improve the model's sensitivity and specificity ● Explainability
  39. 39. 39 Deep NLP Requires Background Knowledge An excessive endogenous or exogenous stimulations by estrogen induces adenomatous hyperplasia of the endometrium ● adenomatous modifies hyperplasia ● An excessive endogenous or exogenous stimulations modifies estrogen ● “adenomatous hyperplasia” and “endometrium” occurs as “adenomatous hyperplasia of the endometrium” MeSH Terms in PubMed Articles [Ramakrishnan 2008] [Gaur 2018, 2019, Limsopatham 2016 ]
  40. 40. 40 Gkotsis, George, Anika Oellrich, Tim Hubbard, Richard Dobson, Maria Liakata, Sumithra Velupillai, and Rina Dutta. "The language of mental health problems in social media." In Proceedings of the Third Workshop on Computational Linguistics and Clinical Psychology, 2016. NLP Requires Background Knowledge
  41. 41. Questions: How do you know that a training set has good domain coverage? How do you ensure consistency of labeling (interpretation), especially when the label is not binary? Do labels represent adequate semantics (e.g., number of alternatives)? Do they have adequate domain knowledge?
  42. 42. 42 Weak → Distant → Knowledge-infused
  43. 43. 43 Weak Supervision using SNORKEL https://www.snorkel.org/ [Ratner 2017, Bach 2019]
  44. 44. 44 SEMANTIC & KNOWLEDGE GRAPH Top-down symbolic approach (concepts, rules) in data reasoning, inferencing, and deduction. MACHINE/ DEEP LEARNING Bottom-up statistical approach in searching, analyzing and deriving insights from Big Data. In Abstract Sense
  45. 45. 45 Theoretically Why KiDL: Probably Approximately Correct Learning Valiant, Leslie G. "Robust logics." Artificial Intelligence 117.2 (2000): 231-253.
  46. 46. Theoretically Why KiDL: Probably Approximately Correct Learning. How do you know that a training set has good domain coverage? Robust classifier → low generalization error. Consistent classifier → low training error. Confidence: more certainty (lower δ) requires more samples. Complexity: a more complicated hypothesis space (larger |H|) requires more samples.
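The dependence of sample size on δ and |H| can be made concrete with the standard PAC bound for a consistent learner over a finite hypothesis class, m ≥ (1/ε)(ln|H| + ln(1/δ)). The sketch below assumes that textbook bound; the slides do not state which bound they use, and the function name and example numbers are illustrative.

```python
import math


def pac_sample_bound(epsilon, delta, hypothesis_count):
    """Samples sufficient for a consistent learner over a finite hypothesis
    class to reach error <= epsilon with probability >= 1 - delta:
    m >= (ln|H| + ln(1/delta)) / epsilon.
    """
    if not (0 < epsilon < 1 and 0 < delta < 1 and hypothesis_count >= 1):
        raise ValueError("need 0 < epsilon, delta < 1 and |H| >= 1")
    return math.ceil((math.log(hypothesis_count) + math.log(1 / delta)) / epsilon)


baseline = pac_sample_bound(epsilon=0.1, delta=0.05, hypothesis_count=1000)
more_certain = pac_sample_bound(epsilon=0.1, delta=0.01, hypothesis_count=1000)
more_complex = pac_sample_bound(epsilon=0.1, delta=0.05, hypothesis_count=10**6)
```

Both `more_certain` and `more_complex` exceed `baseline`, matching the two statements on the slide. One way to read the case for knowledge infusion is that prior knowledge which rules out hypotheses shrinks |H| and therefore the number of samples needed.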
  47. 47. 47 K-IL: “The exploitation of domain knowledge and application semantics to enhance existing deep learning methods by infusing relevant conceptual information into a statistical, data-driven computational approach (Neuro-Symbolic AI).” A. Sheth, M. Gaur, U. Kursuncu and R. Wickramarachchi, "Shades of Knowledge-Infused Learning for Enhancing Deep Learning," in IEEE Internet Computing, vol. 23, no. 6, pp. 54-63, 1 Nov.-Dec. 2019, doi: 10.1109/MIC.2019.2960071.
  48. 48. Taxonomy of Knowledge Infusion. SHALLOW Infusion: [...] of knowledge graphs to improve the semantic and conceptual processing of data. SEMI-DEEP Infusion: deeper and congruent incorporation or integration of the knowledge graphs in the learning techniques. DEEP Infusion (part of future KG strategy): combines statistical AI (bottom-up) and symbolic AI (top-down) learning techniques for hybrid and integrated intelligent systems.
  49. 49. Shallow Infusion. Shallow external knowledge is described as those forms of information that are extracted from text based on some heuristics, often designed for task-specific problems: ○ Bag of Words/Phrases from Corpus [Hagoort 2004, Zhang 2019, Sun 2019] ○ Bag of Words/Phrases from Semantic Lexicons [Faruqui 2014, Mrkšić 2016] ○ Count of Nouns, Pronouns, Verbs [Gkotsis 2017, 2016] ○ Sentiment and Emotions of the sentence [Gaur 2019, Vedula 2017, Kursuncu 2019] ○ Latent topics describing the documents [Jiang 2016, Li 2016, Meng 2020] ○ Label assignment to words or phrases in a sentence (Semantic Role Labeling), e.g., "Mary sold the book to John": Mary (Agent), sold (Predicate), the book (Theme), John (Recipient)
  50. 50. 50 Knowledge: Domain specific large corpora Knowledge: pre-trained embeddings + semantic lexicons Knowledge: Domain specific large corpora Word2Vec Retrofitting BERT “Context is represented by a set of words for a given target word” “Learned embeddings are further enriched by using semantic lexicons” “Uses language modeling objective to learn the contextual representations” Examples of Shallow Infusion
  51. 51. 51 Chronological arrangement of shallow Infusion techniques From NLP domain Shallow Infusion of Knowledge in Deep Learning: In Brief
  52. 52. 52 Shallow Infusion: Retrofitting Example 52 damage Infrastructure affected population damage Infrastructure affected population Vector representation of words in Tweets before retrofitting Vector representation of words in Tweets after retrofitting MOAC Ontology Empathi ontology Disaster Ontology DBpedia Gaur, Manas, et al. "empathi: An ontology for Emergency Managing and Planning about Hazard Crisis." 2019 IEEE 13th International Conference on Semantic Computing (ICSC). IEEE, 2019.
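The before/after picture on this slide can be reproduced in miniature. The sketch below is a pure-Python version of the iterative retrofitting update of Faruqui et al., assuming the commonly used settings α = 1 for every word and β = 1/degree for each lexicon neighbour. The three toy vectors and the two-entry lexicon are made up for illustration and are not the tutorial's data or ontologies.

```python
import math


def retrofit(vectors, lexicon, iterations=10):
    """Pull each word vector toward its lexicon neighbours while staying close
    to the original vector (alpha = 1, beta = 1 / number of neighbours)."""
    new = {word: list(vec) for word, vec in vectors.items()}
    for _ in range(iterations):
        for word, neighbours in lexicon.items():
            neighbours = [n for n in neighbours if n in new]
            if word not in new or not neighbours:
                continue
            beta = 1.0 / len(neighbours)
            dim = len(vectors[word])
            # (alpha * original + sum(beta * neighbour)) / (alpha + sum(beta)),
            # where alpha = 1 and the betas sum to 1, so the denominator is 2.
            new[word] = [
                (vectors[word][i] + beta * sum(new[n][i] for n in neighbours)) / 2.0
                for i in range(dim)
            ]
    return new


def distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Made-up 2-d embeddings and a made-up lexicon linking two related terms.
original = {
    "damage": [1.0, 0.0],
    "infrastructure": [0.0, 1.0],
    "population": [-1.0, 0.0],
}
lexicon = {"damage": ["infrastructure"], "infrastructure": ["damage"]}

fitted = retrofit(original, lexicon)
before = distance(original["damage"], original["infrastructure"])
after = distance(fitted["damage"], fitted["infrastructure"])
```

Words linked in the lexicon move closer together (`after < before`), while a word with no lexicon entry, such as `population` here, keeps its original vector.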
  53. 53. 53 SuicideWatch Subreddit (93K Users) NYC CDRN EHR (123K patients) Data specific to Mental Health Medical Knowledge Bases We identified self-harm, depressive feelings, and suicide ideations as latent topics expressed in Reddit and EHR data. Both sources did not provide evidence of mentions or expressions of impulsivity, family violence, and drug abuse. Shallow Infusion: Association between Social Media and EHR in Suicide-related Communications [Gaur, Psychiatry Under Review 2020]
  54. 54. 54 Semi-Deep Infusion In semi-deep infusion, external knowledge is involved through attention mechanism or learnable knowledge constraints acting as a sentinel to guide model learning. ➢ External Knowledge through Attention ➢ External Knowledge through Learnable Constraints
  55. 55. 55 Tacit Knowledge Self-aware or External Knowledge Similarity based verification Semi-Deep Infusion Dataset Deep Learning Model Dataset enrich Deep Learning Model Tacit Knowledge Hypothesis testing or similarity-based verification Shallow Infusion Self-aware or External Knowledge Comparing Semi-Deep Infusion with Shallow Infusion Sheth, Amit, Manas Gaur, Ugur Kursuncu, and Ruwan Wickramarachchi. "Shades of Knowledge-Infused Learning for Enhancing Deep Learning." IEEE Internet Computing 23, no. 6 (2019): 54-63.
  56. 56. External Knowledge through Attention. A neural attention mechanism equips a neural network with the ability to focus on a subset of its inputs (or features): ○ Hard attention or position-specific attention: the locations of important entities and relationships in the text are hard-coded in the model. This makes feature engineering efficient; however, the model suffers from exposure bias. ○ Soft attention: the model learns to attend to specific parts of the text while generating the word describing that part (following distributional semantics). ○ Attention with a knowledge base: background knowledge is integrated using an attention mechanism, which decides whether to attend to background knowledge and which information from the KB is useful.
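Attention with a knowledge base can be sketched in a few lines. The example below assumes dot-product scoring and a single extra "sentinel" vector that competes with the KB entries, so that weight on the sentinel corresponds to ignoring background knowledge. This is an illustrative simplification, not the exact formulation of any cited model, and all vectors are made-up values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend_to_knowledge(hidden, kb_vectors, sentinel):
    """Soft attention over KB entries plus a sentinel.

    Returns the attention weights (last weight belongs to the sentinel) and
    the weighted mix of candidates that would be fed back into the model.
    """
    candidates = kb_vectors + [sentinel]
    weights = softmax([dot(hidden, c) for c in candidates])
    dim = len(hidden)
    context = [sum(w * c[i] for w, c in zip(weights, candidates)) for i in range(dim)]
    return weights, context


# Made-up hidden state and two made-up KB concept embeddings.
hidden_state = [1.0, 0.0]
kb_entries = [[2.0, 0.0], [0.0, 2.0]]
sentinel_vector = [0.0, 0.0]

weights, context = attend_to_knowledge(hidden_state, kb_entries, sentinel_vector)
```

Here the first KB entry is aligned with the hidden state and receives the largest weight; in a trained model the scoring function and the sentinel would be learned rather than fixed.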
  57. 57. External Knowledge through Learnable Constraints ● Learnable constraints are empirical thresholds (probabilistic values) learnt by the model, which allow it to learn adaptively. ● This can be done in the following ways: ○ Learning based on pre-structured axiomatic rules - axiomatic knowledge ○ Learning based on difference in content similarity - KL Divergence, Cross-entropy loss ○ Learning based on commonsense knowledge - ConceptNet ○ Learning over different permutations of text generated through synonyms, antonyms, and homonyms.
  58. 58. 58 Methods for Semi-Deep Infusion
  59. 59. 59 _______ meant to ______ not to ______ Template: fill in the blanks It was meant to dazzle not to make sense Target: Generative Model It was meant to dazzle not to make it Infilling Content Matching through averaged KL Divergence Learnable knowledge constraint module Learnable Constraints Hu, Zhiting, Zichao Yang, Russ R. Salakhutdinov, Lianhui Qin, Xiaodan Liang, Haoye Dong, and Eric P. Xing. "Deep generative models with learnable knowledge constraints." In Advances in Neural Information Processing Systems, pp. 10501-10512. 2018. Replace the sentence with KG or Resource
  60. 60. Semi-Deep Infusion : KG GANs Generative Adversarial Network* *Chang, Che-Han, Chun-Hsien Yu, Szu-Ying Chen, and Edward Y. Chang. "KG-GAN: Knowledge-Guided Generative Adversarial Networks." arXiv preprint arXiv:1905.12261 (2019). Seen Category Data UnSeen Category Data Generator (G1 ) Generator (G2 ) Z1 Z2 Real Data Fake Data (G1 ) Fake Data (G2 ) Discriminator (D) Embedding Regression Network Semantic Embedding of Unseen Category Prediction (G2 ) Prediction (G1 ) ≅ Parameter Sharing Loss (G1 ) Loss (G2 ) Real or Fake Objective Function
  61. 61. Semi-Deep Infusion: LSTMs. Variants: 1. Knowledge base at each LSTM cell [1]. 2. K-IL layer [2]: a. 1D Convolutional Neural Network for mixing b. Graph Convolutional Neural Network, when the hierarchical structure of the KG is important and needs to be preserved in the representation c. Simple Multi-layer Perceptron. [1] Yang, Bishan, and Tom Mitchell. Leveraging knowledge bases in LSTMs for improving machine reading. arXiv preprint arXiv:1902.09091 (2019). [2] Ugur Kursuncu, Manas Gaur, and Amit Sheth. Knowledge Infused Learning (K-IL): Towards Deep Incorporation of Knowledge in Deep Learning. Proceedings of the AAAI 2020 Spring Symposium on Combining Machine Learning and Knowledge Engineering in Practice (AAAI-MAKE 2020).
  62. 62. 62 Deep Infusion (Vision) Ugur Kursuncu, Manas Gaur, and Amit Sheth. Knowledge Infused Learning (K-IL): Towards Deep Incorporation of Knowledge in Deep Learning. Proceedings of the AAAI 2020 Spring Symposium on Combining Machine Learning and Knowledge Engineering in Practice (AAAI-MAKE 2020).
  63. 63. K-IL: Objective Functions and Evaluation. Kullback-Leibler Divergence: ● Measures the information loss during the learning phase between latent/hidden states and KGs ● KG embeddings: TransE, HolE, etc. ● Models: Variational Autoencoders, LSTMs, GANs, Siamese Neural Networks ● Frameworks: Zero-Shot Learning, One-Shot Learning, Transfer Learning, Parameter Sharing ● Other variants: Jensen-Shannon Divergence, Regularization, Integer Linear Programming. Kosheleva, Olga, and Vladik Kreinovich. "Why deep learning methods use KL divergence instead of least squares: a possible pedagogical explanation." Математические структуры и моделирование 2 (46) (2018). Evaluation: before and after knowledge infusion. Methods (apart from Precision, Recall, F1-score): ● Fréchet Inception Distance: measure of similarity between two datasets (KG and training data) ● Statistical significance hypothesis testing ● Word and concept features ● t-SNE visualization of clusters ● Area under perturbation curve: feature ranking ● Human-centric evaluation: crowdsourcing, user satisfaction, mental model, trust assessment, correctability. http://www-sop.inria.fr/members/Freddy.Lecue/presentation/ISWC2019-FreddyLecue-Thales-OnTheRoleOfKnowledgeGraphsInExplainableAI.pdf
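The KL-divergence objective named on this slide compares two probability distributions, for example one derived from a model's hidden state and one derived from a knowledge graph. The sketch below computes the discrete KL divergence together with a symmetric Jensen-Shannon variant. The two example distributions are made up; how the tutorial's models actually turn hidden states and KG embeddings into distributions is not specified in the slides.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as equal-length lists.
    Terms with p_i == 0 contribute nothing; q_i is floored at eps to avoid
    division by zero."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: the average KL of p and q to their midpoint."""
    mid = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mid) + 0.5 * kl_divergence(q, mid)


# Made-up distributions over two concepts.
model_dist = [0.5, 0.5]
kg_dist = [0.9, 0.1]

forward = kl_divergence(model_dist, kg_dist)
backward = kl_divergence(kg_dist, model_dist)
symmetric = js_divergence(model_dist, kg_dist)
```

KL divergence is zero only when the two distributions match and is not symmetric (`forward` and `backward` differ), which is one reason a symmetric variant such as Jensen-Shannon is sometimes preferred.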
  64. 64. 64 Knowledge Infusion: Abstractive Summarization of Clinical Diagnostic Interviews Problem Statement
  65. 65. 65 Knowledge Infusion: Abstractive Summarization of Clinical Diagnostic Interviews (Approach)
  66. 66. 66 Sentence length Trigram language modeling Informativeness Find best path for an interview slice Knowledge Infusion: Abstractive Summarization of Clinical Diagnostic Interviews
  67. 67. 67 BERT Abstractive Summarization using Integer Linear Programming (ILP) Abstractive Summarization using ILP and PHQ-9 Statistical Statistical + Constraints Statistical + Constraints + Knowledge Manas G, Aribandi V, Kursuncu U, Alambo A, Shalin VL, Thirunarayan K, Beich J, Narasimhan M, Sheth A Knowledge-infused Abstractive Summarization of Clinical Diagnostic Interviews , JMIR Preprints. 30/05/2020:20865 DOI: 10.2196/preprints.20865 URL: https://preprints.jmir.org/preprint/20865 Knowledge Infusion: Abstractive Summarization of Clinical Diagnostic Interviews
  68. 68. 6868 Really struggling with my bisexuality which is causing chaos in my relationship with a girl. Being a fan of LGBTQ community, I am equal to worthless for her. I’m now starting to get drunk because I can’t cope with the obsessive, intrusive thoughts, and need to get out of my head. BPD DICD PND SAD SBI OCD Don’t want to live anymore. Sexually assault, ignorant family members and my never ending loneliness brights up my path to death. SCW PND SBI SAD DPR DICD DPR I do have a potential to live a decent life but not with people who abandon me. Hopelessness and feelings of betrayal have turned my nights to days. I am developing insomnia because of my restlessness. SBI DPR DICD BPD I just can’t take it anymore. Been abandoned yet again by someone I cared about. I've been diagnosed with borderline for a while, and I’m just going to isolate myself and sleep forever. SBI PND Linking Reddit to DSM-5 : Web-based Intervention Reddit DSM-5 [Gaur 2018]
  69. 69. Mapping to SNOMED Concept Illustration. Post: "Really struggling with my bisexuality which is causing chaos in my relationship with a girl. Being a fan of LGBTQ community, I am equal to worthless for her. I’m now starting to get drunk because I can’t cope with the obsessive, intrusive thoughts, and need to get out of my head." Mapped concepts: 288291000119102: High risk bisexual behavior; 365949003: Health-related behavior finding; 365949003: Health-related behavior finding; 307077003: Feeling hopeless; 365107007: level of mood; 225445003: Intrusive thoughts; 55956009: Disturbance in content of thought; 26628009: Disturbance in thinking; 1376001: Obsessive compulsive personality disorder
  70. 70. Mapping Reddit to DSM-5. Input: <Reddit Post> <Subreddit Label>. Medical knowledge bases: DSM-5 Lexicon; DAO (Drug Abuse Ontology). Features: N-grams (n=1, 2, 3); LDA; LDA over bi-grams. Scoring: Normalized Hit Score. Output: <Reddit Post> <DSM-5 Label>
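The n-gram and "Normalized Hit Score" boxes on this slide suggest a lexicon-matching step. The slides do not define the score, so the sketch below is an assumption: it counts how many of a post's 1- to 3-grams appear in each category's lexicon and normalizes the counts so they sum to 1. The lexicon entries, category names, and function names are hypothetical, and the LDA components are left out.

```python
def ngrams(tokens, max_n=3):
    """All contiguous n-grams of the token list for n = 1..max_n, as strings."""
    return {
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    }


def normalized_hit_scores(post, lexicon):
    """Share of lexicon hits per category (an assumed definition, see lead-in)."""
    grams = ngrams(post.lower().split())
    hits = {label: len(grams & terms) for label, terms in lexicon.items()}
    total = sum(hits.values())
    if total == 0:
        return {label: 0.0 for label in lexicon}
    return {label: count / total for label, count in hits.items()}


# Hypothetical mini-lexicon keyed by DSM-5-style category names.
toy_lexicon = {
    "Depressive Disorders": {"depression", "hopeless"},
    "Substance Use": {"get drunk", "opiates"},
}

scores = normalized_hit_scores(
    "i get drunk to escape my depression and feel hopeless", toy_lexicon
)
best_label = max(scores, key=scores.get)
```

For this post the depressive category gets two hits and the substance category one, so `best_label` is "Depressive Disorders" even if the post came from a different subreddit, which is the kind of relabeling the slide's input and output boxes describe.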
  71. 71. 71 Mapping Reddit to DSM-5 http://www.papersfromsidcup.com/graham-daveys-blog/changes-in-dsm-5
  72. 72. 7272 Reddit to DSM-5 Task I know you want me to say no and that it is a part of me blah blah blah. But I can't. Honestly, not having bipolar disorder would be a huge blessing. I would be so much happier and could control my life better. I wouldn't have frantic, scattered thoughts and depression. I would be normal, happy, and less dramatic. Bipolar Subreddit DSM-5: Depressive Disorder I know you want me to say no and that it is a part of me blah blah blah. But I can't. Honestly, not having bipolar disorder would be a huge blessing. I would be so much happier and could control my life better. I wouldn't have frantic, scattered thoughts and depression. I would be normal, happy, and less dramatic. BiPolar Depression Disorder Subreddits DSM-5 Chapter BiPolarReddit BiPolarSOS Depression Addiction Substance use & Addictive Disorder Crippling Alcoholism Opiates Recovery Opiates Self-Harm Stop Self-Harm
  73. 73. 7373 Semantic Encoding and Decoding Optimization 12808 Words 300 dimension embedding 300 dimension embedding 20 DSM-5 Categories R D Reddit Word Embedding Model DSM-5 -DAO Lexicon W Solvable Sylvester Equation
  74. 74. 74 Semantic Encoding and Decoding Optimization Encoding DSM-5 to Reddit embedding space Decoding Reddit to DSM-5 embedding space
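The slides name the matrices R (Reddit word embeddings), D (DSM-5/DAO lexicon embeddings) and W (the learned weights) but do not state the objective. One formulation that is consistent with the encode/decode description and with the "solvable Sylvester equation" remark is a semantic-autoencoder-style loss. This is an assumption, and the trade-off parameter λ is hypothetical:

```latex
\min_{W}\; \lVert R - W^{\top} D \rVert_F^{2} \;+\; \lambda\, \lVert W R - D \rVert_F^{2},
\qquad
R \in \mathbb{R}^{12808 \times 300},\quad
D \in \mathbb{R}^{20 \times 300},\quad
W \in \mathbb{R}^{20 \times 12808}

% Setting the gradient with respect to W to zero gives a Sylvester equation A W + W B = C:
D D^{\top}\, W \;+\; \lambda\, W\, R R^{\top} \;=\; (1 + \lambda)\, D R^{\top}
```

Under this reading, the first term maps DSM-5 into the Reddit embedding space (encoding) and the second maps Reddit into the DSM-5 space (decoding). With A = D Dᵀ, B = λ R Rᵀ and C = (1 + λ) D Rᵀ, the equation can be handed to a standard Sylvester solver.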
  75. 75. 75 Outcome Domain-specific Knowledge lowers False Alarm Rates. 2005-2016 550K Users 8 Million Conversations 15 Mental Health Subreddits [Gkotsis 2017][Saravia 2016] [Park 2018]
  76. 76. Outcome
      Method (with HLF, VLF, and FGF) | Precision | Recall | F1-Score
      BRF - Contextual Features (CF) | 0.60 | 0.54 | 0.57
      BRF - CF (SEDO Weights generated from DSM-5 Lexicon without DAO) | 0.87 | 0.77 | 0.82
      BRF - CF (SEDO Weights generated from DSM-5 Lexicon with DAO without Slang Terms) | 0.87 | 0.80 | 0.83
      BRF - CF (SEDO Weights generated from DSM-5 Lexicon without DAO with Slang Terms) | 0.85 | 0.82 | 0.83
      BRF - CF (SEDO Weights generated from DSM-5 Lexicon with DAO and Slang Terms) | 0.88 | 0.83 | 0.85
      Model and Annotator Agreement: 84%
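The F1 scores on this slide are the harmonic mean of the reported precision and recall. A short check, using only the precision and recall values listed on the slide, reproduces the F1 column after rounding to two decimals:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs in the order the methods appear on the slide.
rows = [(0.60, 0.54), (0.87, 0.77), (0.87, 0.80), (0.85, 0.82), (0.88, 0.83)]
f1_values = [round(f1_score(p, r), 2) for p, r in rows]
```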
  77. 77. Mapping Social Media to EHR using KG 77 TwADR AskaPatient Drug Abuse Ontology DSM-5 Lexicon Suicide Risk Severity Lexicon Treatment Information Observation and Drug-related Information Mental Health Condition Suicide Risk Levels Ideation Behavior Attempt
  78. 78. 78 Mental Healthcare KB for Social Media
  79. 79. Resources TwADR and AskaPatient Lexicon https://zenodo.org/record/55013#.XsYEH8YpBQI Ref: Limsopatham, Nut, and Nigel Collier. "Normalising medical concepts in social media texts by learning semantic representation." Association for Computational Linguistics, 2016. Suicide-Risk Severity Lexicon https://bit.ly/SRS_lexicon Ref: Gaur, Manas, Amanuel Alambo, Joy Prakash Sain, Ugur Kurşuncu, Krishnaprasad Thirunarayan, Ramakanth Kavuluru, Amit Sheth, Randy Welton, and Jyotishman Pathak. "Knowledge-aware assessment of severity of suicide risk for early intervention." In The World Wide Web Conference, 2019. DSM-5 and Drug Abuse Ontology Lexicon https://bit.ly/DSM5_DAO Ref: Gaur, Manas, Ugur Kurşuncu, Amanuel Alambo, Amit Sheth, Raminta Daniulaityte, Krishnaprasad Thirunarayan, and Jyotishman Pathak. "" Let Me Tell You About Your Mental Health!" Contextualized Classification of Reddit Posts to DSM-5 for Web-based Intervention." In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, 2018. Suicide Risk Severity Dataset (Reddit) https://zenodo.org/record/2667859#.XsYH7MYpBQI Ref: Gaur, Manas, Amanuel Alambo, Joy Prakash Sain, Ugur Kurşuncu, Krishnaprasad Thirunarayan, Ramakanth Kavuluru, Amit Sheth, Randy Welton, and Jyotishman Pathak. "Knowledge-aware assessment of severity of suicide risk for early intervention." In The World Wide Web Conference, 2019.
  80. 80. Other Works: Not Covered 80 Manas Gaur, Amanuel Alambo, Joy Prakash Sain, Ugur Kursuncu, Krishnaprasad Thirunarayan, Ramakanth Kavuluru, Amit Sheth, Randy Welton, and Jyotishman Pathak. Knowledge-aware assessment of severity of suicide risk for early intervention. In WWW 2019 Manas Gaur, Vamsi Aribandi, Amanuel Alambo, Ugur Kursuncu, Krishnaprasad Thirunarayan, Jonathan Beich, Jyotishman Pathak, and Amit Sheth Characterization of Time-variant and Time-invariant Assessment of Suicidality on Reddit using C-SSRS Under Review in Nature Scientific Reports Manas Gaur, Aditya Sharma, Ugur Kursuncu, Valerie L. Shalin and Amit Sheth Knowledge-Guided Convolutional Autoencoder Clustering for Associating Support Seeker and Support Providers in Online Mental Health Communities Amanuel Alambo and Krishnaprasad Thirunarayan Depressive, Drug Abusive, or Informative: Knowledge-aware Study of News Exposure during COVID-19 Outbreak. In ACM KDD KiML Workshop 2020
  81. 81. References ● Manas Gaur, Ugur Kursuncu, Amanuel Alambo, Amit Sheth, Raminta Daniulaityte, Krishnaprasad Thirunarayan, and Jyotishman Pathak. "'Let Me Tell You About Your Mental Health!' Contextualized Classification of Reddit Posts to DSM-5 for Web-based Intervention." CIKM, 2018. ● Manas Gaur*, Chidubem Arachie*, Sam Anzaroot, William Groves, Ke Zhang, and Alejandro Jaimes. "Unsupervised Detection of Sub-Events in Large Scale Disasters." AAAI 2020. ● Manas Gaur, Amanuel Alambo, Joy Prakash Sain, Ugur Kursuncu, Krishnaprasad Thirunarayan, Ramakanth Kavuluru, Amit Sheth, Randy Welton, and Jyotishman Pathak. "Knowledge-aware assessment of severity of suicide risk for early intervention." WWW 2019. ● Manas Gaur, Saeedeh Shekarpour, Amelie Gyrard, and Amit Sheth. "empathi: An ontology for emergency managing and planning about hazard crisis." ICSC, 2019. ● Amelie Gyrard, Manas Gaur, Saeedeh Shekarpour, Krishnaprasad Thirunarayan, and Amit Sheth. "Personalized health knowledge graph." ISWC 2018. ● Amit Sheth, Manas Gaur, Ugur Kursuncu, and Ruwan Wickramarachchi. "Shades of knowledge-infused learning for enhancing deep learning." IEEE Internet Computing 2019. ● Shreyansh Bhatt, Manas Gaur, Beth Bullemer, Valerie Shalin, Amit Sheth, and Brandon Minnery. "Enhancing crowd wisdom using explainable diversity inferred from social media." Web Intelligence 2018. ● Ugur Kursuncu, Manas Gaur, and Amit Sheth. "Knowledge Infused Learning (K-IL): Towards Deep Incorporation of Knowledge in Deep Learning." AAAI Spring Symposium 2020. ● Ronald J. Williams and David Zipser. "A learning algorithm for continually running fully recurrent neural networks." Neural Computation, 1989. ● Alex M. Lamb, Anirudh Goyal Alias Parth Goyal, Ying Zhang, Saizheng Zhang, Aaron C. Courville, and Yoshua Bengio. "Professor forcing: A new algorithm for training recurrent networks." NIPS 2016. ● Bishan Yang and Tom Mitchell. "Leveraging Knowledge Bases in LSTMs for Improving Machine Reading." ACL 2017. ● Zhiting Hu, Zichao Yang, Russ R. Salakhutdinov, Lianhui Qin, Xiaodan Liang, Haoye Dong, and Eric P. Xing. "Deep generative models with learnable knowledge constraints." NIPS 2018.
  82. 82. 82 Questions we address next
  83. 83. Cyber Social Threats Artificial Intelligence Institute kursuncu@mailbox.sc.edu @UgurKursuncu
  84. 84. Critical Points on Cyber Social Threats ● Context in social media conversations is fluid and shades of gray. ● False alarms in the models developed and deployed. ● Ethical considerations and consequences. Bias and transparency. Implications on the mass population. ● The role of knowledge on improving the model in these critical points. 84 Photo: @budhelisson Unsplash.com
  85. 85. Online Extremism - Ongoing Open Problem 85 ● Efforts by online platforms are inadequate. ● Governments insist that the industry has a ‘social responsibility’ to do more to remove harmful content. ● If unsolved, social media platforms will continue to negatively impact the society.
  86. 86. Online Extremism - Covid 19 8686
  87. 87. (e.g., recruiter, follower) with respect to different stages of radicalization. Modeling users content and psychological process over time. Persuasive relevant to Islamist extremism. Domain Knowledge of the context (“jihad” has different meaning in different context) Multidimensionality Radicalization Challenges & Potential Solutions
  88. 88. Radicalization Scale (Dilshod Achilov et al.)
      0 None: Mainstream religious views and orientations. Indicator: Islam; Allah; jihad (self struggle); halal; democracy, islam, salah, fatwa, hajj.
      1 Low: Attitudinal support for politically moderate Islamism. Indicator: Hadith; Caliphate (Khilafah) justified; Sharia better (than secular law); Hypocrisy west.
      2 Elevated: Emergent support for exclusive rule of the Shari’a law. Indicator: Shariah best; revenge (justified); jihad (against West); justify Daesh (ISIS)
      3 High: Support for extremist networks and travel to “Darul Islam”. Indicator: Kafir; infidel; hijrah to Darul-Islam; (supporting) fatwa Al-Awlaki; mushrikeen.
      4 Severe: Call for action to join the fight and the use of violence. Indicator: apostate; sahwat; taghut; kill; kafir; kuffar; murtadd; tawaghit; al_baghdadi; martyrdom khilafah
  89. 89. Radicalization Process over time: Analysis of content in context can provide deeper understanding of the factors characterizing the radicalization process. (Figure: scale from non-extremist ordinary individual to radicalized extremist individual: 0 None, 1 Low, 2 Elevated, 3 High, 4 Severe)
  90. 90. Cautionary Note: False alarms, specifically the unfair classification of non-extremist individuals as extremist, might potentially impact millions of innocent people. Given the local and global security implications, there is a need for reliable and fair prediction of online terrorist activities.
  91. 91. Dataset ● Verified and suspended by Twitter. ● Time frame: Oct 2010 – Aug 2017 ● Includes 538 extremist users from two resources (Fernandez, 2018) (Ferrara, 2016): ○ Users verified by Twitter's anti-abuse team. ○ Lucky Troll Club ● 538 non-extremist users were drawn from an annotated religious dataset that contains Muslim users (Chen, 2014). -Miriam Fernandez, Moizzah Asif, and Harith Alani. 2018. Understanding the roots of radicalisation on twitter. In Proceedings of the 10th ACM Conference on Web Science. -Emilio Ferrara, Wen-Qiang Wang, Onur Varol, Alessandro Flammini, and Aram Galstyan. 2016. Predicting online extremism, content adopters, and interaction reciprocity. In International conference on social informatics. -Chen, L., Weber, I., & Okulicz-Kozaryn, A. (2014, November). US religious landscape on Twitter. In International Conference on Social Informatics (pp. 544-560). Springer, Cham.
  92. 92. Extremist Content 92 Prevalent Key Phrases Prevalent Topics isis, syria, kill, iraq, muslim, allah, attack, break, aleppo, assad, islamicstate, army, soldier, cynthiastruth, islam, support, mosul, libya, rebel, destroy, airstrike Caliphate_news, islamic_state, iraq_army, soldier_kill, iraqi_army, syria_isis, syria_iraq, assad_army, terror_group, shia_militia, isis_attack, aleppo_syria, martyrdom_operation, ahrar_sham, assad_regime, follow_support, lead_coalition, turkey_army, isis_claim, kill_isis Imam_anwar_awlaki, video_message_islamicstate, fight_islamic_state, isisclaim_responsibility_attack, muwahideen_powerful_middleeast, isis_tikrit_tikritop, amaqagency_islamicstate_fighter, sinai_explosion_target, alone_state_fighter, intelligence_reportedly_kill, khilafahnew_islamic_state, yemanqaida_commander_kill, isis_militant_hasakah, breakingnew_assad_army, isis_explode_middle, hater_trier_haleemah, trust_isis_tighten, qamishlus_isis_fighting, defeat_enemy_allah, kill_terrorist_baby, ahrar_sham_leader islamic state, syria, isis, kill, allah, video, minute propaganda video scenes, jaish islam release, restock missile, kaffir, join isis, aftermath, mercy, martyrdom operation syrian opposition, punish libya isis, syria assad, islam sunni, swat, lose head, wilayatalfurat, somali, child kill, takfir, jaish fateh, baghdad, iraq, kashmir muslim, capture, damascus, report rebel, british, qala moon, jannat, isis capture, border cross, aleppo, iranian soldier, tikrit tikrittop, lead shia military kill, saleh abdeslam refuse cooperate Green: Religion Blue: Ideology Red: Hate Corpus: 538 Twitter verified extremists, 48K tweets
  93. 93. 93 ● Dimensions to define the context: ○ Based on literature and our empirical study of the data, three contextual dimensions are identified: Religion, Ideology, Hate ● The distribution of prevalent terms (i.e., words, phrases, concepts) in each dimension is different. ● Different dimensions needed to contextualize and disambiguate common ‘diagnostic’ terms (e.g., jihad). Multidimensionality of Extremist Content
  94. 94. 94 “Reportedly, a number of apostates were killed in the process. Just because they like it I guess.. #SpringJihad #CountrysideCleanup” “Kindness is a language which the blind can see and the deaf can hear #MyJihad be kind always” “By the Lord of Muhammad (blessings and peace be upon him) The nation of Jihad and martyrdom can never be defeated” “Jihad” can appear in tweets with different meanings in different dimensions of the context. H I R Example Tweets with “Jihad”
  95. 95. Ambiguity of Diagnostic terms/phrases ● The same term can have different meanings in each dimension. ● Example: the meaning of “jihad” is different for extremists and non-extremists. ○ For extremists, the meaning is closer to “awlaki”, “islamic state”, “aqeedah” ○ For non-extremists, it is closer to “muslims”, “quran”, “imams” (Figure panels: Extremists, Non-Extremists)
  96. 96. Contextual Dimension Modelling ● Different Contextual Dimensions incorporating: ○ Knowledge Graphs ○ Dimension Corpora ● Machine/Deep Learning models are used to generate knowledge-enhanced representations ● Resources for Dimensions: Religion: Qur’an, Hadith Ideology: Books, lectures of ideologues Hate: Hate Speech Corpus (Davidson et al. 2017) ● Can be applied to many social problems. (Figure: dimension modeling process, in which Dimensions 1, 2 and 3 are each modeled to produce a dimension-based, knowledge-enhanced representation)
  97. 97. (Hate) Modeling Using a Knowledge Graph “You shall know a word by the company it keeps” - J.R. Firth (1957:11) Capturing similarity: ● Learning word similarities from a substantial knowledge graph ● A solution via distance between concepts in the knowledge graph.
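Similarity via distance between concepts in a knowledge graph can be illustrated with a shortest-path measure. This is a minimal sketch, not the authors' method: the graph is a hypothetical toy example, and the choice of 1 / (1 + path length) as the similarity is an assumption.

```python
from collections import deque


def shortest_path_length(graph, source, target):
    """Breadth-first search over an undirected graph stored as adjacency sets."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None  # the two concepts are not connected


def path_similarity(graph, a, b):
    """Assumed measure: 1 / (1 + shortest path length); 0.0 if unconnected."""
    dist = shortest_path_length(graph, a, b)
    return 0.0 if dist is None else 1.0 / (1.0 + dist)


# Hypothetical toy graph; a real knowledge graph has typed relations and far more nodes.
TOY_GRAPH = {
    "insult": {"abusive_language"},
    "slur": {"abusive_language"},
    "abusive_language": {"insult", "slur", "hate_speech"},
    "hate_speech": {"abusive_language"},
    "greeting": set(),
}

sim_close = path_similarity(TOY_GRAPH, "insult", "abusive_language")
sim_far = path_similarity(TOY_GRAPH, "insult", "hate_speech")
```

Concepts that sit closer together in the graph receive a higher score, and disconnected concepts receive zero.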
  98. 98. (Hate) Modeling Using a Corpus “You shall know a word by the company it keeps” - J.R. Firth (1957:11) Capturing similarity (and resolving ambiguity): ● Learning word similarities from a large corpus. ● A solution via distributional similarity-based representations.
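The distributional idea behind the Firth quote can be shown with co-occurrence counts and cosine similarity. This is only an illustrative sketch with a hypothetical three-sentence corpus; the tutorial's models use learned embeddings trained on much larger corpora.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Map each word to counts of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus, chosen only to make the arithmetic easy to follow.
TOY_CORPUS = [
    "the cat drinks milk",
    "the dog drinks water",
    "the car needs fuel",
]
vecs = cooccurrence_vectors(TOY_CORPUS)
sim_cat_dog = cosine(vecs["cat"], vecs["dog"])  # share "the" and "drinks"
sim_cat_car = cosine(vecs["cat"], vecs["car"])  # share only "the"
```

Words that appear in similar contexts end up with a higher cosine similarity, which is the property used to disambiguate terms across dimensions.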
  99. 99. ● For religion: Extremist and non-extremist users are significantly similar to each other. ● For hate: Extremist and non-extremist users do not show much similarity. Religion Ideology NonExtremists Extremists 99 Religion Ideology Hate User Similarity
  100. 100. ● For religion and hate, among extremists: There seems to be a number of users that are significantly different from each other. ● Possibility of outliers. Extremists Extremists 100 Religion Ideology Hate User Similarity
  101. 101. ● A group of extremist users, form a cluster farther from other users for Religion and Hate. ● Suggesting there might be outliers in the dataset. 101 User Visualization for Dimensions
  102. 102. ● Randomly selected 10 users and visualize for each dimension. ● Repeated this selection many times, every time same users formed a separate cluster. In this case below, the users are D, A. 102 Random 10 Users User Visualization for Dimensions
  103. 103. Outlier Detection: Separation of users within the extremist dataset through clustering (Mann-Whitney U-test) ● Identified 99 (18%), 48 (9%) and 141 (26%) users in the extremist dataset, clustered as likely outliers for religion, ideology and hate, respectively. ● A random sample of 76 users (15%) from the extremist dataset was used to validate the identified likely outliers. ● Our domain expert annotated these users as likely extremist, likely non-extremist, or unclear. Kappa Score = 82%
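The slide reports a kappa score without saying which variant was used or which two label sources were compared. As one possible reading, the sketch below computes Cohen's kappa between two label sequences; the labels are hypothetical and do not come from the dataset.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two label sequences, corrected for chance."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability that both sources pick the same label independently.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical labels: "E" = likely extremist, "N" = likely non-extremist.
source_one = ["E"] * 5 + ["N"] * 5
source_two = ["E"] * 4 + ["N"] * 6  # disagrees with source_one on a single user
kappa = cohen_kappa(source_one, source_two)
```

Here the two sources agree on 9 of 10 users, chance agreement is 0.5, and kappa is (0.9 - 0.5) / (1 - 0.5) = 0.8.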
  104. 104. ● Obtained the set of 49 outlier users in the extremist dataset. Rest is labeled as likely extremists ● Content of the outlier users contains the following prevalent concepts: marriage, Allah, bonded, silence, Islam leaders, Berjaya hilarious, cake, miss mit, kemaren, Quran, Khuda, prophet, Muhammad, Ahmad. Separation of users within the extremist dataset through clustering Outliers
  105. 105. Results ● The tri-dimension model performs best. ● Precision is used as the metric, to emphasize reducing the misclassification of non-extremist content. ● Implications in a large-scale application.
  106. 106. Key Insights ● Domain-specific knowledge plays a critical role, as does reliable ground truth, for such complex problems. ● False alarms are significantly reduced by incorporating three domain-specific dimensions. This further reduces the likelihood of unfair treatment of non-extremist individuals in a potential real-world deployment. ● Misclassification of non-extremist users can have significant implications in a large-scale application where non-extremists vastly outnumber extremists. ● Higher precision reduces potential social discrimination.
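The concern about large-scale deployment can be made concrete with a small calculation. All numbers below are hypothetical and chosen only for illustration; none of them come from the slides or the underlying study.

```python
def precision_at_scale(n_extremist, n_non_extremist, recall, false_positive_rate):
    """Return (precision, expected number of false alarms) for a screened population."""
    true_positives = n_extremist * recall
    false_positives = n_non_extremist * false_positive_rate
    flagged = true_positives + false_positives
    precision = true_positives / flagged if flagged else 0.0
    return precision, false_positives


# Hypothetical population: 1,000 extremist accounts among 1,000,000 ordinary ones,
# screened by a classifier with 90% recall and a 1% false positive rate.
precision, false_alarms = precision_at_scale(
    n_extremist=1_000,
    n_non_extremist=1_000_000,
    recall=0.90,
    false_positive_rate=0.01,
)
```

Under these assumed numbers the classifier flags about 10,000 non-extremist accounts for 900 correctly identified ones, so precision is below 10% even though the false positive rate looks small. This is why precision, rather than accuracy, is the metric emphasized here.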
  107. 107. ● Extremist users employ religion along with hate, suggesting they employ different hate tactics for their targets. ● Each dimension plays different roles in different levels of radicalization, capturing nuances as well as linguistic and semantic cues better throughout the radicalization process. 107 Key Insights
  108. 108. Our Highly Multidisciplinary Approach 108 Public/ Society Social Interactions Cognitive Neuro Cognitive Process ● Human brain processes information from extremist narratives on social media, that includes different contexts, emotions, sentiment, etc. ● Individuals change behavior, make choices in consuming/sharing content with an intent. ● Coordination, information flow and diffusion on social networks. ● Outcomes/impact on society through events and collective actions (eg, civil war or result of an election). Neural
  109. 109. References ● Ugur Kursuncu. “Modeling the Persona in Persuasive Discourse on Social Media Using Context-aware and Knowledge-driven Learning.” University of Georgia. 2018. ● Ugur Kursuncu, Manas Gaur, Carlos Castillo, Amanuel Alambo, Krishnaprasad Thirunarayan, Valerie Shalin, Dilshod Achilov, I. Budak Arpinar, and Amit Sheth. "Modeling Islamist extremist communications on social media using contextual dimensions: Religion, ideology, and hate." CSCW 2019. ● Ugur Kursuncu, Manas Gaur, and Amit Sheth. "Knowledge infused learning (K-IL): Towards deep incorporation of knowledge in deep learning." Proceedings of the AAAI 2020 Spring Symposium on Combining Machine Learning and Knowledge Engineering in Practice (AAAI-MAKE 2020). ● Ugur Kursuncu, Manas Gaur, Usha Lokala, Krishnaprasad Thirunarayan, Amit Sheth, and I. Budak Arpinar. "Predictive analysis on Twitter: Techniques and applications." Springer Nature 2019. ● Ugur Kursuncu, Manas Gaur, Usha Lokala, Anurag Illendula, Krishnaprasad Thirunarayan, Raminta Daniulaityte, Amit Sheth, and I. Budak Arpinar. "What's ur Type? Contextualized Classification of User Types in Marijuana-Related Communications Using Compositional Multiview Embedding." Web Intelligence 2018. ● Ugur Kursuncu, Manas Gaur, Krishnaprasad Thirunarayan, Amit Sheth, “Explainability of medical ai through domain knowledge”, Ontology Summit Communications 2019. 109
  110. 110. Knowledge Graphs for Autonomous Driving Artificial Intelligence Institute ruwan@email.sc.edu @ruwantw
  111. 111. Overview 111 Context Understanding
  112. 112. Approach 112 Evaluation of Knowledge Graph Embeddings (KGEs) for the Automotive Driving Domain
  113. 113. Building the Knowledge Graph 113
  114. 114. Building the Knowledge Graph 114
  115. 115. Translating the KG into a KG Embedding 115
  116. 116. Translating the KG into a KG Embedding 116
  117. 117. Intrinsic Evaluation: Overview 117
  118. 118. Intrinsic Evaluation: KGs w/ various levels of information 118
  119. 119. Intrinsic Evaluation: KGs w/ various levels of information 119
  120. 120. Intrinsic Evaluation: KGs w/ various levels of information 120
  121. 121. Intrinsic Evaluation: KGs w/ various levels of information 121 Intrinsic Evaluation - Results of Lyft
  122. 122. Intrinsic Evaluation: KGs w/ various levels of information 122 Intrinsic Evaluation - Results of NuScenes
  123. 123. Application 123
  124. 124. Conclusion 124
  125. 125. References ● Ruwan Wickramarachchi, Cory Henson, and Amit Sheth. "An evaluation of knowledge graph embeddings for autonomous driving data: Experience and practice." AAAI Spring Symposium 2020. ● Oltramari, Alessandro, Jonathan Francis, Cory Henson, Kaixin Ma, and Ruwan Wickramarachchi. "Neuro-symbolic Architectures for Context Understanding." Knowledge Graphs for eXplainable AI, IOS Press 2020. ● Alessandro Oltramari, Cory Henson, Ruwan Wickramarachchi, Don Brutzman and Richard Markeloff. “Hybrid AI for Context Understanding” 3rd U.S. Semantic Technologies Symposium, Raleigh, NC 2020 https://us2ts.org/2020/program-hybrid-ai ● Cory Henson, Stefan Schmid, Anh Tuan Tran, and Antonios Karatzoglou. "Using a Knowledge Graph of Scenes to Enable Search of Autonomous Driving Data." In ISWC Satellites, pp. 313-314. 2019. ● Holger Caesar, Varun Bankiti, Alex H. Lang, Sourabh Vora, Venice Erin Liong, Qiang Xu, Anush Krishnan, Yu Pan, Giancarlo Baldan, and Oscar Beijbom. "nuscenes: A multimodal dataset for autonomous driving." In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11621-11631. 2020. ● Kesten, R., M. Usman, J. Houston, T. Pandya, K. Nadhamuni, A. Ferreira, M. Yuan et al. "Lyft level 5 av dataset 2019." ● Alshargi, Faisal, Saeedeh Shekarpour, Tommaso Soru, and Amit Sheth. “Metrics for evaluating quality of embeddings for ontological concepts”. AAAI Spring Symposium 2019. 125
  126. 126. Knowledge-Infused NLP for Understanding Content on DarkNet Artificial Intelligence Institute shweta@knoesis.org @shweta_yadav_3
  127. 127. Overview 127
  128. 128. Research Question 128 Does semantically enriching the natural language processing algorithm with domain-specific knowledge increase the coverage in text understanding?
  129. 129. Darknet and Anonymity 129 Access Cryptomarkets provide anonymity to both buyers and sellers: • Location on the Dark Web, which requires specific software to access (e.g., Tor, I2P); • Use of untraceable cryptocurrencies (e.g., bitcoin); • Privacy and anonymity; Approximately two thirds of the goods sold on cryptomarkets are drugs (EMCDDA, 2018).
  130. 130. Transaction in Cryptomarket: Cryptocurrencies • Based on decentralized blockchain technologies • Identified by encrypted code • Approximately 1800 cryptocurrencies • Most commonly used on cryptomarkets: Bitcoin, Litecoin, Monero. Image Source: 1. Zhang, Yiming, et al. "Your style your identity: Leveraging writing and photography styles for drug trafficker identification in darknet markets over attributed heterogeneous information network." The World Wide Web Conference. 2019. Image Source 2. https://www.investopedia.com/terms/b/blockchain.asp
  131. 131. Motivation ◉ Darknet markets have grown substantially even with government interventions from 2013-2016 [1]
      Incremental growth of the Darknet Market [1]
      Feature | Growth
      Total revenue | 2x
      Total number of transactions | 3x
      Total number of listings | 5.5x
      Total number of listings per vendor | 2x
      [1] Kristy Kruithof. 2016. Internet-facilitated drugs trade: An analysis of the size, scope and the role of the Netherlands. RAND.
  132. 132. Motivation 132 ◉ Drug Traffickers may maintain multiple accounts across different markets or in the same market ◉ Linking different accounts to the same individuals is essential to track their status and better understand the online drug trafficking ecosystem ◉ Illegal trading of drugs in these markets has turned into a serious global concern because of its severe consequences on society (e.g., violent crimes) and public health at regional, national and international levels
  133. 133. Snapshot of Darknet Market 133
  134. 134. Problem Statement ◉ The task involves the detection of similarity between two vendors on online forums, i.e., Darknet, Reddit, and Twitter. (Identification of sybil accounts) ◉ Formally, given any two vendors v_a and v_b associated with the respective sites s_i and s_j, our goal is to develop a similarity measure sim(v_a^{s_i}, v_b^{s_j}) between the two vendors using various characteristics/patterns.
  135. 135. Dataset Creation ◉ Data extracted using eDarkTrends platform [5] with 1992 unique vendors collected over 3 different sites.
      Dark Web Sites | Dream Market | Tochka | Wall street | All
      Unique # Vendor names | 1448 | 408 | 466 | 1992
      Unique # Substance | 852 | 313 | 290 | 1148
      Unique # Location | 356 | 44 | 29 | 389
      Unique # Descriptions | 16800 | 1829 | 1723 | 18472
      [5] Usha Lokala, Francois R Lamy, Raminta Daniulaityte, Amit Sheth, Ramzi W Nahhas, Jason I Roden, Shweta Yadav, and Robert G Carlson. 2019. Global trends, local harms: availability of fentanyl-type drugs on the dark web and accidental overdoses in Ohio. Computational and Mathematical Organization Theory 25, 1 (2019), 48–59.
  136. 136. Methodology 136
  137. 137. Methodology: Modelling Multi-view Learning 137 ◉ Multi-view learning is an ideal learning mechanism for the data where examples are characterized by distinct (often orthogonal) feature sets. ◉ Generalize and improve the performance by exploiting the diverse views from multiple rich sources such as textual, stylometric, and location representation. Image Source: 1. Zhang, Yiming, et al. "Your style your identity: Leveraging writing and photography styles for drug trafficker identification in darknet markets over attributed heterogeneous information network." The World Wide Web Conference. 2019.
  138. 138. Summary of Approach: eDarkFind 138
  139. 139. Knowledge Infusion: Drug Abuse Ontology ◉ The Drug Abuse Ontology (DAO) is a formal representation of concepts and relationships between them for the prescription drug abuse domain. ◉ The current DAO contains 241 classes and 37 properties. ◉ The DAO identifies all variants of a concept in data (e.g., generic names, slang terms, scientific names). ◉ The DAO contains names of psychoactive substances (e.g., heroin, fentanyl), including synthetic substances (e.g., U-47,700, MT-45), brand and generic names of pharmaceutical drugs (e.g., Duragesic, fentanyl transdermal system) and slang terms (e.g., roxy, fent).
  140. 140. Knowledge Infusion: Drug Abuse Ontology. Augmenting the ontology with drug slang terms enables understanding of drug-abuse-related textual descriptions that had not previously been well explored.
  141. 141. Knowledge Infusion: Drug Abuse Ontology 141 ◉ DAO contains information regarding the route of administration (e.g., oral, IV), unit of dosage (e.g., gr, gram, pint, tablets), physiological effects (e.g., dysphoria, vomiting) and substance form (e.g., powder, liquid, hcl) ◉ The DAO is also enriched with links to concepts in external ontologies, through a very careful manually supervised process. Among the 43 DAO classes, 11 classes have been mapped to URIs in DrugBank, Freebase, DBpedia and the Cyc ontologies, using the sameAs property.
  142. 142. Knowledge Infusion: Drug Abuse Ontology 142
  143. 143. Substance View 143
  144. 144. Location and Substance View Encoding 144 ◉ Utilize simple binary encoding to obtain the view representation: ◉ Add a self information weight or information content, for all features Information content USA CAN ESP IND CHN BEL NOR NZL SAU UKR 1 1 0 0 0 0 0 0 0 0
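The encoding described on this slide can be sketched as follows. The slide does not say how the information-content weight is combined with the binary vector, so this sketch assumes each active feature is scaled by its self-information, -log2 p, where p is the share of vendors with that feature. The vendors and function names are hypothetical; the country codes are the ones shown on the slide.

```python
import math


def information_content(feature, vendor_features):
    """Self-information -log2 p(feature); p is the fraction of vendors having the feature."""
    n = len(vendor_features)
    count = sum(feature in feats for feats in vendor_features.values())
    return -math.log2(count / n) if count else 0.0


def weighted_binary_encoding(features, vocabulary, weights):
    """Binary encoding in which each active position carries its information content."""
    return [weights[f] if f in features else 0.0 for f in vocabulary]


VOCAB = ["USA", "CAN", "ESP", "IND", "CHN", "BEL", "NOR", "NZL", "SAU", "UKR"]

# Hypothetical vendors and shipping locations, for illustration only.
VENDORS = {
    "vendor_a": {"USA", "CAN"},
    "vendor_b": {"USA"},
    "vendor_c": {"USA", "ESP"},
    "vendor_d": {"CAN"},
}

weights = {country: information_content(country, VENDORS) for country in VOCAB}
binary = [1 if country in VENDORS["vendor_a"] else 0 for country in VOCAB]
encoded = weighted_binary_encoding(VENDORS["vendor_a"], VOCAB, weights)
```

With this weighting, a location shared by most vendors (USA here) contributes less to the representation than a rarer one (CAN), so rare features carry more evidence that two accounts belong to the same vendor.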
  145. 145. Multi-view Fusion: Canonical Correlation Analysis ◉ The view vectors cannot simply be concatenated, since each may correspond to a different modality (image vs text) or have very different distributional properties ◉ These views are fused using CCA [9] to obtain a single representation, which we call the Vendor embedding ◉ CCA allows us to infer information from cross-covariance matrices ◉ We employ an extension called weighted generalized CCA. [9] Harold Hotelling. 1992. Relations between two sets of variates. In Breakthroughs in statistics. Springer, 162–190.
  146. 146. Results 146 Performance metric of our model on different datasets Highest average accuracy across all datasets
  147. 147. Results: Ablation Study. Performance metric of various models on All sites combined. Best performance
  148. 148. Domain Specific Analysis ◉ Usage of multilingual and code-mixed text ◉ Use of slang terms across listings is captured by our model (e.g., horse for heroin) ◉ Lack of uniform features across websites adds noise to our model (product description and rating data) ◉ Some vendors may operate from different locations or may even be selling different drugs ◉ Branding (posting favorable reviews) is common in these markets
  149. 149. Use Case Examples 149 Case Studies @Vendor 1 @Vendor 2 Branding 5//02/14 09:49 am,5/Thanks alles schick/11/10 01:46 pm, <END>Tilidin 50MG/4MG Original Apothekenware 5//02/14 09:49 am,5/Thanks alles schick/11/10 01:46 pm, <END>Tilidin 50 MG/4MG Original Apothekenware <END> 5/Thanks alles schick/11/10 01:46 pm, Comparing product Description and rating since the vendor did not enter product description in other site. Percocet Oxycodone 5/325 mg 200 Tablets Finalize Early and get 20 Free bonus sent for a total of 220!US Made Mallinckrodt 5mg/325 (made in St. Louis, Miss. USA) ... 5//02/07 01:03 pm,5/Thanks Again. A++/01/21 11:49 pm,5/Trustworthy/01/16 12:22 pm,4.33//01/07 08:50 am,5/Great communication, trustworthy, and over delivered./12/31 11:09 pm,5//11/29 03:25 pm,5/FAST A+++ Best Stealth I’ve seen yet. Similar stylometric Features captured by the use of special characters or emojis. ————————————— ————————————— ****** NEWS 25.12.2018 NEWS ****** ————————————— ————————————— We ship all new ... ————————— ————————— PRODUCTS ————————— ————————— AFGHAN HEROIN A+++COCAINE #3 ...
  150. 150. Conclusion 150 ◉ 98% ACCURACY: Developed Multi-view learning Sybil account detection on the real-life Darknet market dataset achieving an accuracy of 98%. ◉ UNSUPERVISED LEARNING: Utilizing unlabelled data to train the network. ◉ DOMAIN ADAPTATION: Performed cross-domain analysis to justify uniform result. ◉ KNOWLEDGE-INFUSED NLP: Proved the effectiveness of utilizing domain specific knowledge graph of drug (DAO) in textual content understanding on DarkNet.
  151. 151. References ● Ramnath Kumar, Shweta Yadav, Raminta Daniulaityte, Francois Lamy, Krishnaprasad Thirunarayan, Usha Lokala, and Amit Sheth. "eDarkFind: Unsupervised Multi-view Learning for Sybil Account Detection." The Web Conference. 2020. ● Zhang, Yiming, et al. "Your style your identity: Leveraging writing and photography styles for drug trafficker identification in darknet markets over attributed heterogeneous information network." The World Wide Web Conference. 2019. ● Delroy Cameron, Gary A Smith, Raminta Daniulaityte, Amit P Sheth, Drashti Dave, Lu Chen, Gaurish Anand, Robert Carlson, Kera Z Watkins, and Russel Falck. 2013. PREDOSE: a semantic web platform for drug abuse epidemiology using social media. Journal of biomedical informatics 46, 6 (2013), 985–997 ● Usha Lokala, Francois R Lamy, Raminta Daniulaityte, Amit Sheth, Ramzi W Nahhas, Jason I Roden, Shweta Yadav, and Robert G Carlson. 2019. Global trends, local harms: availability of fentanyl-type drugs on the dark web and accidental overdoses in Ohio. Computational and Mathematical Organization Theory 25, 1 (2019), 48–59. ● Xiangwen Wang, Peng Peng, Chun Wang, and Gang Wang. 2018. You are your photographs: Detecting multiple identities of vendors in the darknet marketplaces. In Proceedings of the 2018 on Asia Conference on Computer and Communications Security. ACM, 431–442. 151
  152. 152. Ongoing Research at #AIISC ● Detection of Early Onset of Colorectal Cancer using Digestive Inflammation Index ● Conversational Systems for Nutrition Monitoring of High School Children ● Cyber Social Threats ● Conversational Systems for pediatric patients with Neutropenia, asthma in children, and obesity and hypertension in adults ● Development of an Instrumented, Intelligent Infant Interaction Laboratory for the Prediction of Autism Spectrum Disorder. Current Collaboration across UofSC: ● College of Medicine (>5) ● College of Nursing (2) ● College of Arts & Science (2) ● College of Pharmacy (2) ● College of Information & Communication ● College of Engineering & Computing ● College of Education
  153. 153. AIISC and Collaborators 153 5 faculty, >12 PhDs, few Masters, >5 undergrads, 2 Post-Docs, >10 Research Interns Alumni in/as Industry: IBM T.J. Watson, Almaden, Amazon, Samsung America, LinkedIn, Facebook, Bosch Start-ups: AppZen, AnalyticsFox, Cognovi Labs Faculty: George Mason, University of Kentucky, Case Western Reserve, North Carolina State University, University of Dayton
  154. 154. http://aiisc.ai/ We acknowledge partial support from the National Science Foundation (NSF) award CNS-1513721: "Context-Aware Harassment Detection on Social Media", National Institutes of Health (NIH) award: MH105384-01A1: "Modeling Social Behavior for Healthcare Utilization in Depression", and National Institute on Drug Abuse (NIDA) Grant No. 5R01DA039454-02 "Trending: Social media analysis to monitor cannabis and synthetic cannabinoid use". Any opinions, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF, NIH, or NIDA.
