
Semantics of the Black-Box: Using knowledge-infused learning approach to make AI systems more interpretable and explainable

The recent series of innovations in deep learning has shown enormous potential to impact individuals and society, both positively and negatively. Deep learning models, utilizing massive computing power and enormous datasets, have significantly outperformed prior benchmarks on increasingly difficult, well-defined research tasks across domains such as computer vision, natural language processing, signal processing, and human-computer interaction. However, the black-box nature of deep learning models and their over-reliance on massive amounts of data condensed into labels and dense representations pose challenges for a system’s interpretability and explainability. Furthermore, deep learning methods have not yet demonstrated the ability to effectively utilize the relevant domain knowledge and experience critical to human understanding. This aspect, missing in early data-focused approaches, has necessitated knowledge-infused learning and other strategies for incorporating computational knowledge. Rapid advances in our ability to create and reuse structured knowledge as knowledge graphs make this task viable. In this talk, we outline how knowledge, provided as a knowledge graph, is incorporated into deep learning methods using knowledge-infused learning. We then discuss how this makes a fundamental difference to the interpretability and explainability of current approaches and illustrate it with examples relevant to a few domains.


  1. 1. Semantics of the Black-Box: Using knowledge-infused learning approach to make AI systems more interpretable and explainable Keynote @ KGSWC 2020: http://www.kgswc.org/
  2. 2. 2 Amit Sheth Founding Director, Artificial Intelligence Institute http://aiisc.ai The University of South Carolina amit@sc.edu https://www.linkedin.com/in/amitsheth/ Special Thanks Kaushik Roy AIISC, kaushikr@email.sc.edu Manas Gaur AIISC, mgaur@email.sc.edu Some of the K-iL collaborators: Ruwan Wickramarachchi (AI Institute) Shweta Yadav Ugur Kurşuncu (AI Institute) Keyur Faldu (Embibe Inc.) Qi Zhang (AI Institute) Vishal Pallagani (AI Institute) ...
  3. 3. “ 3 AIISC in core AI areas, and interdisciplinary AI/AI applications
  4. 4. Outline of the talk ❏ Knowledge Graph ❏ Knowledge Graph meets Deep Learning: Knowledge-infused Learning ❏ K-IL in Explainability and Interpretability in Healthcare ❏ K-IL for Explainability and Interpretability in Adaptive Contagion Control ❏ K-IL: Explainable Improvement of Learning Outcomes 4
  5. 5. 1. Knowledge Graph 5
  6. 6. Definition 6 A Knowledge Graph (KG) is structured knowledge in a graph representation (in many cases a labeled property graph, or RDF or its variants). We cannot escape the expressivity–computability trade-off. The community is still debating the exact definition. Key differentiator: Relationships (“relationships at the heart of semantics”). Different/related forms: ● Ontology: knowledge graph after human curation of entities and relations; “ontological commitment”, richer KR ● Knowledge Base: flattened graph ● Lexicons: small, application-specific flattened graphs ● Knowledge Networks (KN): integrate and combine knowledge (usually captured as KGs) to serve a network (community). Knowledge Graphs and Knowledge Networks: The Story in Brief
  7. 7. 7 First commercial semantic search/browsing/… on the Web and for the content on the Web using KG. Term used for KR: WorldModel, Ontology http://bit.ly/15yrSemS Creation and Use of Knowledge ~ 2000
  8. 8. Proliferation Broad-based & Domain-Specific KGs 8 Examples of General Purpose Knowledge Graphs 1. DBpedia [Auer 2007, Lehmann 2015] 2. Yago [Rebele 2016] 3. Freebase [Bollacker 2008] 4. ConceptNet [Speer 2017] 5. Knowledge Vault [Dong 2014] 6. NELL [Mitchell 2018] 7. Wikidata [Vrandečić 2014] Example of Healthcare-specific Knowledge Graphs 1. SNOMED-CT [ACL Chang 2020] 2. Unified Medical Language System (UMLS) [Yip 2019] 3. DataMed [JAMIA Chen 2018] 4. International Classification of Diseases (ICD-10) [JAMIA Choi 2016] 5. DrugBank, Rx-NORM and MedDRA [ BMC Celebi 2019] 6. Drug Abuse Ontology [BMI Cameron 2013] Many are also community-developed.
  9. 9. Enterprise Knowledge Graphs are also very popular 9 KG-enabled Web and enterprise applications: Google, Amazon, Microsoft, Siemens, LinkedIn, Airbnb, eBay, and Apple, as well as smaller companies (e.g. ezDI, Franz, Metaphactory/ Metaphacts, Semantic Web Company, Mondeca, Stardog, Diffbot, Siren). Enterprise KG development services are also available (e.g. Maana). Industry-Scale Knowledge Graphs: Lessons and Challenges (Communications of the ACM, August 2019)
  10. 10. 10 Health Knowledge GraphEmpathi Ontology IRI: https://w3id.org/empathi/1.0 Download: https://raw.githubusercontent.com/shekarpour/emp athi.io/master/empathi.owl [Shah and Sheth US patent 2015]
  11. 11. “ 11Purohit, Hemant, Valerie L. Shalin, and Amit P. Sheth. "Knowledge Graphs to Empower Humanity-Inspired AI Systems." IEEE Internet Computing 24.4 (2020): 48-54.
  12. 12. Intuition behind why deep learning needs knowledge -- Intelligence: Learning from Data + Knowledge/Experience + Reasoning
  13. 13. Why Knowledge Graphs? Challenges in NLP/NLU ● Natural Language Processing Challenges: ○ How do you learn quickly from a small amount of data? ○ How do you mine (varied) relationships from existing text? ○ How do you reliably classify entities into a known ontology? ○ Better contextualization of words ● Natural Language Understanding Challenges: ○ Query interpretation, or understanding the user’s question ○ Answering the question with trust and transparency ○ How do we measure the “reasonability” and “meaningfulness” of the response to a question? ○ How much context is needed to provide a precise response? [Stanford Knowledge Graph Seminar 2020, Amit Prakash, Leilani Gilpin] 13
  14. 14. Better Contextualization of Words : Retrofitting 14 Why Knowledge Graphs : NLP/NLU Challenges damage Infrastructure affected population damage Infrastructure affected population Vector representation of words in Tweets (embedding) before retrofitting Vector representation of words in Tweets after retrofitting MOAC Ontology Empathi ontology Disaster Ontology DBpedia
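The retrofitting step pictured on this slide can be sketched in a few lines. The following is a minimal, illustrative version of the retrofitting update of Faruqui et al. (2014), with a toy vocabulary and a toy edge list standing in for the MOAC/Empathi/DBpedia ontologies; names and values are assumptions for illustration only.

```python
# Minimal retrofitting sketch: nudge each word vector toward the vectors
# of its neighbors in an ontology/lexicon, while staying close to the
# original embedding. Vocabulary and KG edges below are toy assumptions.

def retrofit(vectors, neighbors, alpha=1.0, beta=1.0, iterations=10):
    """Return retrofitted copies of `vectors`.

    vectors:   {word: [float, ...]} pre-trained embeddings
    neighbors: {word: [word, ...]}  edges from a knowledge graph/lexicon
    """
    new = {w: list(v) for w, v in vectors.items()}
    for _ in range(iterations):
        for word, links in neighbors.items():
            links = [n for n in links if n in new]
            if not links:
                continue
            for d in range(len(new[word])):
                # weighted average of the original vector and KG neighbors
                num = beta * vectors[word][d] + alpha * sum(new[n][d] for n in links)
                new[word][d] = num / (beta + alpha * len(links))
    return new

vecs = {"damage": [1.0, 0.0], "harm": [0.0, 1.0], "cat": [5.0, 5.0]}
kg = {"damage": ["harm"], "harm": ["damage"]}
fit = retrofit(vecs, kg)
# "damage" and "harm" move toward each other; "cat" (no KG edges) is untouched
```

After retrofitting, ontologically related words (like the disaster terms on the slide) cluster together in the embedding space, which is exactly the effect shown in the before/after diagrams.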
  15. 15. 15 Why Knowledge Graphs: NLP/NLU Challenges Better NLU, hence more context for precise response search Python APIs: ● Rdflib: Dbpedia ● SNOWSTORM: SNOMED-CT ● Berkeley Neural Parser ● GDELT
  16. 16. 16 Knowledge Extraction Knowledge Alignment Knowledge Cleaning Knowledge Mining & Knowledge-based QA Data Extraction (NLP, Web) Wrapper Induction (DB, DM-Data Mining) Web Tables (DB) Text Mining (DM) Entity and Relationship Linking [Perera 2016] Schema Mapping and Ontology Mapping [Jain 2010] Universal Schema [Sheth 1990] Data Cleaning [Jadhav 2016] Anomaly Detection [Anantharam 2012, 2016] Knowledge Fusion [Sheth 2020, Kapanipathi 2020, Gaur 2018, Kursuncu 2020] Graph Mining [Lalithsena 2016, 2017, 2018] Knowledge Embedding [Wickramarachchi 2020, Gaur 2018] Search [Sheth 2003, Cheekula 2015, Kho 2019] QA [Alambo 2019, Shekarpour 2017] [Stanford Knowledge Graph Seminar 2020, Luna Dong] Knowledge Graphs in DL pipeline for NLP
  17. 17. Knowledge graphs in Conversational AI 19 Personalization: taking into account contextual factors such as the user’s health history, physical characteristics, environmental factors, activity, and lifestyle. A chatbot with contextualized (e.g. asthma) knowledge is potentially more personalized and engaging. Without Contextualized Personalization With Contextualized Personalization
  18. 18. Knowledge for Multimodal Data: Example of City Traffic Event 20Anantharam, Pramod, Payam Barnaghi, Krishnaprasad Thirunarayan, and Amit Sheth. "Extracting city traffic events from social streams." ACM Transactions on Intelligent Systems and Technology (TIST) 6, no. 4 (2015): 1-27.
  19. 19. Why Knowledge Graphs: Shortcomings of Deep Learning 21 ● Graph Convolutional Neural Networks (GCN) are blind to relation types. For example: <shelter-in-place causes anxiety> and <shelter-in-place prevents anxiety> have similar representations in a GCN. ● Deep clustering over unlabeled data exploits the inherent latent semantics to generate diverse and cohesive clusters. But interpretability of the clusters requires Knowledge Graphs. ODKG: Opioid Drug Knowledge Graph [Kamdar 2019]
  20. 20. Symbolic glued with Statistical: Knowledge-infused Learning 22 STATISTICAL AI CONNECTIONIST “Unreasonable effectiveness of big data” in machine processing & powering bottom up processing “Unreasonable effectiveness of small data” in human decision making - can this be emulated to power top down processing? SYMBOLIC AI FORMAL KG will play an increasing role in developing hybrid neuro-symbolic systems (that is bottom-up deep learning with top-down symbolic computing) as well as in building explainable AI systems for which KGs will provide scaffolding for punctuating neural computing. Cognitive Science Analogy: Combining Top Brain - Bottom Brain Processes.
  21. 21. 2. Knowledge Graph meets Deep Learning: Knowledge-infused Learning 23
  22. 22. Challenges in Deep Learning: Why K-IL. How do you ensure consistency of labeling, especially when the label is not binary? Do labels represent adequate semantics (e.g., number of alternatives)? Do they have adequate domain knowledge? How do you ensure consistency of labeling (interpretation)? 24 A good KG has addressed these issues: ● its schema is rich in representation (and captures much more than labeling) ● KG design incorporates substantial domain knowledge ● instance-level knowledge is created through (usually) collective intelligence
  23. 23. Why Knowledge-Infused Learning (K-IL)? By changing the inputs, it can enrich the representation (e.g. radicalization on social media). By changing parameters, we can control the learned patterns/correlations to adhere to the knowledge. Deep infusion would allow us finer-grained control over learned patterns, to ensure adherence to knowledge at every step of the hierarchy. Explanations are easy to derive from the KG used. 25 Jiang, Shan, William Groves, Sam Anzaroot, and Alejandro Jaimes. "Crisis Sub-Events on Social Media: A Case Study of Wildfires." Contextual Modeling to Analyze Radicalization on Social Media
  24. 24. 26 Knowledge-infused Learning (K-IL). Shallow Infusion: use of knowledge graphs to improve the semantic and conceptual processing of data. Semi-Deep Infusion: deeper and congruent incorporation or integration of knowledge graphs into the learning techniques. Deep Infusion (part of future KG strategy): combines statistical AI (bottom-up) and symbolic AI (top-down) learning techniques for hybrid and integrated intelligent systems. Sheth, Gaur, Kursuncu, Wickramarachchi: Shades of Knowledge-Infused Learning for Enhancing Deep Learning
  25. 25. 27 Shallow Infusion of Knowledge for Machine/ Deep Learning in Brief Chronological arrangement of shallow Infusion techniques From NLP domain
  26. 26. 28 K-IL: Shallow Infusion (shallow KR, shallow merging technique) The knowledge infused is shallow, and the method of infusion is weak. Shallow external knowledge describes forms of information extracted from text based on heuristics, often designed for task-specific problems: ○ Bag of words/phrases from a corpus [Hagoort 2004, Zhang 2019, Sun 2019] ○ Bag of words/phrases from semantic lexicons [Faruqui 2014, Mrkšić 2016] ○ Counts of nouns, pronouns, verbs [Gkotsis 2017, 2016] ○ Sentiment and emotions of the sentence [Gaur 2019, Vedula 2017, Kursuncu 2019] ○ Latent topics describing the documents [Jiang 2016, Li 2016, Meng 2020] ○ Label assignment to words or phrases in a sentence (Semantic Role Labeling): Mary sold the book to John (Agent, Predicate, Theme, Recipient)
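A hedged sketch of what shallow infusion looks like in code: knowledge enters only as extra input features (a sentiment-lexicon hit count, domain-phrase matches) appended alongside a bag-of-words representation. The lexicons below are toy stand-ins, not the cited resources.

```python
# Shallow knowledge infusion sketch: knowledge-derived features sit next
# to raw bag-of-words counts at the input; the model itself is unchanged.
# SENTIMENT_LEXICON and DOMAIN_PHRASES are illustrative placeholders.

SENTIMENT_LEXICON = {"happy": 1, "sad": -1, "worthless": -1}
DOMAIN_PHRASES = {"intrusive thoughts", "shelter in place"}

def shallow_features(text):
    tokens = text.lower().split()
    bow = {}
    for t in tokens:
        bow[t] = bow.get(t, 0) + 1
    sentiment = sum(SENTIMENT_LEXICON.get(t, 0) for t in tokens)
    phrase_hits = sum(p in text.lower() for p in DOMAIN_PHRASES)
    # the two knowledge features augment the purely data-driven BoW
    return {"bow": bow, "sentiment": sentiment, "phrase_hits": phrase_hits}

f = shallow_features("I feel worthless and cannot stop the intrusive thoughts")
```

Any downstream classifier consumes these enriched features unchanged, which is why this style of infusion is "shallow": the learning algorithm never sees the knowledge structure, only features derived from it.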
  27. 27. 29 K-IL: Shallow Infusion: Explaining Clustering Identifiable Suicide Risk Factors from Electronic Healthcare Records Identifiable Suicide Risk Factors from Social Media Question: What people say to Clinician? Question: What people hide from Clinician? Question: What people say to Social Media? Question: What people hide from Social MediaMissing Information
  28. 28. 30 K-IL: Shallow Infusion: Knowledge Graph Embeddings for Autonomous Driving Scene KG KG Embeddings of objects/events Computed Scene Similarity Wickramarachchi, Ruwan., Henson, Cory., and Sheth, Amit. An evaluation of knowledge graph embeddings for autonomous driving data: Experience and practice. In AAAI 2020 Spring Symposium on Combining Machine Learning and Knowledge Engineering in Practice (AAAI-MAKE 2020).
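As an illustration of scene comparison via KG embeddings, here is a toy sketch in the spirit of the cited work: a TransE-style triple score plus cosine similarity over scene vectors built from object embeddings. The 2-D vectors, the object names, and the averaging scheme are assumptions for illustration, not the paper's actual models or data.

```python
import math

# Toy sketch: KG embeddings of scene objects, scene embeddings by
# averaging, and similarity between driving scenes via cosine similarity.

def transe_score(h, r, t):
    """TransE plausibility of triple (h, r, t): -||h + r - t|| (higher = better)."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# toy object embeddings assumed learned from a scene KG
OBJ = {"pedestrian": [1.0, 0.0], "crosswalk": [0.9, 0.1], "truck": [0.0, 1.0]}

def scene_vec(names):
    """Scene embedding = average of its objects' embeddings."""
    return [sum(OBJ[n][d] for n in names) / len(names) for d in range(2)]

s_crosswalk = scene_vec(["pedestrian", "crosswalk"])
s_highway = scene_vec(["pedestrian", "truck"])
# scenes sharing more objects come out as more similar
```

Because the embeddings are trained against the scene KG, scene similarity computed this way reflects semantic relatedness of the entities and events, not just raw sensor features.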
  29. 29. 32 K-IL: Semi-Deep Infusion
  30. 30. 33 K-IL: Semi-Deep Infusion : Matching Reddit Conversation to DSM-5 Scenario Really struggling with my bisexuality which is causing chaos in my relationship with a girl. Being a fan of LGBTQ community, I am equal to worthless for her. I’m now starting to get drunk because I can’t cope with the obsessive, intrusive thoughts, and need to get out of my head. BPD DICD PND SAD SBI OCD Don’t want to live anymore. Sexually assault, ignorant family members and my never ending loneliness brights up my path to death. SCW PND SBI SAD DPR DICD DPR I do have a potential to live a decent life but not with people who abandon me. Hopelessness and feelings of betrayal have turned my nights to days. I am developing insomnia because of my restlessness. SBI DPR DICD BPD I just can’t take it anymore. Been abandoned yet again by someone I cared about. I've been diagnosed with borderline for a while, and I’m just going to isolate myself and sleep forever. SBI PND Reddit DSM-5 [Gaur 2018]
  31. 31. 34 TwADR AskaPatient Drug Abuse Ontology DSM-5 Lexicon Suicide Risk Severity Lexicon Treatment Information Observation and Drug-related Information Mental Health Condition Suicide Risk Levels Ideation Behavior Attempt K-IL: Semi-Deep Infusion : Matching Reddit Conversation to DSM-5 Mapping Subreddit to DSM-5 categories using Mental health Knowledge Bases
  32. 32. 35 Medical KnowledgeBases N-grams (n=1, 2, 3) LDA LDA over Bi-grams Normalized Hit Score DSM-5 Lexicon <Reddit Post> <Subreddit Label> Input <Reddit Post> <DSM-5 Label> Output DAO Drug Abuse Ontology K-IL: Semi-Deep Infusion : Matching Reddit Conversation to DSM-5 Matching process from Reddit to DSM-5
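The lexicon-matching step on this slide can be sketched as a normalized hit score over n-grams (n = 1, 2, 3). The DSM-5 lexicon entries below are invented placeholders; the actual pipeline also uses LDA topics and the Drug Abuse Ontology.

```python
# Sketch of the normalized hit score: fraction of a post's n-grams
# (n = 1, 2, 3) that appear in a category's lexicon. Lexicons are toy.

def ngrams(tokens, n):
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def hit_score(post, lexicon):
    tokens = post.lower().split()
    grams = []
    for n in (1, 2, 3):
        grams.extend(ngrams(tokens, n))
    hits = sum(g in lexicon for g in grams)
    return hits / len(grams) if grams else 0.0

# hypothetical DSM-5 category lexicons, for illustration only
DSM5 = {
    "Depressive Disorders": {"hopeless", "worthless", "no energy"},
    "Anxiety Disorders": {"panic", "constant worry"},
}

post = "i feel hopeless and worthless with no energy"
best = max(DSM5, key=lambda c: hit_score(post, DSM5[c]))
```

The post is then relabeled from its subreddit label to the best-scoring DSM-5 category, which is the <Reddit Post, DSM-5 Label> output shown on the slide.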
  33. 33. 36http://www.papersfromsidcup.com/graham-daveys-blog/changes-in-dsm-5 K-IL: Semi-Deep Infusion : Matching Reddit Conversation to DSM-5 Outcome of Mapping Subreddit to DSM-5 categories
  34. 34. 37 12808 Words 300 dimension embedding 300 dimension embedding 20 DSM-5 Categories R D Reddit Word Embedding Model DSM-5 -DAO Lexicon W Solvable Sylvester Equation K-IL: Semi-Deep Infusion : Matching Reddit Conversation to DSM-5
  35. 35. 38 I know you want me to say no and that it is a part of me blah blah blah. But I can't. Honestly, not having bipolar disorder would be a huge blessing. I would be so much happier and could control my life better. I wouldn't have frantic, scattered thoughts and depression. I would be normal, happy, and less dramatic. Bipolar Subreddit DSM-5: Depressive Disorder BiPolar Depression Disorder Subreddits DSM-5 Chapter BiPolarReddit BiPolarSOS Depression Addiction Substance use & Addictive Disorder Crippling Alcoholism Opiates Recovery Opiates Self-Harm Stop Self-Harm K-IL: Semi-Deep Infusion: Matching Reddit Conversation to DSM-5 Example posts after mapping subreddits to DSM-5 categories. Mappings provide explainability
  36. 36. 39 K-IL: Semi-Deep Infusion : Matching Reddit Conversation to DSM-5 Domain-specific Knowledge lowers False Alarm Rates. 2005-2016 550K Users 8 Million Conversations 15 Mental Health Subreddits [Gkotsis 2017][Saravia 2016] [Park 2018] Performance Gains in the outcomes
  37. 37. Semi-deep infusion in Reinforcement Learning 40 Consider a gathering event at a rally [Tablighi Jamaat Movement]. Many fatalities and economic costs are incurred before an SIR model recognises this event (delay). Any policy by the policy maker at this point might be too late to enact. A knowledge-infused policy, where the knowledge is [lockdown the location of the rally and test everyone], can greatly mitigate this effect. Image taken from: https://towardsdatascience.com/reinforcement-learning-for-covid-19-simulation-and-optimal-policy-b90719820a7f How?
  38. 38. 3. K-IL in Explainability and Interpretability in Healthcare 42
  39. 39. An explainable system would comprise collectively exhaustive interpretable subsystems and orchestration among them. Explanations would be in natural language, explaining the decision-making process. An interpretable system provides the ability to discern the internal mechanisms of any module. Neural attention models are endowed with a certain degree of interpretability by visualizing parts of the input, without providing human-understandable explanations. An explainable system is interpretable, but not vice versa.
  40. 40. 44 Really struggling with my bisexuality which is causing chaos in my relationship with a girl. Being a fan of LGBTQ community, I am equal to worthless for her. I’m now starting to get drunk because I can’t cope with the obsessive, intrusive thoughts, and need to get it out of my head. Is mental health related ? Yes: 0.71 , No: 0.29 Which Mental Health condition? Predicted: Depression (False) True: Obsessive Compulsive Disorder Reasoning over Model: Why model predicted Depression? Unknown
  41. 41. 45 Really struggling with my bisexuality which is causing chaos in my relationship with a girl. Being a fan of LGBTQ community, I am equal to worthless for her. I’m now starting to get drunk because I can’t cope with the obsessive intrusive thoughts, and need to get it out of my head. Is mental health related ? Yes: 0.82 , No: 0.18 Which Mental Health condition? Predicted: Obsessive Compulsive Disorder(True) True: Obsessive Compulsive Disorder DSM-5 Knowledge Graph DSM-5 and Post Correlation Matrix Reasoning over Model: Why model predicted Obsessive Compulsive Disorder ? known Interpretable learningD εRN P εRN W f(W)
  42. 42. 46 Really struggling with my bisexuality which is causing chaos in my relationship with a girl. Being a fan of LGBTQ community, I am equal to worthless for her. I’m now starting to get drunk because I can’t cope with the obsessive, intrusive thoughts, and need to get out of my head. 288291000119102: High risk bisexual behavior 365949003: Health-related behavior finding 365949003: Health-related behavior finding 307077003: Feeling hopeless 365107007: level of mood 225445003: Intrusive thoughts 55956009: Disturbance in content of thought 26628009: Disturbance in thinking 1376001: Obsessive compulsive personality disorder Multi-hop traversal on Medical knowledge graphs <is symptom> Achieving Explainability through Medical Entity Normalization : Replacing Entities in the post with Concepts in the Medical Knowledge Graph through Semantic Annotation
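The multi-hop traversal itself is a simple graph search. The chain below copies the SNOMED-CT-style concepts shown on the slide; the adjacency structure is a simplified assumption, with each edge standing in for an "is symptom"/subsumption relationship.

```python
from collections import deque

# Sketch of multi-hop traversal over a medical KG, tracing a surface
# concept in the post to a diagnosis concept. Concept codes are from the
# slide; the graph itself is a simplified illustration.

KG = {
    "225445003: Intrusive thoughts":
        ["55956009: Disturbance in content of thought"],
    "55956009: Disturbance in content of thought":
        ["26628009: Disturbance in thinking"],
    "26628009: Disturbance in thinking":
        ["1376001: Obsessive compulsive personality disorder"],
}

def multi_hop(graph, start, goal, max_hops=4):
    """Breadth-first search returning the hop path from start to goal."""
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        if len(path) > max_hops:
            continue
        for nxt in graph.get(path[-1], []):
            queue.append(path + [nxt])
    return None

path = multi_hop(KG, "225445003: Intrusive thoughts",
                 "1376001: Obsessive compulsive personality disorder")
# `path` traces the explanation chain shown on the slide
```

The returned path is itself the explanation: each hop names a KG relationship that links the annotated entity in the post to the predicted condition.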
  43. 43. 47 Really struggling with my [health-related behavior] which is causing [health-related behavior] with a girl. Being a fan of [LGBTQ] community, I am equal to [level of mood] for her. I’m now starting to [drinking] because I can’t cope with the [obsessive compulsive personality disorder] [disturbance in thinking], and [disturbance in thinking]. Is mental health related ? Yes: 0.96 , No: 0.04 Which Mental Health condition? Predicted: Obsessive Compulsive Disorder(True) True: Obsessive Compulsive Disorder DSM-5 Knowledge Graph DSM-5 and Post Correlation Matrix Reasoning over Model: Why model predicted Obsessive Compulsive Disorder ? known Interpretable and Explainable Learning D εRN P εRN W f(W)
  44. 44. 4. K-IL for Explainability and Interpretability in Adaptive Contagion Control 48
  45. 45. Semi-deep infusion in RL 49 Consider a gathering event at a rally [Tablighi Jamaat Movement]. Many fatalities and economic costs are incurred before an SIR model recognises this event (delay). Any policy by the policy maker at this point might be too late to enact. A knowledge-infused policy, where the knowledge is [lockdown the location of the rally and test everyone], can greatly mitigate this effect. Image taken from: https://towardsdatascience.com/reinforcement-learning-for-covid-19-simulation-and-optimal-policy-b90719820a7f How?->
  46. 46. Explainable COVID-19 Policy ◎ Knowledge in dynamics: People go to work everyday and do groceries at either shops in the neighborhood or shops en-route to work. ◎ Knowledge traceable in policy choice: “There exists a ‘shop1’ en-route to a workplace, there are many people in a neighborhood that work here and take this route” -> encoded as a relational feature ◎ Learning algorithm assigns high weight to this feature when the policy output is lockdown(shop1) 50
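A minimal sketch of how such a relational feature could be encoded and weighted. The names (`en_route_shop1`), the threshold, and the linear policy score are invented for illustration, not the authors' actual learner.

```python
# Hedged sketch: the knowledge rule "many commuters pass shop1 en-route
# to work" becomes a binary relational feature; a linear policy can then
# put learnable weight on it, making the lockdown choice traceable.

def relational_feature(routes, shop, threshold=3):
    """1.0 if at least `threshold` commute routes pass through `shop`."""
    passing = sum(shop in route for route in routes)
    return 1.0 if passing >= threshold else 0.0

def policy_score(features, weights):
    return sum(weights[k] * v for k, v in features.items())

routes = [["home", "shop1", "work"]] * 4 + [["home", "work"]]
feats = {
    "en_route_shop1": relational_feature(routes, "shop1"),
    "infection_rate": 0.2,
}
weights = {"en_route_shop1": 5.0, "infection_rate": 1.0}  # illustrative learned weights
score = policy_score(feats, weights)
# a high score favoring lockdown(shop1) is traceable to the knowledge rule
```

When the learning algorithm assigns high weight to the relational feature, the policy output lockdown(shop1) can be explained directly in terms of the encoded knowledge, as the slide describes.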
  47. 47. 5. K-IL: Explainable Improvement of Learning Outcomes 51
  48. 48. Bayesian Knowledge Tracing for Improving Learning Outcomes in Education 53 Question: What is the name of the compound formed after the addition of phosphate to glucose? Answer: Glucose Monophosphate Response from Student: Glucose Phosphate Question: What is the name of the compound formed after the addition of phosphate to adenosine diphosphate? Answer: Adenosine Triphosphate Response from Student: Adenosine 3-Phosphate Can we conclude from the correct responses (if any) provided by the student, that student knows Phosphorylation? Piech, Chris, Jonathan Bassen, Jonathan Huang, Surya Ganguli, Mehran Sahami, Leonidas J. Guibas, and Jascha Sohl-Dickstein. "Deep knowledge tracing." In Advances in neural information processing systems, pp. 505-513. 2015. Using Knowledge infusion, we can see the answer is close to correct
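For contrast, the standard BKT update, which treats the two questions independently, can be sketched as below. Parameter values are illustrative, not fitted.

```python
# Minimal Bayesian Knowledge Tracing update. BKT scores each response as
# simply right or wrong and tracks P(skill known); it cannot see that
# "Glucose Phosphate" is close to "Glucose Monophosphate", which is the
# gap knowledge infusion addresses.

def bkt_update(p_know, correct, slip=0.1, guess=0.2, learn=0.3):
    """Posterior P(known) after one observed response, then a learning step."""
    if correct:
        post = p_know * (1 - slip) / (p_know * (1 - slip) + (1 - p_know) * guess)
    else:
        post = p_know * slip / (p_know * slip + (1 - p_know) * (1 - guess))
    return post + (1 - post) * learn

p = 0.4
p = bkt_update(p, correct=False)  # "Glucose Phosphate" judged simply wrong
p = bkt_update(p, correct=False)  # "Adenosine 3-Phosphate" judged simply wrong
# BKT's mastery estimate stays low even though both answers were close
```

Knowledge infusion instead links the two questions through the shared concept (Phosphorylation) in a KG and can credit near-correct answers, rather than collapsing them to binary outcomes.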
  49. 49. KG + BKT/DKT → Explainability 54 CQ: Concepts in the questions asked CQ, CQ: Relationships between the concepts asked in the questions CQ, CKG: Relationships between the concepts asked in the questions and the concepts in the Knowledge graphs (e.g. epubs from Amazon, NCERT textbooks, Books specific to entrance exams, etc.)
  50. 50. 55 Donda, Chintan, Sayan Dasgupta, Soma S. Dhavala, Keyur Faldu, and Aditi Avasthi. "A framework for predicting, interpreting, and improving Learning Outcomes." arXiv preprint arXiv:2010.02629 (2020). K-IL for Improving Learning Outcomes Tutorial @ ACM CoDS COMAD https://aiisc.ai/xaikg/
  51. 51. 6. Conclusion 56
  52. 52. Knowledge-infused Learning
  53. 53. Promising K-IL Impacts ● ROBOTICS (Cross-domain Knowledge): 1) observational (sensory data) and common-sense knowledge to perceive the surrounding environment; 2) knowledge representation to model the knowledge concerning the surrounding environment; 3) appropriate cross-domain knowledge reasoning mechanisms ● COGNITIVE SCIENCE (Human Intelligence): “inject” human intelligence into AI assistants such as Amazon Alexa; utilization of cross-domain knowledge of social interactions, emotions, and linguistic variations of natural language ● SELF-DRIVING CARS / PERSONAL ASSISTANT (Empathy and Morality): for AI agents to mimic human emotions and decisions, we need to model human emotional knowledge of empathy, morals, and ethics; (Personalization): smart health agents are adapting to answer real-world personalized complex health queries in simple interactive language; this requires patients’ environmental knowledge, health data, and coordination with their healthcare physicians
  54. 54. Knowledge-infused Learning Methods (AAMAS’21, ICPR’21, COLING’20, Internet Computing’19, AAAI’20, CIKM’18, WWW’19, NAACL’18, ACL’17) [Stanford Knowledge Graph Seminar 2020, Christopher Re] Where are we?
  55. 55. 60 5 faculty, >12 PhDs, few Masters, >5 undergrads, 2 Post-Docs, >10 Research Interns Alumni in/as Industry: IBM T.J. Watson, Almaden, Amazon, Samsung America, LinkedIn, Facebook, Bosch Start-ups: AppZen, AnalyticsFox, Cognovi Labs Faculty: George Mason, University of Kentucky, Case Western Reserve, North Carolina State University, University of Dayton Core AI Neuro-symbolic computing/Hybrid AI, Knowledge Graph Development, Deep Learning, Reinforcement Learning, Natural Language Processing, Knowledge-infused Learning (for deep learning and NLP), Multimodal AI (including IoT/sensor data streams, images), Collaborative Assistants, Multiagent Systems (incl. Coordinating systems of decision making agents including humans, robots, sensors), Semantic-Cognitive- Perceptual Computing, Brain-inspired computing, Interpretation/Explainability/Trust/Ethics in AI systems, Search, Gaming Interdisciplinary AI and application domains: Medicine/Clinical, Biomedicine, Social Good/Harm, Public Health (mental health, addiction), Education, Manufacturing, Disaster Management
  56. 56. Thanks! Open to Questions? You can find me at: amit@sc.edu https://aiisc.ai/ https://www.linkedin.com/company/1054055/ http://bit.ly/AIISC 61

Editor's Notes

  • Slide 3: Inner circle: our research areas and strengths
  • A nice knowledge graph, which is a knowledge graphs ----- picture over here

    One side ---> Empathi
    ezDI image → other side

    When an agent communicates with humans,
    empathy, policies, trustworthiness → inform the behavior of the agent
  • ---- Slide before PAC Learning

    ----- Explaining one of them -- why a knowledge graph would help

    There are many NLP challenges; why a knowledge graph would work
    GPT-3 --- issues
    Can KG solve it

    How to get a better context for effective output
  • You can have relationships between concepts
  • Video : When the slide will be uploaded.
  • a(i) Domain knowledge of traffic in the form of concepts and relationships (mostly causal) from the ConceptNet
    a(ii) Probabilistic Graphical Model (PGM) that explains the conditional dependencies between variables in traffic domain is enriched by adding the missing random variables, links, and link directions extracted from ConceptNet
    b : Shows how this enriched PGM is used to correlate contextually related data of different modalities.

    3 sources of knowledge (Geo-Spatially and temporally) [ We need to put this in the slide]
    → Open Street Map
    → Smart City Knowledge Graph
    → [Find the third one]
  • ---- We should have another insight:
    --- it is still a coarse-grained use of kG for making sense of clusters
    --- We need to provide an example in a more detailed: There is an explicit relationship between two concept
    ---- A person owns a company or works for a company

    Both Example
    Kaushik: points
    Manas: points (enriching the embedding)
  • What is knowledge infusion in deep learning? Using knowledge to change input (shallow), to change parameters (semi-deep), to change parameters by mapping to a stratified hierarchy (Deep) (Ex: 1st layer knowledge x, 2nd layer knowledge y, etc). Can use diagram from pydata Berlin talk.
  • → Integration of Knowledge Representation with Statistical Representation of Text is also straightforward → Devoid of Semantic Representation
    → Shallow merging needs to be demonstrated
  • Possible proposal material
  • In the semi-deep infusion paradigm, the learning system of the model is altered either through a probabilistic threshold (e.g. attention or constraints) or through data redundancy, for gains in performance. There are three broad categories of semi-deep infusion:
    Forcing methods: the prediction of the model from the learnt representation is improved by mixing (sigmoidal, concatenation, multiplication) the representation of the input as ground truth to enrich the latent representation.
    Attention methods: these improve upon the forcing methods by making the model capable of selecting the parts of the learnt representation that need to be modified.
    Knowledge-base methods: since both forcing and attention methods rely on the input data, which is a poor manifestation of the real world, the models suffer from problems such as exposure bias. The knowledge-base methods shift the model's dependency for attention and forcing from the input text to a knowledge base.
    In knowledge-based LSTMs, rather than putting attention on the input text, the method uses attention as a switch which, when open, contextualizes the latent representation through representations of relevant concepts in the knowledge base. When the switch is off, the latent representation is used as is.
    In knowledge-based GANs, the model learns by maximizing a reward for generating a representation of the input that matches the output. One way of formulating this reward is minimization of KL divergence. In this architecture, the attention module is influenced by the reward function, which is a learnable constraint.
  • r/BPD: Borderline Personality Disorder
    SBI: Suicide Behavior Ideation
  • Correlation matrix is the parameters for the deep learning algorithm for DSM-5
  • Method: Semantic encoding - decoding optimization

    (Pearson Correlation)
    DD: Correlation between DSM-5 Categories

    RR: Correlation between concepts in Reddit posts irrespective of the user

    DR: Correlation between the concepts in Reddit

  • Qualitatively, this is the outcome of the semantic encoding and decoding method.

    You are able to label a post in a subreddit with an appropriate DSM-5 category.

    On the Left, is all such mapping that the model learnt.
  • Why? Shallow - Can help enrich neural representations. Semi-deep: Can help with tweaking parameters to follow correlations present in knowledge (in addition to data) in constructing representations. Deep - Can identify what correlation in the knowledge in addition to data matters in which layer to finally construct a representation that benefits from knowledge infusion at all layers. Ex: Shallow: Wikipedia-based GNN training to answer questions - hopefully captures relationships. Semi-deep: Force understanding that Obama is correlated to Michelle Obama through relationships like spouse, by explicitly modifying the attention (correlation matrix) - definitely captures relationships. Deep: Identify number relationships, how they relate to metrics, how those metrics relate to what is being measured (blood pressure), how blood pressure relates to what is being predicted - definitely captures nested/hierarchical relationship semantics

    <Example of Deep Knowledge Infusion>
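    The semi-deep case, explicitly modifying the attention (correlation) matrix toward a KG relation, can be sketched as follows. The tokens, the additive prior, and the mixing weight are all illustrative assumptions, not the actual method.

```python
import numpy as np

tokens = ["obama", "michelle_obama", "spoke", "yesterday"]
n = len(tokens)
rng = np.random.default_rng(2)

# Raw attention learned from data alone (rows sum to 1).
attn = rng.random((n, n))
attn /= attn.sum(1, keepdims=True)

# Knowledge prior: 1 where the KG links two tokens (e.g. the spouse relation).
kg_prior = np.zeros((n, n))
i, j = tokens.index("obama"), tokens.index("michelle_obama")
kg_prior[i, j] = kg_prior[j, i] = 1.0

lam = 0.5  # strength of the knowledge constraint (a tunable hyperparameter)
infused = attn + lam * kg_prior
infused /= infused.sum(1, keepdims=True)   # re-normalize rows

# The KG-linked pair now attends to each other more strongly.
print(infused[i, j] > attn[i, j])  # True
```

    In the actual architectures the constraint is learnable rather than a fixed additive prior; the sketch only shows why the KG relation is "definitely captured" after infusion.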
  • Definition of Interpretability and Explainability
  • Multi-hop
    Two-hop
    Changing the post
  • Example of explanation, for each type of knowledge infusion:
    Shallow: t-SNE clusters can show that KG relationships were captured; sports words come together.
    Semi-deep: The attention matrix can show whether KG relationships were captured; sports words attend to each other with high correlation.
    Deep: Representations at each layer can be visualized through concept maps in the stratified KG. Members of a hierarchical concept lower in the hierarchy correlate highly with those higher in the hierarchy when concepts from a class hierarchy are visualized. (Ex: 30, systolic pressure, and heart attack would all be close, as they map to the same hierarchical concept.)
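    The shallow check, that related words end up close in embedding space, reduces to comparing within-cluster and cross-cluster similarity. The embeddings below are hand-made to illustrate the check, not real model output.

```python
import numpy as np

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# Toy embeddings: KG-infused training should pull related concepts together.
emb = {
    "soccer":   np.array([0.9, 0.1, 0.0]),
    "football": np.array([0.8, 0.2, 0.1]),
    "finance":  np.array([0.0, 0.1, 0.9]),
}

# Explanation check: within-cluster similarity should exceed cross-cluster.
within = cosine(emb["soccer"], emb["football"])
across = cosine(emb["soccer"], emb["finance"])
print(within > across)  # True
```

    In practice the same comparison is made visually, e.g. by projecting the embeddings with t-SNE and inspecting whether the KG-related words form a cluster.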
  • Explainability example in Education
  • The current approach to assessing a student's mastery in a course and providing multiple pathways for improving learning outcomes relies on a predictive algorithm: Bayesian Knowledge Tracing (BKT).

    The approach assesses the following tendencies of the student:
    Whether the student knows the answer
    Whether the student guessed the answer correctly
    What the improvement is after multiple attempts

    However, it does not tell:
    How far the student's answer is from the correct answer
    What relevant concepts the student needs to learn
    Also, the algorithm does not provide the capability to assess whether the student has mastered a topic in the course, or the course itself.

    On this slide, a student was asked two questions on the topic of “Phosphorylation”.
    BKT would consider these questions independently, whereas knowledge infusion would find the relation between the two questions through the entity Phosphorylation.
    Since the answers don’t match the true answers, BKT would not accept them as correct.
    The question in red could not be answered by BKT, because BKT does not know the relation between the questions.
    However, if we use knowledge infusion:
    It knows the relation between the concepts through Phosphorylation, so it can answer the question in red.
    It knows that “adenosine 3-phosphate” is an alias of “Adenosine Triphosphate”, so it would accept the response.
    It would measure the distance between “Glucose Phosphate” and “Glucose Monophosphate” to see:
    How far the student’s answer is from the correct answer
    What new concepts the student needs to learn to achieve mastery of this topic
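    The standard BKT update makes the independence limitation concrete: each response only moves a per-skill mastery probability, with no notion of how far a wrong answer is from the right one. The parameter values below are illustrative defaults, not fitted ones.

```python
def bkt_update(p_know, correct, guess=0.2, slip=0.1, learn=0.15):
    """One step of Bayesian Knowledge Tracing (standard formulation).
    p_know: prior probability the student knows the skill.
    guess/slip/learn are illustrative parameter values."""
    if correct:
        num = p_know * (1 - slip)
        den = num + (1 - p_know) * guess
    else:
        num = p_know * slip
        den = num + (1 - p_know) * (1 - guess)
    posterior = num / den
    # Chance the skill is learned between practice opportunities.
    return posterior + (1 - posterior) * learn

p = 0.3
for answer in [True, True, False]:   # observed responses to 3 questions
    p = bkt_update(p, answer)
print(round(p, 3))
```

    Note that "adenosine 3-phosphate" would simply enter this update as `correct=False`; linking the two phosphorylation questions, or recognizing the alias, requires the KG.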
  • The concepts asked about in the question are the addition of phosphate to glucose and the addition of phosphate to adenosine diphosphate.
    From the KG, the relation between the two is obtained as relating to phosphorylation.
    The answer the student provides, “adenosine 3-phosphate”, might be predicted as wrong by the deep network (because it is not “adenosine triphosphate”). The wrong answer triggers a search through the KG to figure out how far it is from the right answer.
    Since “adenosine triphosphate” is an alias of “adenosine 3-phosphate”, the explanation shows that the student was actually correct and hence has attained explainable mastery.

    An education knowledge graph can be constructed using content from MOOCs (Coursera, Khan Academy, Udemy, Udacity) and from books and epubs from Amazon.

    https://khanacademy.fandom.com/wiki/Knowledge_Map
  • Bayesian Knowledge Tracing is not adequate, as explanations are required to know what other concepts the student might need to attain mastery.
    These concepts can be found in the KG.
    Furthermore, the KG can provide an explanation of how far the current level is from mastery.
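    The alias check and "how far from mastery" distance can be sketched as a breadth-first search over a toy education KG. The graph contents and alias table below are made up for the example; a real system would query a curriculum KG.

```python
from collections import deque

# Toy education KG: edges link concepts; aliases map surface forms together.
edges = {
    "glucose monophosphate": ["phosphorylation"],
    "glucose phosphate": ["phosphorylation"],
    "phosphorylation": ["glucose monophosphate", "glucose phosphate",
                        "adenosine triphosphate"],
    "adenosine triphosphate": ["phosphorylation"],
}
aliases = {"adenosine 3-phosphate": "adenosine triphosphate"}

def kg_distance(answer, truth):
    """Hops in the KG between a student's answer and the correct answer.
    0 means an exact match (possibly via an alias)."""
    answer = aliases.get(answer, answer)
    truth = aliases.get(truth, truth)
    if answer == truth:
        return 0
    seen, frontier = {answer}, deque([(answer, 0)])
    while frontier:
        node, d = frontier.popleft()
        for nxt in edges.get(node, []):
            if nxt == truth:
                return d + 1
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, d + 1))
    return -1  # not connected

print(kg_distance("adenosine 3-phosphate", "adenosine triphosphate"))  # 0
print(kg_distance("glucose phosphate", "glucose monophosphate"))       # 2
```

    A distance of 0 recovers the alias case from the previous slide; the intermediate nodes on a nonzero path are exactly the concepts the student still needs to learn.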
  • Do we need to provide a list of the workshops and tutorials conducted?