PyData Salamanca knowledge infusion in healthcare

The talk describes a paradigm of knowledge-infused learning in healthcare for explainability, interpretability, and traceability of outcomes, thereby bridging the gap between AI and clinical settings and developing architectures that are clinically relevant.

  1. 1. Knowledge-infused Learning in Healthcare. Manas Gaur (https://manasgaur.github.io, mgaur@email.sc.edu), Artificial Intelligence Institute. Sincere thanks to Noun Project for making their icons freely available; they were used in making this presentation.
  2. 2. Outline ● Motivation for Knowledge-infused Learning (K-IL) ○ Definition ○ How do we use Knowledge Graphs ○ Mathematical background ○ Types of K-IL ○ Models, Evaluation, and Applications ● Knowledge-infused Learning for Healthcare ○ Challenges ○ Web-based Intervention (Reddit → DSM-5)
  3. 3. Definition: Knowledge Graphs. A Knowledge Graph (KG) is structured knowledge in a graphical representation. KGs support enhanced semantic applications such as search, browsing, personalization, recommendation, advertisement, and summarization. Problems: data sparsity and ambiguity. Different forms: ● Ontology: knowledge graph after human curation of entities and relations ● Knowledge Base: flattened graph ● Lexicon: small application-specific flattened graph. Examples of general-purpose Knowledge Graphs*: 1. DBpedia 2. YAGO 3. Freebase 4. ConceptNet 5. Knowledge Vault 6. NELL 7. Wikidata. Examples of healthcare-specific Knowledge Graphs: 1. SNOMED-CT 2. Unified Medical Language System (UMLS) 3. DataMed 4. International Classification of Diseases (ICD-10) 5. RxNorm 6. DrugBank 7. Drug Abuse Ontology 8. Medical Dictionary for Regulatory Activities. Sources: http://www-sop.inria.fr/members/Freddy.Lecue/presentation/ISWC2019-FreddyLecue-Thales-OnTheRoleOfKnowledgeGraphsInExplainableAI.pdf ; https://datamed.org/APIDoc.php ; Cameron, Delroy, Gary A. Smith, Raminta Daniulaityte, Amit P. Sheth, Drashti Dave, Lu Chen, Gaurish Anand, Robert Carlson, Kera Z. Watkins, and Russel Falck. "PREDOSE: a semantic web platform for drug abuse epidemiology using social media." Journal of Biomedical Informatics, 2013.
  4. 4. Examples: Commonsense Reasoning Graph, Drug Abuse Ontology, Event Ontology, Crisis Ontology.
  5. 5. How do we use Knowledge Graphs? Personalization: taking into account contextual factors such as the user's health history, physical characteristics, environmental factors, activity, and lifestyle. A chatbot with contextualized (asthma) knowledge is potentially more personalized and engaging. [Panels: Without Contextualized Personalization; With Contextualized Personalization]
  6. 6. How do we use Knowledge Graphs? Assessing the Mental Health Impact of COVID using News Articles. Multilingual KG: http://conceptnet.io/ ; GDELT Database: https://www.gdeltproject.org/ ; Source: https://theconversation.com/were-measuring-online-conversation-to-track-the-social-and-mental-health-issues-surfacing-during-the-coronavirus-pandemic-135417
  7. 7. How do we use Knowledge Graphs? Analyzing Gender-based Violence (GBV) in Mental Health COVID-19 Twitter Conversation. Data: 4 weeks of mental health tweets, from March 14 to April 04. Components shown: GBV lexicon from tweets on bullying, abuse, domestic violence, etc.; mapping words to categories for expansion of the lexicon; generic knowledge graph of Wikipedia; aligning the lexicon words and new entities with respect to DBpedia categories; enriched lexicon for gathering the abstract meaning of GBV in tweets; semantic proximity, calculated as the cosine similarity between two vectors (GBV and tweets) with an empirical threshold; GBV index; GBV estimation for 14 days; Maximum A Posteriori (MAP) estimation. Purohit, Hemant, Tanvi Banerjee, Andrew Hampton, Valerie L. Shalin, Nayanesh Bhandutia, and Amit P. Sheth. "Gender-based violence in 140 characters or fewer: A #BigData case study of Twitter." arXiv preprint arXiv:1503.02086 (2015).
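The semantic-proximity step on this slide can be illustrated with a minimal sketch. The function names, the toy vectors, and the default threshold of 0.5 are assumptions made for illustration; the slide does not state the embedding model or the threshold value that was used.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here: a zero vector matches nothing
    return dot / (norm_u * norm_v)

def is_gbv_related(tweet_vec, gbv_vec, threshold=0.5):
    """Flag a tweet whose vector is close enough to the GBV lexicon vector.

    The threshold is a hypothetical placeholder; the slide only says it
    was set empirically.
    """
    return cosine_similarity(tweet_vec, gbv_vec) >= threshold
```

In practice the threshold would be tuned on labeled tweets, which is what "empirical" suggests here.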
  8. 8. Definition: Knowledge-infused Learning (K-IL). K-IL: “The exploitation of domain knowledge and application semantics to enhance existing deep learning methods by infusing relevant conceptual information into a statistical, data-driven computational approach (Neuro-Symbolic AI).” A. Sheth, M. Gaur, U. Kursuncu and R. Wickramarachchi, "Shades of Knowledge-Infused Learning for Enhancing Deep Learning," in IEEE Internet Computing, vol. 23, no. 6, pp. 54-63, 1 Nov.-Dec. 2019, doi: 10.1109/MIC.2019.2960071.
  9. 9. K-IL: Probably Approximately Correct Learning. Valiant, Leslie G. "Robust logics." Artificial Intelligence 117.2 (2000): 231-253.
  10. 10. PAC Learning. How do you know that a training set has good domain coverage? Robust classifier → low generalization error. Consistent classifier → low training error. Confidence: more certainty (lower δ) requires more samples. Complexity: a more complicated hypothesis space (larger |H|) requires more samples.
  11. 11. Definition: Knowledge Infusion. [Figure labels: Challenge; Existing ML Models; Infusion; True Data Distribution; Hypothesis Data Distribution]
  12. 12. Types of K-IL. [Diagram labels, Shallow Infusion: Dataset; enrich; Deep Learning Model; Tacit Knowledge; hypothesis testing or similarity-based verification. Diagram labels, Semi-Deep Infusion: Dataset; Deep Learning Model; Tacit Knowledge; Self-aware or External Knowledge; similarity-based verification]
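The dependence of sample size on confidence and hypothesis-space size can be made concrete with the standard PAC bound for a finite hypothesis class and a consistent learner, m ≥ (1/ε)(ln|H| + ln(1/δ)). This is the textbook bound, offered as an illustration of the slide's two statements; the slide itself does not show a formula, and the function name below is hypothetical.

```python
import math

def pac_sample_bound(hypothesis_count, epsilon, delta):
    """Samples sufficient for a consistent learner over a finite class H.

    epsilon: tolerated generalization error, in (0, 1)
    delta:   tolerated failure probability, in (0, 1)
    """
    if hypothesis_count < 1:
        raise ValueError("hypothesis_count must be at least 1")
    if not (0.0 < epsilon < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    bound = (math.log(hypothesis_count) + math.log(1.0 / delta)) / epsilon
    return math.ceil(bound)

# Lower delta (more certainty) and larger |H| both raise the bound.
baseline = pac_sample_bound(1000, epsilon=0.1, delta=0.05)
more_certain = pac_sample_bound(1000, epsilon=0.1, delta=0.01)
more_complex = pac_sample_bound(10**6, epsilon=0.1, delta=0.05)
```

The bound grows only logarithmically in |H| and 1/δ, but linearly in 1/ε.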
  13. 13. K-IL: Shallow Infusion Sheth, Amit, Manas Gaur, Ugur Kurşuncu, and Ruwan Wickramarachchi. "Shades of knowledge-infused learning for enhancing deep learning." IEEE Internet Computing 23, no. 6 (2019): 54-63.
  14. 14. K-IL: Semi-Deep Infusion Sheth, Amit, Manas Gaur, Ugur Kurşuncu, and Ruwan Wickramarachchi. "Shades of knowledge-infused learning for enhancing deep learning." IEEE Internet Computing 23, no. 6 (2019): 54-63.
  15. 15. Types of K-IL: Deep Infusion (Vision)
  16. 16. K-IL: Models. Long Short-Term Memory variants: 1. Knowledge base at each LSTM cell [1]. 2. K-IL layer: a. 1D Convolutional Neural Network for mixing; b. Graph Convolutional Neural Network, when the hierarchical structure of the KG is important and needs to be preserved in the representation; c. Simple Multi-layer Perceptron [2]. [1] Yang, Bishan, and Tom Mitchell. Leveraging knowledge bases in lstms for improving machine reading. arXiv preprint arXiv:1902.09091 (2019). [2] Kursuncu, Ugur, Manas Gaur, and Amit Sheth. Knowledge Infused Learning (K-IL): Towards Deep Incorporation of Knowledge in Deep Learning. arXiv preprint arXiv:1912.00512 (2019).
  17. 17. K-IL: Models. Generative Adversarial Network*. [Diagram labels: Seen Category Data; Unseen Category Data; Z1; Z2; Generator (G1); Generator (G2); Parameter Sharing; Real Data; Fake Data (G1); Fake Data (G2); Discriminator (D); Real or Fake; Embedding Regression Network; Semantic Embedding of Unseen Category; Prediction (G1); Prediction (G2); ≅; Loss (G1); Loss (G2); Objective Function] *Chang, Che-Han, Chun-Hsien Yu, Szu-Ying Chen, and Edward Y. Chang. "KG-GAN: Knowledge-Guided Generative Adversarial Networks." arXiv preprint arXiv:1905.12261 (2019).
  18. 18. K-IL: Objective Functions and Evaluation. Kullback-Leibler Divergence: ● Measures the information loss during the learning phase between latent/hidden states and KGs ● KG embeddings: TransE, HolE, etc. ● Models: Variational Autoencoders, LSTMs, GANs, Siamese Neural Networks ● Frameworks: Zero-Shot Learning, One-Shot Learning, Transfer Learning, Parameter Sharing ● Other variants: Jensen Divergence, Regularization, Integer Linear Programming. Kosheleva, Olga, and Vladik Kreinovich. "Why deep learning methods use KL divergence instead of least squares: a possible pedagogical explanation." Математические структуры и моделирование 2 (46) (2018). Evaluation: before and after knowledge infusion. Methods (apart from Precision, Recall, F1-score): ● Fréchet Inception Distance: measure of similarity between two datasets (KG and training data) ● Statistical significance hypothesis testing ● Word and concept features ● t-SNE visualization of clusters ● Area under perturbation curve: feature ranking ● Human-centric evaluation: crowdsourcing, user satisfaction, mental model, trust assessment, correctability. http://www-sop.inria.fr/members/Freddy.Lecue/presentation/ISWC2019-FreddyLecue-Thales-OnTheRoleOfKnowledgeGraphsInExplainableAI.pdf
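As a small illustration of the divergence named on this slide, the sketch below computes the Kullback-Leibler divergence between two discrete distributions, plus the symmetric Jensen-Shannon divergence, assumed here to be the Jensen variant the slide lists. The function names and the list-based representation are illustrative choices; in a K-IL model the two distributions would come from the latent states and the KG embeddings, which this sketch does not model.

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) in nats for two discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # the term 0 * log(0 / q) is taken as 0
        if q_i == 0.0:
            return math.inf  # P puts mass where Q has none
        total += p_i * math.log(p_i / q_i)
    return total

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded by ln 2."""
    mixture = [(p_i + q_i) / 2.0 for p_i, q_i in zip(p, q)]
    return 0.5 * kl_divergence(p, mixture) + 0.5 * kl_divergence(q, mixture)
```

KL divergence is asymmetric, so the order of the arguments matters when it is used as a loss term; the Jensen-Shannon form avoids that at the cost of an extra mixture computation.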
  19. 19. Utility of K-IL: Applications. Summarization of Clinical Diagnostic Interviews. Faruqui, Manaal, Jesse Dodge, Sujay K. Jauhar, Chris Dyer, Eduard Hovy, and Noah A. Smith. "Retrofitting word vectors to semantic lexicons." arXiv preprint arXiv:1411.4166 (2014). Gaur, Manas, et al. "'Let Me Tell You About Your Mental Health!' Contextualized Classification of Reddit Posts to DSM-5 for Web-based Intervention." 27th ACM CIKM 2018.
  20. 20. Utility of K-IL: Applications. Summarization of Clinical Diagnostic Interviews. Methods shown: BERT; abstractive summarization using Integer Linear Programming (ILP); abstractive summarization using ILP and PHQ-9. Categories shown: Statistical; Statistical + Constraints; Statistical + Constraints + Knowledge.
  21. 21. Utility of K-IL: Applications. Association between Social Media and EHR in Suicide-related Communications. Sources: SuicideWatch subreddit (93K users); NYC CDRN EHR (123K patients); data specific to mental health; medical knowledge bases. We identified self-harm, depressive feelings, and suicide ideations as latent topics expressed in Reddit and EHR data. Neither source provided evidence of mentions or expressions of impulsivity, family violence, or drug abuse.
  22. 22. K-IL for Healthcare: Challenges. Classification of Reddit posts to DSM-5 (Reddit → DSM-5)
  23. 23. K-IL: 2 Challenges in Healthcare. Contextualization: user-level, across different sources (forums, subreddits) where the user has posted. S5: "I dont think I have thought about it every day of my entire life. I have for a good portion of it, however, my boyfriend may be able to determine whether I’m worth his time" S8: "Having a plan for my own suicide has been a long time relief for me as well. I more often than not wish I were dead." Predicted label: Suicide Indication. Predicted label: Suicide Ideation.
  24. 24. K-IL: 2 Challenges in Healthcare. Contextualization and Abstraction. User’s original post: "I have found myself mired in a similar situation as your boyfriend - addicted to the internet. It sounds like he its hurting a lot and needs your help in changing his habits" Clinically abstracted user’s post: "I have found myself mired in a similar situation as your boyfriend - Drug abuse to the internet. It sounds Hyperactive behavior he its Depressed mood a lot and needs your help in changing his habits." Mental Health and Drug Abuse Knowledge Base: DSM-5, SNOMED-CT, ICD-10, DataMed, Drug Abuse Ontology, TwADR, AskaPatient. The transient posting of potentially suicidal users in other subreddits requires careful consideration to appropriately predict their suicidality. Hence, we analyze their content by harnessing their network and bringing in their content if it overlaps with other users within SW (the SuicideWatch subreddit). We found Stop Self Harm (SSH) > Self Harm (SLH) > Bipolar (BPR) > Borderline Personality Disorder (BPD) > Schizophrenia (SCZ) > Depression (DPR) > Addiction (ADD) > Anxiety (ANX) to be the most active subreddits for suicidal users. After aggregating their content, we perform MedNorm using lexicons to generate clinically abstracted content for effective assessment. [Chart labels: SW, SSH, SLH, BPD, DPR, ADD, SCZ, BPR, ANX; values as extracted: 1.0, 0.66, 0.40, 0.40, 0.44, 0.39, 0.30, 0.34]
  25. 25. K-IL: Social Media Data to EHR Data. Lexicons: TwADR; AskaPatient; Drug Abuse Ontology; DSM-5 Lexicon; Suicide Risk Severity Lexicon. Information types: treatment information; observation and drug-related information; mental health condition; suicide risk levels (ideation, behavior, attempt).
  26. 26. Reddit → DSM-5 Task. Post from the Bipolar subreddit: "I know you want me to say no and that it is a part of me blah blah blah. But I can't. Honestly, not having bipolar disorder would be a huge blessing. I would be so much happier and could control my life better. I wouldn't have frantic, scattered thoughts and depression. I would be normal, happy, and less dramatic." DSM-5: Depressive Disorder. [Table residue, columns Subreddits and DSM-5 Chapter, entries as extracted: BiPolar Depression Disorder; BiPolarReddit; BiPolarSOS; Depression; Addiction; Substance use & Addictive Disorder; Crippling Alcoholism; Opiates Recovery; Opiates; Self-Harm; Stop Self-Harm]
  27. 27. Reddit data, 2005-2016: 550K users, 8 million conversations, 15 mental health subreddits. [Diagram labels: Subreddit; Main Post; Comment; Reply]
  28. 28. Reddit → DSM-5 Mapping. Input: <Reddit Post> <Subreddit Label>. Output: <Reddit Post> <DSM-5 Label>. Components: Medical Knowledge Bases; DSM-5 Lexicon; DAO (Drug Abuse Ontology); N-grams (n=1, 2, 3); LDA; LDA over bi-grams; Normalized Hit Score.
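To make the n-gram and hit-score components more concrete, here is a minimal sketch. The slide does not define the Normalized Hit Score, so the definition used below (lexicon matches per category divided by the total matches across all categories) is an assumption, as are the function names and the two-category toy lexicon, whose terms are invented for illustration.

```python
def ngrams(tokens, n_max=3):
    """All contiguous n-grams for n = 1..n_max, joined with spaces."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def normalized_hit_scores(post, lexicon):
    """Share of lexicon hits per category (assumed definition, see lead-in)."""
    grams = ngrams(post.lower().split())
    hits = {label: sum(1 for g in grams if g in terms)
            for label, terms in lexicon.items()}
    total = sum(hits.values())
    if total == 0:
        return {label: 0.0 for label in lexicon}
    return {label: count / total for label, count in hits.items()}

# Hypothetical toy lexicon; the real DSM-5 and DAO lexicons are far larger.
TOY_LEXICON = {
    "Depressive Disorder": {"depression", "hopeless", "depressed mood"},
    "Substance use & Addictive Disorder": {"opiates", "withdrawal", "drug abuse"},
}

scores = normalized_hit_scores("depression after opiates withdrawal", TOY_LEXICON)
best_label = max(scores, key=scores.get)
```

A post would then be relabeled from its subreddit label to the DSM-5 category with the highest score, which is one plausible reading of the input and output shown on the slide.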
  29. 29. Reddit → DSM-5 Workflow. SEDO: Semantic Encoding and Decoding Optimization, a procedure to modulate the word embedding (vector) of a word. [Diagram labels: Reddit with DSM-5 labels; Word Embedding Model; Correlation Matrix (Q) over word vectors; Medical Knowledge Bases; Domain Experts; DSM-5 Lexicon; DAO; Correlation Matrix (P) over DSM-5 Lexicon or DAO; Cross Correlation Matrix (Z) between word vectors and DSM-5 Lexicon or DAO; SEDO: optimize P, Q and Z; DSM-5 Vocabulary Matrix; Word-modulated Word Embeddings; Linguistic Features; DSM-5 Classification]
  30. 30. Reddit → DSM-5: Semantic Encoding and Decoding Optimization. [Diagram labels: R, Reddit Word Embedding Model (12808 words, 300-dimension embedding); D, DSM-5-DAO Lexicon (20 DSM-5 categories, 300-dimension embedding); W; Solvable Sylvester Equation]
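The slide states only that the optimization leads to a solvable Sylvester equation. For reference, the generic form of such an equation is sketched below; the transcript does not say which combinations of R and D play the roles of A, B, and C, so those three symbols are placeholders and not the authors' notation.

```latex
% Generic Sylvester equation in the unknown W.
% A is m x m, B is n x n, and both W and C are m x n (placeholder symbols).
\[
  A W + W B = C
\]
% Equivalent linear system, using column-stacking vec(.) and the Kronecker product:
\[
  \left( I_n \otimes A + B^{\top} \otimes I_m \right) \operatorname{vec}(W) = \operatorname{vec}(C)
\]
% A unique solution W exists exactly when A and -B share no eigenvalue.
```

Because the system is linear in W, it can be solved in closed form by standard numerical routines instead of by iterative gradient descent.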
  31. 31. Reddit → DSM-5: Semantic Encoding and Decoding Optimization. Encoding: DSM-5 to Reddit embedding space. Decoding: Reddit to DSM-5 embedding space.
  32. 32. Reddit → DSM-5 Outcome: Domain-specific knowledge lowers false alarm rates.
  33. 33. Resources. TwADR and AskaPatient Lexicon: https://zenodo.org/record/55013#.XsYEH8YpBQI Ref: Limsopatham, Nut, and Nigel Collier. "Normalising medical concepts in social media texts by learning semantic representation." Association for Computational Linguistics, 2016. Suicide-Risk Severity Lexicon: https://bit.ly/SRS_lexicon Ref: Gaur, Manas, Amanuel Alambo, Joy Prakash Sain, Ugur Kurşuncu, Krishnaprasad Thirunarayan, Ramakanth Kavuluru, Amit Sheth, Randy Welton, and Jyotishman Pathak. "Knowledge-aware assessment of severity of suicide risk for early intervention." In The World Wide Web Conference, 2019. DSM-5 and Drug Abuse Ontology Lexicon: https://bit.ly/DSM5_DAO Ref: Gaur, Manas, Ugur Kurşuncu, Amanuel Alambo, Amit Sheth, Raminta Daniulaityte, Krishnaprasad Thirunarayan, and Jyotishman Pathak. "'Let Me Tell You About Your Mental Health!' Contextualized Classification of Reddit Posts to DSM-5 for Web-based Intervention." In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, 2018. Suicide Risk Severity Dataset (Reddit): https://zenodo.org/record/2667859#.XsYH7MYpBQI Ref: Gaur, Manas, Amanuel Alambo, Joy Prakash Sain, Ugur Kurşuncu, Krishnaprasad Thirunarayan, Ramakanth Kavuluru, Amit Sheth, Randy Welton, and Jyotishman Pathak. "Knowledge-aware assessment of severity of suicide risk for early intervention." In The World Wide Web Conference, 2019.
  34. 34. K-IL: Where are we? And what’s happening? ● Current research on fusing background knowledge and deep learning focuses on: ○ Shallow Infusion ○ Semi-Deep Infusion ● Explainable AI in healthcare falls short in involving medical knowledge graphs ● In Intelligent Virtual Assistants: ○ User engagement is a huge challenge ○ Requires a Personalized Health Knowledge Graph ○ Motivational Interviewing: Open Question, Reflective Listening, and Summary ● K-IL Healthcare + X ○ Autonomous Driving Vehicle [1] ○ Cyber Social Threats [2] ○ Disaster Resilience System ○ Personal Finance. https://www.gartner.com/en/research/methodologies/gartner-hype-cycle [1] Wickramarachchi, Ruwan, Cory Henson, and Amit Sheth. "An evaluation of knowledge graph embeddings for autonomous driving data: Experience and practice." arXiv preprint arXiv:2003.00344 (2020). [2] Kursuncu, Ugur, Manas Gaur, Carlos Castillo, Amanuel Alambo, Krishnaprasad Thirunarayan, Valerie Shalin, Dilshod Achilov, I. Budak Arpinar, and Amit Sheth. "Modeling Islamist Extremist Communications on Social Media using Contextual Dimensions: Religion, Ideology, and Hate." Proceedings of the ACM on Human-Computer Interaction, CSCW (2019).
  35. 35. Acknowledgement: Dr. Amit P. Sheth, Dr. Thirunarayan Krishnaprasad, Dr. Valerie L. Shalin, Dr. Jyotishman Pathak, Dr. Ugur Kurşuncu
  36. 36. http://kiml2020.aiisc.ai/ http://aiisc.ai/
