Exploiting Multimodal Information for Machine Intelligence and Natural Interactions


http://iwma.lnmiit.ac.in/speakers.html
Third International Workshop on Multimedia Applications (IWMA), March 02-06, 2021.

The Holy Grail of machine intelligence is the ability to mimic the human brain. In computing, we have created silos in dealing with each modality (text/language processing, speech processing, image processing, video processing, etc.). However, the human brain’s cognitive and perceptual capability to seamlessly consume (listen and see) and communicate (writing/typing, voice, gesture) multimodal (text, image, video, etc.) information challenges machine intelligence research. Emerging chatbots for demanding health applications present the requirements for these capabilities. To support the corresponding data analysis and reasoning needs, we have explored a pedagogical framework consisting of semantic computing, cognitive computing, and perceptual computing. In particular, we have been motivated by the brain’s amazing perceptive power that abstracts massive amounts of multimodal data by filtering and processing them into a few concepts (representable by a few bits) to act upon. From the information processing perspective, this requires moving from syntactic and semantic big data processing to actionable information that can be woven naturally into human activities and experience.
Exploration of the above research agenda, including powerful use cases, is afforded by a growing number of emerging technologies and their applications, such as chatbots and robotics for healthcare. In this talk, I will provide these examples and share the early progress we have made towards building health chatbots that consume contextually relevant multimodal data and support different forms/modalities of interaction to achieve various alternatives for digital health. I will also demonstrate the strong role of domain knowledge and personalization using domain and personalized knowledge graphs as part of various reasoning and learning techniques.



Exploiting Multimodal Information for Machine Intelligence and Natural Interactions

  1. EXPLOITING MULTIMODAL INFORMATION FOR MACHINE INTELLIGENCE AND NATURAL INTERACTIONS. Dr. Amit Sheth (amit.aiisc.ai), Director of AI Institute (#AIISC, aiisc.ai), University of South Carolina. International Workshop on Multimedia Applications (IWMA 2021), 3 March 2021, LNM Institute of Information Technology, Jaipur, India.
  2. AIISC in core AI areas, and interdisciplinary AI/AI applications.
  3. OUTLINE ● Human perception of the real world is multimodal; that is, our brain seamlessly processes data in various modalities (text, speech, and visual). ● Multimodal information is essential: together, the modalities provide nuances that a single modality can’t. Human communication is intrinsically multimodal, e.g., speech + expression + gestures. ● For a machine to attain intelligence, it requires a comprehensive understanding of the environment it is in. And to interact naturally with humans, a machine needs to understand the data it consumes. ● This talk will focus on different data modalities and examples of how a machine (chatbot) can use such information to provide intelligent assistance and natural communication in the health domain. Credit: https://aiisc.ai/people Revathy, Joey, Kaushik
  4. Machine-centric to Human-centric Computing: Artificial Intelligence (John McCarthy), Ambient Intelligence (Mark Weiser), Augmenting Human Intellect (Douglas Engelbart), Human-Computer Symbiosis (Joseph C.R. Licklider), Computing for Human Experience. Figure: Views along the spectrum of machine-centric to human-centric computing. At the far right is our work on Computing for Human Experience, which explores paradigms such as Semantic, Cognitive, and Perceptual Computing. http://bit.ly/SCP-Magazine. AI Institute: http://bit.ly/k-Che, http://slidesha.re/k-che
  5. Using Chatbots to Go Beyond Traditional Patient-Doctor Consultation. Decision-making factors: socioeconomic, demographic, family & social, psychological, environmental, genetic susceptibility (Source: Why do people consult the doctor? - Stephen M Campbell and Martin O Roland). Can voice assistant (chatbot) technology substantially improve monitoring of a patient’s conditions and needs? Simple Tasks: ● Appointment scheduling ● Information retrieval ● Scripted automation. Complex & Demanding Tasks: ● Multimodal input and output ● Natural communication ● Augmented Personalized Health (serving different levels of health needs). Contextualization, Personalization, Abstraction. Different modalities of data: Images, Text, Speech, Videos, IoTs.
  6. VOICE ASSISTANTS (Source: wired.com and Medium)
  7. Paradigms that Shape Human Experience. A machine may recognize the picture as “a woman is coughing”. As humans, we immediately conjecture and relate it to many phenomena in different contexts. Semantic association (label the picture as coughing); Cognitive (look at additional background information and interpret it in different contexts, e.g., cough vs. wheezing cough); Perception (has the patient’s condition worsened? How well is the patient doing?). Figure source: https://www.aarp.org/health/conditions-treatments/info-2017/bronchitis-and-pneumonia-symptoms.html
  8. AUGMENTED PERSONALIZED HEALTH: EXPLOITING MULTIMODAL INFORMATION FOR: SELF-MONITORING - constant and remote monitoring of disease-specific health indicators for any given patient. SELF-APPRAISAL - interpretation of the collected data with respect to the disease context so patients can evaluate themselves. SELF-MANAGEMENT - identify deviations from normal and assist patients in getting back to the prescribed care plan. INTERVENTION - change in the care plan; with the smart data produced by APH, provide decision support for treatment adjustments. DISEASE PROGRESSION AND TRACKING - longitudinal data collection and analysis to enhance the patient’s health over time.
  9. “The Holy Grail of machine intelligence is the ability to mimic the human brain. However, the human brain’s cognitive and perceptual capability to seamlessly consume and abstract massive amounts of multimodal data, and communicate information, challenges machine intelligence research. A growing number of emerging technologies, such as chatbots and robotics, present the requirements for these capabilities.”
  10. What is a Modality? GENERAL: A particular mode in which something exists or is experienced or expressed; a particular form of sensory perception: ‘the visual and auditory modalities’. HEALTHCARE: Modality (medical imaging) is a type of equipment used to acquire structural or functional images of the body, such as radiography, ultrasound, nuclear medicine, computed tomography, magnetic resonance imaging, and visible light. IN HCI: A modality is the classification of a single independent channel of sensory input/output between a computer and a human. Multiple modalities can be used in combination to provide complementary methods that may be redundant but convey information more effectively.
  11. Machine Intelligence for Chatbots: Incorporating Multiple Streams & Modalities. Figure: Chatbot exploiting multimodal information for machine intelligence and natural interactions. From a simple informational interface (text, speech) to an intelligent assistant.
  12. USE CASES & PROTOTYPES: Examples of collaborative projects @ AI Institute
  13. Health-Related Studies at AI Institute [Overview]. Health challenges (also dementia, obesity, Parkinson’s, liver cirrhosis, ADHF), spanning public policy/population epidemiology to personalized health. kHealth (Asthma in Children, Bariatric Surgery, Nutrition): Physical (IoT)/Cyber/Social (PCS) + EMR + Multimodal (Speech + Image). Marijuana and Drug Abuse: Social. Mental Health (Depression & Suicide): Social + Public + EMR. Health Knowledge Graph Services: Social + Clinical Data. ...and infrastructure technologies: context-aware KR (SP), KG development, Smart Data from PCS Big Data, Twitris.
  14. HCI: Applications & Chatbots @ AI Institute. 3 Chatbots (Alpha Stage): 1. NOURICH: a Google Assistant based conversational nutrition management system; 2. Knowledge-enabled (kHealth) personalized chatbot for asthma: contextualized & personalized conversations involving multimodal data (IoT & devices); 3. ReaCTrack: personalized adverse-reaction conversation-based tracker for clinical depression. Active (subset) healthcare projects @ KNO.E.SIS with mApps/chatbot: kHealth Asthma, kHealth Nutrition, Mental Health. kHealth Framework: a knowledge-enabled semantic platform that captures the data and analyzes it to produce actionable information.
  15. Modality of Data in Active Healthcare Projects at AI Inst. (subset). kHealth Asthma - Physical-Cyber-Social (PCS) data for monitoring asthma control and predicting vulnerability: mobile app Q/A (tablet), forced exhaled volume in 1 sec (FEV1), peak expiratory flow (PEF), indoor temperature, indoor humidity, particulate matter, volatile organic compounds, carbon dioxide, air quality index, pollen level, outdoor temperature, outdoor humidity, number of steps, heart rate, and number of hours of sleep; also clinical notes. Nutrition - for nutrition tracking and diet monitoring: Q/A, diet, food profile, food images, nutrition knowledge bases, user knowledge graph. Mental Health - modeling social behavior for healthcare utilization in mental health: Q/A, social media profile (Twitter, Reddit).
  16. Modalities in Select mApps
  17. Use Case 1: ASTHMA. Many sources of highly diverse data (& collection methods: active + passive): up to 1852 data points/patient/day. kBot with screen interface for conversation: Images, Text, Speech. *(Asthma-Obesity) ★ Episodic to continuous monitoring ★ Clinician-centric to patient-centric ★ Clinician-controlled to patient-empowered ★ Disease-focused to wellness-focused ★ Sparse data to multimodal big data
  18. Data Collection: >150 patients, 29 parameters, 1852 data points per patient per day, 63% kit compliance, ages 5-17 years, 1 or 3 months of monitoring. ● Data collection: since Dec 2016 ● Active sensing: 18 data points/day (peak flow meter and tablet) ● Passive sensing: 1834 data points/day (Foobot, Fitbit, outdoor environmental data)
  19. Utkarshani Jaimini, Krishnaprasad Thirunarayan, Maninder Kalra, Revathy Venkataramanan, Dipesh Kadariya, Amit Sheth, “How Is My Child’s Asthma?” Digital Phenotype and Actionable Insights for Pediatric Asthma”, JMIR Pediatr Parent 2018;1(2):e11988, DOI: 10.2196/11988.
  20. Revathy Venkataramanan, Krishnaprasad Thirunarayan, Utkarshani Jaimini, Dipesh Kadariya, Hong Yung Yip, Maninder Kalra, Amit Sheth, “Determination of Personalized Asthma Triggers From Multimodal Sensing and a Mobile App”, JMIR Pediatr Parent 2019;2(1):e14300, DOI: 10.2196/14300.
  21. Use Case 2: NOURICH (diet management chatbot)
  22. NUTRITION CHATBOT. Inputs: image, voice, text. Processing engine: image/voice/text to keyword. Data sources: user-specific (food allergies, comorbidities, lab reports including genetic profiles, etc.) and healthy/must-eat food-specific. Personalized knowledge base; knowledge graph for domain knowledge. Data collected: meal name, nutrition, ingredients, cooking style. DASHBOARD: personalized diet score (e.g., added sugar for diabetic/non-diabetic users), calories and constituents, food trend, weight trend. A minimal sketch of this pipeline follows.
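As a toy sketch of the flow described on this slide, the code below reduces any input modality to a food keyword, looks it up in a tiny in-memory stand-in for the food knowledge base, and computes a personalized diet score. The function and data names (`to_keyword`, `FOOD_KB`, `diet_score`) are hypothetical illustrations, not the actual NOURICH components:

```python
# Toy sketch of a NOURICH-style pipeline. All names here are illustrative
# stand-ins; a real deployment would plug in speech-to-text and a food-image
# classifier where the stubs are.
from dataclasses import dataclass

@dataclass
class MealRecord:
    meal_name: str
    nutrition: dict      # e.g., {"calories": 150, "added_sugar_g": 1}
    ingredients: list

# Tiny stand-in for the domain knowledge graph: meal name -> facts.
FOOD_KB = {
    "doritos": MealRecord("doritos", {"calories": 150, "added_sugar_g": 1}, ["corn", "oil"]),
    "vanilla frosting": MealRecord("vanilla frosting", {"calories": 140, "added_sugar_g": 19}, ["sugar"]),
}

def to_keyword(modality: str, payload) -> str:
    """Map raw image/voice/text input to a food keyword (stubbed here)."""
    if modality == "text":
        return payload.strip().lower()
    # Image/voice recognition would be wired in at this point.
    raise NotImplementedError(f"{modality} recognition is not part of this sketch")

def diet_score(record: MealRecord, user_profile: dict) -> float:
    """Personalized score: penalize added sugar more heavily for diabetic users."""
    weight = 2.0 if user_profile.get("diabetic") else 1.0
    return -weight * record.nutrition.get("added_sugar_g", 0)

user = {"diabetic": True, "allergies": ["peanut"]}
meal = FOOD_KB[to_keyword("text", "Vanilla Frosting")]
print(meal.meal_name, diet_score(meal, user))  # vanilla frosting -38.0
```

The design point is the separation of concerns: modality-specific recognition reduces every input to a keyword, while all personalization lives in the scoring step against the user profile.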
  23. Chatbots for Healthcare: KNO.E.SIS Overview
  24. Use Case 3: kBot Elder Care. An intelligent assistant to engage the elderly with Heart Failure (HF), Chronic Obstructive Pulmonary Disease (COPD), or Type 2 Diabetes Mellitus (T2DM).
  25. Use Case 4: Disaster Management
  26. “To support the corresponding (chatbot) data analysis and reasoning needs, we have explored a pedagogical framework consisting of Semantic Computing, Cognitive Computing, and Perceptual Computing. This requires moving from syntactic and semantic big data processing to actionable information that can be woven naturally into human activities and experience.”
  27. SEMANTIC-COGNITIVE-PERCEPTUAL COMPUTING: Knowledge-Infused AI with Contextualization (Knowledge Graphs), Personalization & Abstraction
  28. Figure: Health Knowledge Graph pipeline. Extraction (entity, complex entity, relation, event, intent, severity; e.g., aberrant drug-related behaviour, neuro-cognitive symptoms, adverse drug reactions) from personal sensor data, de-identified EMRs, and blog posts; data integration and interlinking; context representation and relevant subgraph selection; supporting semantic browsing, semantic search, visualization, and a disease-specific chatbot over an Open Health Knowledge Graph.
  29. SOCIAL-MEDIA TEXT (July 12, 2016) and EVENT-SPECIFIC SCHEMA-BASED KNOWLEDGE
  30. Evolving Patient Health Knowledge Graph (PHKG). Figure: A healthcare assistant bot interacts with the patient via various conversational interfaces (voice, text, and visual) to acquire and disseminate information and provide recommendations (validated by a physician). The core functionalities of the chatbot (Component C, boxed in blue) are extended with a background HKG (Component A, boxed in green) and an evolving PKG (Component B, boxed in orange). ★ Smarter & more engaging agent ★ Minimize active sensing (questions to be asked) ★ Ask only informed & intelligent questions ★ Relevant & contextualized conversations ★ Personalized & human-like
  31. How a PHKG Evolves Over Time. Inputs: personal sensor data, electronic medical records (EMR), kHealth project (IoT) datasets (e.g., asthma, obesity, Parkinson’s). Reasoning mechanisms enriching the KG (AI Inst., AlchemyAPI): an in-built rule-based inference engine and machine learning, analyzing datasets, executing reasoning, and updating the KG with more triples. Ontology catalogs: ● BioPortal ● Linked Open Vocabularies (LOV) ● Linked Open Vocabularies for Internet of Things (LOV4IoT). Linked Open Data (LOD): ● UMLS ● SNOMED-CT ● ICD-10 ● Clinical Trials ● SIDER. Figure: How a PHKG evolves with multimodal information.
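To make the triple-level updates concrete, here is a minimal sketch using rdflib; the namespace, property names, and the toy inference rule are illustrative assumptions, not the actual kHealth vocabularies:

```python
# Minimal sketch of enriching a Personalized Health Knowledge Graph (PHKG)
# with new triples as EMR facts and sensor readings arrive. Uses rdflib;
# the namespace and properties below are illustrative only.
from rdflib import Graph, Namespace, Literal, RDF

EX = Namespace("http://example.org/phkg/")  # hypothetical namespace

g = Graph()
g.bind("ex", EX)

patient = EX["patient1"]
g.add((patient, RDF.type, EX.Patient))

# Fact from the EMR: diagnosed condition.
g.add((patient, EX.hasCondition, EX.Asthma))

# Facts from passive sensing: today's pollen exposure reading.
g.add((patient, EX.exposedTo, EX.RagweedPollen))
g.add((EX.RagweedPollen, EX.level, Literal("high")))

# A toy rule-based enrichment step: if the patient has asthma and is
# exposed to a high-level trigger, assert elevated risk as a new triple.
for p, _, trig in list(g.triples((None, EX.exposedTo, None))):
    if (p, EX.hasCondition, EX.Asthma) in g and (trig, EX.level, Literal("high")) in g:
        g.add((p, EX.atRisk, Literal(True)))

print(g.serialize(format="turtle"))
```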
  32. GENERIC CHATBOT VS. INTELLIGENT CHATBOT. Needed for machine intelligence and natural interactions: contextualization, personalization, and abstraction.
  33. Contextualization and Personalization. kBot initiates a greeting conversation; understands the patient’s health condition (allergic reaction to high ragweed pollen levels) via the personalized patient knowledge graph generated from EMR, PGHD, and prior interactions with kBot; and generates predictions or recommended courses of action. Inference is based on the patient’s historical records and a background health knowledge graph containing contextualized (domain-specific) knowledge. Figure: Example kBot conversation that utilizes the background health knowledge graph and the patient’s knowledge graph to infer and generate recommendations for patients. ★ Conveying only information relevant to the patient. Context is enabled by relevant healthcare knowledge, including clinical protocols.
  34. Contextualization refers to data interpretation in terms of knowledge (context). Without Domain Knowledge vs. With Domain Knowledge: a chatbot with domain (drug) knowledge is potentially more natural and able to deal with variations.
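As a toy illustration of this point, the sketch below interprets user text against domain (drug) knowledge so that surface variations resolve to one canonical concept; the lexicon entries are illustrative, not a real drug knowledge base:

```python
# Toy sketch of contextualization: interpreting user text in terms of
# domain (drug) knowledge so that slang variants map to the same concept.
# The lexicon below is illustrative only.

DRUG_LEXICON = {
    "xanax": "alprazolam",
    "xannies": "alprazolam",   # slang variant handled via domain knowledge
    "bars": "alprazolam",
    "oxy": "oxycodone",
}

def contextualize(utterance: str):
    """Return (surface form, canonical concept) pairs found in the text."""
    return [(tok, DRUG_LEXICON[tok])
            for tok in utterance.lower().split()
            if tok in DRUG_LEXICON]

# Without domain knowledge the chatbot sees unrelated strings; with it,
# "xannies" and "bars" both resolve to the same drug concept.
print(contextualize("took two xannies and some bars last night"))
# -> [('xannies', 'alprazolam'), ('bars', 'alprazolam')]
```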
  35. Personalization refers to tailoring the future course of action by taking into account contextual factors such as the user’s health history, physical characteristics, environmental factors, activity, and lifestyle. Without Contextualized Personalization vs. With Contextualized Personalization: a chatbot with contextualized (asthma) knowledge is potentially more personalized and engaging.
  36. Abstraction: a computational technique that maps and associates raw data to action-related information. Without Abstraction vs. With Abstraction.
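A small sketch of abstraction in the asthma setting follows; the thresholds and categories are made up for illustration and are not clinical guidance:

```python
# Toy sketch of abstraction: collapsing many raw data points into a single
# action-related concept. The thresholds below are illustrative assumptions,
# not clinical values.

def abstract_asthma_day(pef_readings, pollen_level, night_coughs):
    """Map a day's raw signals to one of a few actionable control levels."""
    avg_pef = sum(pef_readings) / len(pef_readings)
    if avg_pef < 250 or night_coughs > 3:
        return "poorly controlled: contact clinician"
    if pollen_level == "high" and avg_pef < 320:
        return "at risk: follow trigger-avoidance plan"
    return "well controlled: continue care plan"

# ~1852 raw points/day (slide 18) reduce to one concept the patient can act on.
print(abstract_asthma_day([310, 300, 295], "high", night_coughs=1))
# -> "at risk: follow trigger-avoidance plan"
```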
  37. Smarter Chatbot with Semantically-Abstracted Information (data sophistication: smarter data). Example of abstraction: smart (semantically abstracted) data should answer: ★ What causes my disease severity? ★ How well am I doing with respect to the prescribed care plan? ★ Am I deviating from the care plan? I am following the care plan but my disease is not well controlled. ★ Do I need treatment adjustments? ★ How well controlled is my disease over time?
  38. Semantic, Cognitive, Perceptual Computing: Paradigms That Shape Human Experience (http://bit.ly/SCPComputing). Humans are interested in high-level concepts (phenotypic characteristics). Semantic Computing: assign labels and associate meanings (representation & contextualization). Cognitive Computing: interpretation of data with respect to perspectives, constraints, domain knowledge, and personal context. Perceptual Computing: a cyclical process of semantic-cognitive computing for higher levels of perception and reasoning (abstraction & action).
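One way to read the three paradigms operationally is as a perception-action loop; the following is a schematic sketch with placeholder functions, not the actual architecture:

```python
# Schematic sketch of the semantic-cognitive-perceptual cycle on this slide.
# All functions are placeholders; the point is the control flow: label
# (semantic) -> interpret in context (cognitive) -> act and re-observe
# (perceptual), repeated until an actionable abstraction emerges.

def semantic_label(raw_observation):
    return {"concept": "coughing"}          # assign labels / associate meaning

def cognitive_interpret(label, background_knowledge, personal_context):
    # Interpret with respect to domain knowledge and the individual.
    if personal_context.get("condition") == "asthma":
        return {"hypothesis": "wheezing cough", "needs_more_data": True}
    return {"hypothesis": "common cough", "needs_more_data": False}

def perceive(observations, knowledge, context, max_cycles=3):
    for obs in observations[:max_cycles]:
        interpretation = cognitive_interpret(semantic_label(obs), knowledge, context)
        if not interpretation["needs_more_data"]:
            return interpretation["hypothesis"]
        # Perceptual computing: act (ask a question, request a reading),
        # then cycle back through the semantic and cognitive steps.
    return "escalate: has the patient's condition worsened?"

print(perceive(["audio-1", "audio-2"], knowledge={}, context={"condition": "asthma"}))
```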
  39. Knowledge-Infused Learning with the Semantic, Cognitive, Perceptual Computing Framework. THE BABY STEPS: MACHINE/DEEP LEARNING INFUSED WITH A PERSONALIZED HEALTH KNOWLEDGE GRAPH. Knowledge: domain (ontology) + personalized HKG. Multisensory sensing & multimodal data interactions: images, text, speech, videos, IoTs. Natural language processing, machine and deep learning. AUGMENTED PERSONALIZED HEALTH (APH): modeling the broader disease context and personalized user behavior; a reasoning & decision-making framework to achieve ABSTRACTION and minimize data overload, assisting in making choices, appraisal, and recommendations.
  40. Use case: Personalized Health Agent using KiRL for Mental Health Self-Management
  41. Knowledge-Infused Reinforcement Learning: knowledge + patient context + patient feedback. Expert-designed schema for the PKG: lives(Patient, ?), has_family(Patient, Family, ?), family_location(Patient, Family, ?), visit_frequency(Patient, Family, ?). Relational facts from the PKG: lives(patient1, "Los Angeles"), has_family(patient1, family1, "True"), family_location(patient1, family1, "Denver"), visit_frequency(patient1, family1, "high"). High-level tasks: a) reminding-clarification, b) information-gathering, c) appraisal, d) symptom check, e) facilitate communication with the healthcare provider / connect to a professional. Caption: The relational context is derived from the PKG along with the schema; in combination with the patient’s feedback and domain knowledge (e.g., depression, sadness, suffering), the Knowledge-Infused Reinforcement Learning algorithm outputs a high-level recommendation.
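A minimal sketch of the idea follows, with a plain epsilon-greedy bandit standing in for the actual KiRL algorithm; the PKG facts, task list, knowledge prior, and reward values are illustrative assumptions:

```python
# Minimal sketch of knowledge-infused action selection for the health agent.
# An epsilon-greedy bandit stands in for the actual KiRL algorithm; PKG
# facts, tasks, and rewards below are illustrative assumptions only.
import random

# Relational facts from the PKG (mirroring the slide's schema).
PKG = {
    ("lives", "patient1"): "Los Angeles",
    ("has_family", "patient1", "family1"): True,
    ("family_location", "patient1", "family1"): "Denver",
    ("visit_frequency", "patient1", "family1"): "high",
}

TASKS = ["reminding-clarification", "information-gathering", "appraisal",
         "symptom-check", "connect-to-professional"]

q = {t: 0.0 for t in TASKS}   # estimated value of each high-level task
n = {t: 0 for t in TASKS}     # selection counts

def knowledge_prior(task):
    """Knowledge infusion: bias tasks the PKG context makes relevant, e.g.,
    frequent family travel suggests checking in on symptoms."""
    if task == "symptom-check" and PKG[("visit_frequency", "patient1", "family1")] == "high":
        return 0.5
    return 0.0

def choose_task(eps=0.1):
    if random.random() < eps:
        return random.choice(TASKS)
    return max(TASKS, key=lambda t: q[t] + knowledge_prior(t))

def update(task, patient_feedback):
    """Patient's answers/feedback act as the reward signal (editor's notes)."""
    n[task] += 1
    q[task] += (patient_feedback - q[task]) / n[task]

task = choose_task()
update(task, patient_feedback=1.0)   # e.g., the patient found the prompt helpful
print(task, q[task])
```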
  42. In short, ❖ Multimodal information is essential and can be exploited for machine intelligence and natural interactions. ❖ Knowledge-infused learning could give us the power needed to match complex requirements. ❖ Semantic-Cognitive-Perceptual Computing enables contextualization, personalization, and abstraction for Augmented Personalized Health.
  43. AI Institute: 5 faculty, >12 PhD students, several Master’s students, >5 undergrads, 2 post-docs, >10 research interns. Alumni in industry: IBM T.J. Watson, Almaden, Amazon, Samsung America, LinkedIn, Facebook, Bosch. Start-ups: AppZen, AnalyticsFox, Cognovi Labs. Faculty: George Mason, University of Kentucky, Case Western Reserve, North Carolina State University, University of Dayton. Core AI: neuro-symbolic computing/hybrid AI, knowledge graph development, deep learning, reinforcement learning, natural language processing, knowledge-infused learning (for deep learning and NLP), multimodal AI (including IoT/sensor data streams, images), collaborative assistants, multiagent systems (incl. coordinating systems of decision-making agents including humans, robots, sensors), semantic-cognitive-perceptual computing, brain-inspired computing, interpretation/explainability/trust/ethics in AI systems, search, gaming. Interdisciplinary AI and application domains: medicine/clinical, biomedicine, social good/harm, public health (mental health, addiction), education, manufacturing, disaster management.

Editor's Notes

  • Slide 3: Inner circle: our research areas and strengths
  • Convey the move from simple tasks to complicated ones; it is not simple: there are many issues, including data, different modalities, context, and personalization
  • Growing ecosystem of chatbots
    Chatbot as intermediary patient <-> doctor
    Take the example of elderly care: rather than serving as just a basic voice interface, a chatbot should consume (like a human)
    different streams and modalities of data (textual data, voice & speech data, images, and background knowledge of the patient) to be able to assist an elderly person intelligently.
  • JMIR Paper
  • voice by libertetstudio from the Noun Project
    text by Vectorstall from the Noun Project

    Dye info -
    Doritos
    https://ndb.nal.usda.gov/ndb/foods/show/45366963?fgcd=&manu=&format=&count=&max=25&offset=&sort=default&order=asc&qlookup=doritos&ds=&qt=&qp=&qa=&qn=&q=&ing=
    Vanilla frosting - https://ndb.nal.usda.gov/ndb/foods/show/45122774?fgcd=&manu=&format=&count=&max=25&offset=&sort=default&order=asc&qlookup=DUNCAN+HINES%2C+WHIPPED+FROSTING%2C+VANILLA%2C+UPC%3A+644209405923&ds=&qt=&qp=&qa=&qn=&q=&ing=
  • Step 1: Personalized information from the clinician visit in the discharge summary, together with an expert-designed initial set of questions, is compiled into a personalized knowledge graph stored in the cloud.
  • Step 2: The knowledge from the PKG stored in the cloud is infused into the RL method to predict high-level chatbot tasks. The cloud is monitored for safety by the clinician; the patient’s answers/feedback act as rewards.
  • Step 3: The high-level task is used to generate dialogue with the patient, updates to the PKG are made appropriately, and this process continues for the length of their interactions.