"Multimodal Question Answering in the Medical Domain". Invited talk at the Language Technologies Institute (LTI), Carnegie Mellon University (CMU).
Dr. Asma Ben Abacha.
April 24, 2020.
This document provides an overview of transformers in computer vision. It discusses how transformers were originally developed for natural language processing, using attention mechanisms instead of recurrent connections. Vision transformers apply this approach to images by treating patches as tokens and running self-attention over them. Early vision transformers achieved strong results on image classification tasks. Recent developments include Swin transformers, which compute self-attention within local windows and shift those windows between layers to allow cross-window connections, as well as hybrid models that combine convolutional and transformer architectures. Transformers are also being applied to video understanding tasks. The document surveys these transformer architectures and the applications of vision transformers.
(1) YOLO frames object detection as a single regression problem, predicting bounding boxes and class probabilities directly from full images in one step. (2) It resizes the input image and passes it through a convolutional network that outputs a grid of predictions containing bounding box coordinates, confidence scores, and class probabilities. (3) YOLO achieves real-time speeds while maintaining high average precision compared to other detection systems, with most errors coming from inaccurate localization rather than from predicting background or the wrong class.
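The grid-of-predictions idea in (2) can be sketched in a few lines. This is a toy decoder, not the paper's implementation: the random tensor stands in for a real network output, and the shapes (S=7, B=2, C=20) follow the original YOLO configuration.

```python
import numpy as np

# Toy decoder for a YOLO-style output grid: S x S cells, B boxes per cell,
# C shared class probabilities per cell.
S, B, C = 7, 2, 20
rng = np.random.default_rng(0)
preds = rng.random((S, S, B * 5 + C))  # per cell: B*(x, y, w, h, conf) + C class probs

def decode(preds, score_threshold=0.5):
    """Return (row, col, box, class_id, score) for confident predictions."""
    detections = []
    for i in range(S):
        for j in range(S):
            cell = preds[i, j]
            class_probs = cell[B * 5:]
            class_id = int(np.argmax(class_probs))
            for b in range(B):
                x, y, w, h, conf = cell[b * 5:(b + 1) * 5]
                # Class-specific score = box confidence * class probability.
                score = conf * class_probs[class_id]
                if score > score_threshold:
                    detections.append((i, j, (x, y, w, h), class_id, float(score)))
    return detections

dets = decode(preds)
```

A real pipeline would follow this with non-maximum suppression to merge overlapping boxes.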
Data Con LA 2022 - Transformers for NLP. Data Con LA.
Ash Pahwa, Instructor at Caltech.
The Transformer architecture was proposed by Google Brain in 2017 to process sequential data. Transformers can be used in both Natural Language Processing (NLP) and computer vision applications. The architecture is based on the concept of self-attention and has largely replaced RNN/LSTM architectures. Its major advantages are speed and bidirectionality: the input text is fed into the architecture in parallel, which allows faster processing. The leading language models BERT (Bidirectional Encoder Representations from Transformers) and GPT (Generative Pre-trained Transformer) are built on the Transformer architecture; BERT was proposed by Google and GPT-1/2/3 by OpenAI. The BERT language model is used in the Google search engine. The Hugging Face portal provides many popular transformers in different flavors. Transformers can be used for virtually all NLP applications, such as sentiment analysis, translation, auto-completion, named entity recognition, automatic question answering, and more. They can also generate artificial text that is difficult to distinguish from text written by humans. This talk briefly covers the theory of transformers, then focuses on how to fine-tune a standard transformer model (downloaded from the Hugging Face portal) for a specific application.
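The "fast and bidirectional" contrast can be made concrete with attention masks. The sketch below is illustrative only: a 1 means the token in that row may attend to the token in that column, and the whole sequence is processed as one matrix rather than step by step as in an RNN.

```python
import numpy as np

seq_len = 5

# Encoder-style (BERT): every token sees every other token,
# i.e. full bidirectional context.
bidirectional_mask = np.ones((seq_len, seq_len), dtype=int)

# Decoder-style (GPT): token i only sees tokens 0..i, so text
# generation stays strictly left-to-right.
causal_mask = np.tril(np.ones((seq_len, seq_len), dtype=int))
```

The parallelism comes from the fact that both masks are applied to a single matrix of pairwise scores, so all positions are computed at once.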
This document provides guidance on how to structure prompts to get the desired output from AI image generators. It recommends asking yourself questions about the type of image wanted, such as whether it is a photo or a painting, the subject, the desired details, the art style, and more. These answers can then be combined into a prompt sentence with keywords separated by commas to help steer the generation. The ordering and presentation of concepts is important for getting the intended results.
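A minimal sketch of that recipe, with made-up field names and answers, might look like this:

```python
# Assemble an image-generation prompt from the answers the document
# suggests collecting. Field names and values are illustrative.
answers = {
    "type": "photo",
    "subject": "a lighthouse on a rocky coast",
    "details": "stormy sky, crashing waves",
    "style": "long exposure, dramatic lighting",
}

# Order matters to many generators, so emit the fields in a fixed
# order with the image type and subject near the front.
order = ["type", "subject", "details", "style"]
prompt = ", ".join(answers[k] for k in order if answers.get(k))
# -> "photo, a lighthouse on a rocky coast, stormy sky, crashing waves, long exposure, dramatic lighting"
```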
The document describes the architecture of the four YOLOv5 object detection models of different sizes: small, medium, large, and extra large. Each model uses the same basic building blocks of Focus, convolutional, and Bottleneck CSP layers followed by upsampling and concatenation, but the variants differ in channel widths and in how many times layers are repeated, trading speed against accuracy and model size.
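The size variants are typically generated from one architecture via two scaling factors. The multiples below match the public YOLOv5 configs at the time of writing, but treat both the numbers and the helper functions as an illustrative sketch rather than the repository's exact code.

```python
import math

# Depth/width multiples for the four YOLOv5 variants (illustrative values).
MULTIPLES = {
    "s": {"depth": 0.33, "width": 0.50},
    "m": {"depth": 0.67, "width": 0.75},
    "l": {"depth": 1.00, "width": 1.00},
    "x": {"depth": 1.33, "width": 1.25},
}

def scale_channels(base_channels, width_multiple, divisor=8):
    """Scale a layer's channel count, rounded up to a multiple of `divisor`."""
    return int(math.ceil(base_channels * width_multiple / divisor) * divisor)

def scale_depth(base_repeats, depth_multiple):
    """Scale how many times a bottleneck block is repeated (at least once)."""
    return max(round(base_repeats * depth_multiple), 1)
```

For example, a 64-channel layer repeated 3 times in the large model becomes a 32-channel layer repeated once in the small model.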
A presentation about the development of the ideas from the autoencoder to the Stable Diffusion text-to-image model.
Models covered: autoencoder, VAE, VQ-VAE, VQ-GAN, latent diffusion, and stable diffusion.
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection. Taegyun Jeon.
The document summarizes the You Only Look Once (YOLO) object detection method. YOLO frames object detection as a single regression problem, directly predicting bounding boxes and class probabilities from full images in one pass, which allows extremely fast detection speeds of 45 frames per second. YOLO applies a single feedforward convolutional neural network to the full image, letting it leverage contextual information and predict bounding boxes and class probabilities for all classes with one network.
This document discusses research on human action recognition using skeleton data. It introduces issues with skeleton-based action recognition, such as variable scales, view orientations, noise and rate/intra-action variations. It then reviews previous work on skeleton-based action recognition using hand-crafted features and deep learning models. The document proposes two ensemble deep learning models called Ensemble TS-LSTM v1 and v2 that use temporal sliding LSTMs to capture short, medium and long-term dependencies from skeleton sequences for action recognition. Experimental results on standard datasets demonstrate the models outperform previous methods.
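The "temporal sliding" idea of feeding short-, medium- and long-term spans of the skeleton sequence to separate LSTMs can be sketched as a window-extraction step. Window lengths and strides below are made-up values, not those of the paper.

```python
import numpy as np

def sliding_windows(sequence, window_len, stride):
    """Slice a (T, D) sequence into overlapping (window_len, D) windows."""
    T = sequence.shape[0]
    return [sequence[s:s + window_len]
            for s in range(0, T - window_len + 1, stride)]

T, D = 60, 75  # e.g. 60 frames of 25 joints x 3 coordinates
seq = np.zeros((T, D))

short = sliding_windows(seq, window_len=10, stride=5)    # short-term spans
medium = sliding_windows(seq, window_len=30, stride=10)  # medium-term spans
long_ = sliding_windows(seq, window_len=60, stride=60)   # the whole sequence
```

Each group of windows would then be fed to its own LSTM, and the per-scale predictions combined in the ensemble.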
This document discusses session-based recommender systems. It begins by explaining why recommender systems are useful given the large number of choices people face online. It then defines recommender systems as tools that find relevant items for users based on feedback. The document outlines different types of feedback and techniques for building recommendations, including matrix factorization, item-to-item recommendations, and recurrent neural networks like GRU4Rec. It concludes by noting that session-based recommender systems are common in practice and the GRU4Rec code is openly available.
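Of the techniques listed, matrix factorization is the easiest to show in miniature: learn user and item vectors so that their dot products approximate observed feedback. The sizes, learning rate, and dense toy ratings below are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)
n_users, n_items, k = 20, 30, 4
R = rng.integers(1, 6, size=(n_users, n_items)).astype(float)  # toy ratings

P = 0.1 * rng.standard_normal((n_users, k))   # user factors
Q = 0.1 * rng.standard_normal((n_items, k))   # item factors
lr, reg = 0.01, 0.01

def sse():
    """Sum of squared reconstruction errors."""
    return float(((R - P @ Q.T) ** 2).sum())

before = sse()
for _ in range(50):                 # a few epochs of gradient descent
    E = R - P @ Q.T                 # residuals
    P += lr * (E @ Q - reg * P)     # move factors down the gradient
    Q += lr * (E.T @ P - reg * Q)
after = sse()
```

Session-based models such as GRU4Rec replace the static user vector with a recurrent state updated after each click in the session.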
https://imatge.upc.edu/web/publications/visual-question-answering-20
This bachelor's thesis explores different deep learning techniques to solve the Visual Question-Answering (VQA) task, whose aim is to answer questions about images. We study different Convolutional Neural Networks (CNN) to extract the visual representation from images: Kernelized-CNN (KCNN), VGG-16, and Residual Networks (ResNet). We also analyze the impact of using pre-computed word embeddings trained on large datasets (GloVe embeddings). Moreover, we examine different techniques for joining representations from different modalities. This work was submitted to the second edition of the Visual Question Answering Challenge and achieved 43.48% accuracy.
[Video recording available at https://www.youtube.com/playlist?list=PLewjn-vrZ7d3x0M4Uu_57oaJPRXkiS221]
Artificial Intelligence is increasingly playing an integral role in determining our day-to-day experiences. Moreover, with the proliferation of AI-based solutions in areas such as hiring, lending, criminal justice, healthcare, and education, the resulting personal and professional implications of AI are far-reaching. The dominant role played by AI models in these domains has led to a growing concern regarding potential bias in these models, and a demand for model transparency and interpretability. In addition, model explainability is a prerequisite for building trust in and adoption of AI systems in high-stakes domains requiring reliability and safety, such as healthcare and automated transportation, and in critical industrial applications with significant economic implications, such as predictive maintenance, exploration of natural resources, and climate change modeling.
As a consequence, AI researchers and practitioners have focused their attention on explainable AI to help them better trust and understand models at scale. The challenges for the research community include (i) defining model explainability, (ii) formulating explainability tasks for understanding model behavior and developing solutions for these tasks, and finally (iii) designing measures for evaluating the performance of models in explainability tasks.
In this tutorial, we present an overview of model interpretability and explainability in AI, key regulations / laws, and techniques / tools for providing explainability as part of AI/ML systems. Then, we focus on the application of explainability techniques in industry, wherein we present practical challenges / guidelines for effectively using explainability techniques and lessons learned from deploying explainable models for several web-scale machine learning and data mining applications. We present case studies across different companies, spanning application domains such as search & recommendation systems, hiring, sales, and lending. Finally, based on our experiences in industry, we identify open problems and research directions for the data mining / machine learning community.
Word embedding, vector space model, language modelling, neural language model, Word2Vec, GloVe, fastText, ELMo, BERT, DistilBERT, RoBERTa, Sentence-BERT (SBERT), Transformer, attention
This document summarizes Bumsoo Kim's presentation on deep convolutional generative adversarial networks (DCGANs) for unsupervised representation learning. The presentation introduces generative models, describes the DCGAN model architecture which uses an adversarial process between a generator and discriminator, and discusses evaluating and applying vector arithmetic to generated images.
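The vector arithmetic the DCGAN paper demonstrates operates on averaged latent codes. The sketch below is purely illustrative: the random vectors stand in for latent codes whose generated images were inspected and grouped by concept, and G() would be the trained generator.

```python
import numpy as np

rng = np.random.default_rng(0)
z_dim = 100

# Average the latent codes of a few samples per visual concept
# (e.g. "smiling woman", "neutral woman", "neutral man").
z_smiling_woman = rng.standard_normal((3, z_dim)).mean(axis=0)
z_neutral_woman = rng.standard_normal((3, z_dim)).mean(axis=0)
z_neutral_man = rng.standard_normal((3, z_dim)).mean(axis=0)

# "smiling woman" - "neutral woman" + "neutral man" ~= "smiling man"
z_result = z_smiling_woman - z_neutral_woman + z_neutral_man
# G(z_result) would then be decoded by the generator into an image.
```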
The Data Platform for Today’s Intelligent Applications. Neo4j.
The document discusses how graph technology and Neo4j's graph data platform are fueling data-driven transformations across industries by unlocking deeper insights from the relationships within data. It notes that 75% of Fortune 1000 companies had suppliers impacted by the pandemic, arguing that supply chain problems are really data problems. It then promotes Neo4j as the leader in the growing graph database market and discusses its capabilities and customers across industries such as insurance, banking, automotive, retail, and telecommunications.
In this video from the MIT Deep Learning Series, Lex Fridman presents: Deep Learning State of the Art (2020).
"This lecture is on the most recent research and developments in deep learning, and hopes for 2020. This is not intended to be a list of SOTA benchmark results, but rather a set of highlights of machine learning and AI innovations and progress in academia, industry, and society in general. This lecture is part of the MIT Deep Learning Lecture Series."
Watch the video: https://wp.me/p3RLHQ-lng
Learn more: https://deeplearning.mit.edu/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
http://imatge-upc.github.io/vqa-2016-cvprw/
This thesis studies methods to solve Visual Question-Answering (VQA) tasks with a deep learning framework. As a preliminary step, we explore Long Short-Term Memory (LSTM) networks used in Natural Language Processing (NLP) to tackle text-based question answering. We then modify the previous model to accept an image as an input in addition to the question. For this purpose, we explore the VGG-16 and K-CNN convolutional neural networks to extract visual features from the image. These are merged with the word embedding or with a sentence embedding of the question to predict the answer. This work was successfully submitted to the Visual Question Answering Challenge 2016, where it achieved 53.62% accuracy on the test dataset. The developed software follows best programming practices and Python code style, providing a consistent baseline in Keras for different configurations.
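The merge step the thesis describes can be sketched in NumPy. All dimensions and weights below are illustrative stand-ins (e.g. a 4096-d vector for a VGG-16 fc7 feature), not the thesis's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(1)
img_feat = rng.standard_normal(4096)   # e.g. a VGG-16 fc7 feature vector
q_embed = rng.standard_normal(300)     # e.g. a sentence embedding of the question

# Project both modalities to a shared dimensionality.
d = 512
W_img = 0.01 * rng.standard_normal((d, 4096))
W_q = 0.01 * rng.standard_normal((d, 300))
h_img = np.tanh(W_img @ img_feat)
h_q = np.tanh(W_q @ q_embed)

# Two common merge strategies: concatenation or element-wise product.
merged_concat = np.concatenate([h_img, h_q])   # shape (1024,)
merged_prod = h_img * h_q                      # shape (512,)
# A softmax classifier over candidate answers would consume either vector.
```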
Talk with Yves Raimond at the GPU Tech Conference on March 28, 2018 in San Jose, CA.
Abstract:
In this talk, we will survey how deep learning methods can be applied to personalization and recommendations. We will cover why standard deep learning approaches don't perform better than typical collaborative filtering techniques. Then we will go over recently published research at the intersection of deep learning and recommender systems, looking at how it integrates new types of data, explores new models, or changes the recommendation problem statement. We will also highlight some of the ways that neural networks are used at Netflix and how GPUs can be used to train recommender systems. Finally, we will highlight promising new directions in this space.
Introduction to Transformers for NLP - Olga Petrova. Alexey Grigorev.
Olga Petrova gives an introduction to transformers for natural language processing (NLP). She begins with an overview of representing words using tokenization, word embeddings, and one-hot encodings. Recurrent neural networks (RNNs) are discussed as they are important for modeling sequential data like text, but they struggle with long-term dependencies. Attention mechanisms were developed to address this by allowing the model to focus on relevant parts of the input. Transformers use self-attention and have achieved state-of-the-art results in many NLP tasks. Bidirectional Encoder Representations from Transformers (BERT) provides contextualized word embeddings trained on large corpora.
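The self-attention at the heart of the talk is a short computation: scaled dot-product attention, shown here as a single-head NumPy sketch with random weights standing in for learned parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over a (T, d) sequence."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # (T, T) pairwise relevance
    weights = softmax(scores, axis=-1)  # each row is a distribution over positions
    return weights @ V, weights         # weighted mix of value vectors

rng = np.random.default_rng(0)
T, d = 4, 8
X = rng.standard_normal((T, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
out, attn = self_attention(X, Wq, Wk, Wv)
```

Each output position is a mixture of all input positions, which is exactly how attention sidesteps the long-term-dependency problem of RNNs.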
YOLO (You Only Look Once) is a real-time object detection system that frames object detection as a regression problem. It uses a single neural network that predicts bounding boxes and class probabilities directly from full images in one evaluation. This approach allows YOLO to process images at over 45 frames per second while maintaining high accuracy compared to previous systems. YOLO was trained on natural images from PASCAL VOC and can generalize to new domains like artwork without significant degradation in performance, unlike other methods that struggle with domain shift.
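The localization errors mentioned across these YOLO summaries are measured with intersection-over-union (IoU), which is also the target for YOLO's confidence score. A minimal implementation:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)  # overlap area, 0 if disjoint
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)
```

For example, two unit-offset 2x2 boxes overlap in a 1x1 square, giving an IoU of 1/7.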
This presentation covers the following topics:
1. Video Classification as a sequence of frames
2. Video Classification as a sequence of frame-blocks
3. 2D ConvNets for Videos
4. CNN + LSTM
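The CNN + LSTM recipe in topic 4 can be sketched end to end in NumPy: per-frame CNN features are fed through an LSTM, and the final hidden state summarizes the clip. All weights here are random stand-ins for trained parameters.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_over_frames(frame_feats, W, U, b, h_dim):
    """Run a single-layer LSTM over (T, d) frame features; return the final h."""
    h = np.zeros(h_dim)
    c = np.zeros(h_dim)
    for x in frame_feats:              # one CNN feature vector per frame
        z = W @ x + U @ h + b          # all four gate pre-activations at once
        i, f, o, g = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        c = f * c + i * np.tanh(g)     # update the cell state
        h = o * np.tanh(c)             # expose the hidden state
    return h                           # clip-level representation

rng = np.random.default_rng(0)
T, d, h_dim = 16, 32, 8                # 16 frames of 32-d features
frames = rng.standard_normal((T, d))
W = 0.1 * rng.standard_normal((4 * h_dim, d))
U = 0.1 * rng.standard_normal((4 * h_dim, h_dim))
b = np.zeros(4 * h_dim)
clip_vec = lstm_over_frames(frames, W, U, b, h_dim)
# A softmax classifier over action classes would consume clip_vec.
```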
The RAPIDS suite of software libraries gives you the freedom to execute end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA® CUDA® primitives for low-level compute optimization, but exposes that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.
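Part of what makes the interfaces user-friendly is that cuDF, the RAPIDS DataFrame library, deliberately mirrors the pandas API. The pipeline below is written in pandas; on a machine with RAPIDS installed, a workload of this shape typically ports to the GPU by swapping the import for cuDF.

```python
import pandas as pd  # on a RAPIDS install: import cudf as pd

# A small groupby-aggregate pipeline using the pandas API that cuDF mirrors.
df = pd.DataFrame({
    "user": ["a", "a", "b", "b", "b"],
    "value": [1.0, 2.0, 3.0, 4.0, 5.0],
})
per_user = df.groupby("user")["value"].mean().reset_index()
```

On GPU, the same code runs against device memory, which is where the high-bandwidth-memory speedups come from.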
Describes the latest research in visual reasoning, in particular visual question answering, covering both images and videos, with a dual-process-theories approach and relational memory.
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.pdf. Po-Chuan Chen.
The document describes the RAG (Retrieval-Augmented Generation) model for knowledge-intensive NLP tasks. RAG combines a pre-trained language generator (BART) with a dense passage retriever (DPR) to retrieve and incorporate relevant knowledge from Wikipedia. RAG achieves state-of-the-art results on open-domain question answering, abstractive question answering, and fact verification by leveraging both parametric knowledge from the generator and non-parametric knowledge retrieved from Wikipedia. The retrieved knowledge can also be updated without retraining the model.
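RAG's retrieve-then-generate structure can be shown with a toy pipeline. The random embeddings below stand in for DPR encoders, and generate() is a placeholder for the BART generator; only the shape of the computation is real.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16
passages = ["passage 0", "passage 1", "passage 2", "passage 3"]
passage_embs = rng.standard_normal((len(passages), d))  # the non-parametric memory

def retrieve(query_emb, k=2):
    """Score passages by inner product and return the k most relevant."""
    scores = passage_embs @ query_emb
    top = np.argsort(scores)[::-1][:k]
    return [passages[i] for i in top]

def generate(query, retrieved):
    # Placeholder for the seq2seq generator (BART in the paper), which
    # conditions on the query concatenated with the retrieved passages.
    return f"answer({query} | {'; '.join(retrieved)})"

query_emb = rng.standard_normal(d)
answer = generate("who wrote X?", retrieve(query_emb))
```

The updatable-knowledge property follows from this split: replacing passages and passage_embs changes what is retrieved without touching the generator's weights.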
What you need to know about Generative AI and Data Management? Denodo.
Watch full webinar here: https://buff.ly/3UXy0A2
It should be no surprise that Generative AI will have a profound impact on data management in the years to come. As in other areas of the technology sector, GenAI will accelerate efforts around all aspects of data management, including self-service, automation, data governance, and security. On the other hand, it is also becoming clearer that to unleash the true potential of AI assistants powered by GenAI, we need novel implementation strategies and a reimagined data architecture. This presents an exhilarating yet challenging future, demanding innovative thinking and methodologies in data management.
Join us on this webinar to learn about:
- The opportunities and challenges presented by GenAI today.
- Exploiting GenAI to democratize data management.
- How to augment GenAI applications with corporate data and knowledge.
- How to get started.
Medical Question Answering: Dealing with the complexity and specificity of co... Asma Ben Abacha.
This document provides an overview of Dr. Asma Ben Abacha's work on medical question answering. It discusses the challenges of consumer health question answering, including analyzing and understanding complex questions. It also discusses challenges in retrieving relevant and reliable answers. The document outlines Dr. Ben Abacha's research projects on recognizing question entailment to answer new questions, summarizing consumer health questions, answering medication questions using a new dataset, and visual question answering using medical images. It provides results of experiments on these tasks and discusses insights and needed solutions for advancing medical question answering.
Insights from the Organization of International Challenges on Artificial Inte...Asma Ben Abacha
"Insights from the Organization of International Challenges on Artificial Intelligence in Medical Question Answering". Invited talk at the SciNLP (Natural Language Processing and Data Mining for Scientific Text) Workshop.
Dr. Asma Ben Abacha.
June 24, 2020.
Big Data Means Big Potential Challenges for Nurse Execs Response.pdfbkbk37
Big data has the potential to improve patient outcomes through more efficient access to personalized health information. However, it also poses significant risks, such as privacy breaches during cyberattacks. One strategy is to fragment big data across secure legacy systems to mitigate these risks, while still allowing analysis. Continuous data cleansing is also important. Overall, big data technologies could enhance clinical decision making if challenges around security and incompatible data formats are properly addressed.
This document provides information about the HST.921 course on Information Technology in the Healthcare System of the Future offered at Harvard and MIT in spring 2009. The course aims to empower students to critically analyze current or future healthcare problems and develop novel IT solutions. It includes weekly lectures, tutorials/labs, and a group project. Students work in multidisciplinary teams on design, business, marketing, or clinical trial tracks. Past projects addressed topics like social media, serious games, telehealth, and disease management technologies. The course is open to students from various Harvard and MIT programs for credit.
This doctoral dissertation defense document outlines Vincent Bridges' dissertation on evaluating the effectiveness of medical assistant programs at three Midwestern schools. The document includes an introduction, problem statements, literature review themes, research questions, methodology, and findings structure. Bridges evaluated how the programs meet stakeholder needs and what changes could better meet needs. Key findings included areas of enhanced critical thinking, phlebotomy practice, microbiology laboratory components, and expanded duties like panel management. Recommendations focused on enhancing curriculum based on stakeholder feedback.
IRJET- Clinical Medical Knowledge Extraction using Crowdsourcing TechniquesIRJET Journal
This document proposes a system to extract medical knowledge from crowdsourced question answering websites using truth discovery techniques. It aims to determine the trustworthiness of answers provided by doctors on these sites. The system performs medical term extraction from user queries and responses using stemming. It then calculates answer trustworthiness based on factors like doctor expertise, ethnicity, and commitment level. The highest trustworthiness responses are selected as best opinions and stored in a medical blog. A chatbot is also developed to predict disease based on user-reported symptoms. The system aims to provide users with reliable medical information and diagnoses from these crowdsourced sites.
The document summarizes a study that evaluated the acceptability of a personally controlled health record (PCHR) system called Indivo in a community-based setting. Over 300 participants were involved in formative research activities to understand awareness, beliefs and reactions. The study found moderate awareness of privacy issues and high support for patient autonomy. Results informed guidelines on design improvements, literacy tools, and safety protocols for PCHR systems. Limitations included a lack of detail on methodology and sample selection.
The document summarizes the process of implementing an integrated electronic health record (IEHR) system in a university exercise physiology teaching clinic. Key steps included procuring practice management software with IEHR capabilities, developing condition-specific protocols, designing clinical interfaces, and configuring the system for data entry via questionnaires and during consultations. Interviews found that staff and students perceived more advantages than disadvantages to adopting IEHR, such as improved patient care, progress tracking, and the ability to engage in research. The new system aims to enhance student learning and patient outcomes by allowing access to health information and progress data.
The document summarizes new data on clinician learning from three recent studies. The studies looked at clinician use of social media for learning, preferences for continuing medical education (CME), and natural learning actions. The data showed varied use of social media among clinicians, a preference for online and virtual CME courses over live meetings, and that clinicians engage in note-taking, reminders, searching for information, and social learning but often lack effective systems to support these actions. The document proposes discussing how this new understanding of clinician learning could impact educational programs.
System for Recommending Drugs Based on Machine Learning Sentiment Analysis of...IRJET Journal
This document presents a system for recommending drugs based on machine learning sentiment analysis of drug reviews. The system aims to minimize the workload on experts by providing drug recommendations based on patient feedback. It uses various machine learning and deep learning techniques like naive bayes, logistic regression, LSTM, and GRU to analyze drug reviews and predict sentiment. Based on the sentiment analysis of features extracted from reviews, the system conditionally recommends medications for specific patients and conditions. It achieved up to 93% accuracy in recommending the top drugs for common conditions like acne, high blood pressure, anxiety, and depression. The system aims to enhance healthcare access and reduce medical errors by supplementing expert recommendations with an AI-powered recommendation platform.
1. The document describes a proposed medicine recommendation system that would apply data mining techniques to analyze diagnosis data and provide personalized medication recommendations to reduce medical errors.
2. It would consist of modules for a database, recommendation models, model evaluation, and data visualization. Different recommendation algorithms like SVM, neural networks, and decision trees would be investigated and tested on an open diagnosis data set.
3. The goal is to build an accurate and efficient recommendation framework to help doctors prescribe the right medications, especially for inexperienced doctors and rare diseases, by leveraging patterns found in historical medical records and diagnosis data.
This lecture defines key terms related to workflow analysis and process redesign. It describes the role of a healthcare workflow specialist and their skills in process analysis and redesign. It reviews the importance of workflow improvement and health IT adoption for patient safety and quality of care. The lecture also describes the Centers for Medicare and Medicaid Services' meaningful use program and incentives for adopting certified electronic health record technology.
Key Topics in Health Care Technology EvaluationThe amount of new i.docxsleeperfindley
Key Topics in Health Care Technology Evaluation
The amount of new information and data, and the number of available technologies are growing at an ever-accelerating rate. Did you know that during any given 24 hours, humanity generates enough new information to fill the Library of Congress 70 times (Smolan & Erwitt, 2012)? As a nurse informaticist, it is important to keep current on new developments in the field, but with the rapid pace of change, that effort can be overwhelming. It is easier to keep current with key trends if nurse informaticists focus on selected issues.
In this Discussion, you consider key topics in the field of health care technology. You then consider the different approaches you could take when designing an evaluation in these areas. For example, if you are interested in usability, your goal could be to determine if a system is user friendly from the viewpoint of a nurse. A different goal might be to determine if the location of the system facilitates ease of use from the viewpoint of physicians.
Note:
This Discussion serves as practice for the first part of your Evaluation Project. What you derive from your Discussion with colleagues will likely inform the work that you do in Part 1 of the Evaluation Project.
The Discussion focuses on the following major topics in the health care information field:
Implementing HIT Systems
Consumer health information
Computerized Physician Order Entry (CPOE)
Decision support systems
Electronic health records (EHR)
Tele-medicine and eHealth
Nursing documentation
Other Issues Related to the Use of HIT Systems
Interoperability
Unforeseen consequences
Usability
To prepare:
Select at least
two
topics from the
lists above
that are relevant to your current organization or that are of particular interest to you. Read the articles in this week’s Learning Resources that relate to these topics. Consider why these topics are of interest to you, what relevance they have to health care organizations, and how they impact your professional responsibilities. Choose one topic to be the focus of your Evaluation Project, and consider potential evaluation goals.
Determine the viewpoint from which you would approach the evaluation, and why.
By tomorrow, post a minimum of 550 words essay in APA format with a minimum of 3 references from the list of required resources below, that addresses the level one headings as numbered below:
1)
Post
the two topics you identified as most relevant to your organization or to you personally, and explain why you selected those topics.
2)
Identify the topic you selected for your Evaluation Project, and propose three potential evaluation goals for this topic.
3)
Identify the viewpoint you would use with each goal, and explain why.
Required Readings
Friedman, C. P., & Wyatt, J. C. (2010). Evaluation methods in biomedical informatics (2nd ed.). New York, NY: Springer Science+Business Media, Inc
.
Chapter 2, “Evaluation as a Field” (pp. 21–47)
This chapter defines.
NR 328 EBP Improving Diagnostic Safety Project.pdfbkbk37
1. The document outlines guidelines for an assignment on improving diagnostic safety through an evidence-based project. It includes sections on defining a clinical question, completing an evidence matrix table summarizing multiple research articles, describing the findings, and concluding.
2. It also discusses the Joint Commission's National Patient Safety Goal around timely reporting of critical test results to improve patient safety and prevent harm from treatment delays. The goal requires hospitals to define critical results and protocols for reporting them to clinicians within an agreed upon timeframe.
3. One study cited used focus groups to examine how education impacts dietary modifications in diabetes patients, finding that education improved compliance with dietary changes.
1Deliverable 3 - Evaluate Research and DataAttempt 2EttaBenton28
The research question examined how AI integration in clinical radiology could disrupt the industry. Two articles presented credible data by citing authors with advanced degrees from reputable institutions. Participants generally agreed AI improved accuracy and efficiency in radiology IT departments. However, some radiologists viewed AI negatively as a potential job threat. More research is needed to validate AI operations and address professional concerns to facilitate acceptance.
This document describes a research study using a MOOC to investigate how searching online for medical information affects the doctor-patient relationship. The MOOC asked participants to diagnose six medical case studies and complete surveys. Over 200 people participated, and data collection is complete. Preliminary results found high rates of online medical searching and a strong correlation between MOOC participants' diagnoses and doctors' diagnoses. The study aims to provide insights into the benefits and risks of online medical searching and its impact on healthcare. MOOCs show potential as a research tool but also have limitations like self-selection bias and low completion rates that must be considered.
Healthcare & digital marketing - Today & FutureDeepali Thakur
The document provides a summary of a digital marketing study for a healthcare company. It outlines the objectives, stages, research questions, hypotheses, and findings from secondary research, keyword trend analysis, website analysis, and primary research. The key findings include identifying priority health issues, regions, and media to focus digital efforts on, such as HIV, hepatitis, diabetes in Delhi, Tamil Nadu and analyzing gaps in patient and doctor knowledge websites.
Pico framework for framing systematic review research questions - PubricaPubrica
P Patient, problem, population
I ‑ Intervention, prognostic factor, exposure
C ‑ Comparison
O ‑ Outcome
Continue Reading: https://bit.ly/3igMAQ4
For our services: https://pubrica.com/services/research-services/systematic-review/
Why Pubrica:
When you order our services, We promise you the following – Plagiarism free | always on Time | 24*7 customer support | Written to international Standard | Unlimited Revisions support | Medical writing Expert | Publication Support | Biostatistical experts | High-quality Subject Matter Experts.
Contact us:
Web: https://pubrica.com/
Blog: https://pubrica.com/academy/
Email: sales@pubrica.com
WhatsApp : +91 9884350006
United Kingdom: +44- 74248 10299
Similar to Multimodal Question Answering in the Medical Domain (CMU/LTI 2020) | Dr. Asma Ben Abacha (20)
Infrastructure Challenges in Scaling RAG with Custom AI modelsZilliz
Building Retrieval-Augmented Generation (RAG) systems with open-source and custom AI models is a complex task. This talk explores the challenges in productionizing RAG systems, including retrieval performance, response synthesis, and evaluation. We’ll discuss how to leverage open-source models like text embeddings, language models, and custom fine-tuned models to enhance RAG performance. Additionally, we’ll cover how BentoML can help orchestrate and scale these AI components efficiently, ensuring seamless deployment and management of RAG systems in the cloud.
How to Get CNIC Information System with Paksim Ga.pptxdanishmna97
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager,” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
van Emden shows how Nx can simplify the developer’s life and facilitate a rapid transition from concept to production-ready applications.He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024Neo4j
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!SOFTTECHHUB
As the digital landscape continually evolves, operating systems play a critical role in shaping user experiences and productivity. The launch of Nitrux Linux 3.5.0 marks a significant milestone, offering a robust alternative to traditional systems such as Windows 11. This article delves into the essence of Nitrux Linux 3.5.0, exploring its unique features, advantages, and how it stands as a compelling choice for both casual users and tech enthusiasts.
Best 20 SEO Techniques To Improve Website Visibility In SERPPixlogix Infotech
Boost your website's visibility with proven SEO techniques! Our latest blog dives into essential strategies to enhance your online presence, increase traffic, and rank higher on search engines. From keyword optimization to quality content creation, learn how to make your site stand out in the crowded digital landscape. Discover actionable tips and expert insights to elevate your SEO game.
Driving Business Innovation: Latest Generative AI Advancements & Success StorySafe Software
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfMalak Abu Hammad
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slackshyamraj55
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
3. Abstract
Artificial intelligence (AI) is playing an increasingly important role in our access to information. However, a one-size-fits-all approach is suboptimal, especially in the medical domain, where health-related information is more sensitive due to its potential impact on public health, and where domain-specific aspects such as technical language and case- or context-based interpretation have to be taken into account. Bridging the gap between several research areas such as AI, NLP, medical informatics, and computer vision is a promising way to achieve reliable and efficient access to medical information.
In this talk, I will discuss some of my recent projects on multimodal question answering (QA), including NLP methods for textual QA and visual question answering (VQA). In particular, I will present the lessons learned from working on QA from trusted answer sources and alternative NLP approaches such as recognizing question entailment and question summarization. In the second part, I will address the task of VQA from radiology images and potential solutions to support the creation of large-scale training data through visual question generation. Throughout the talk, I will present our recent efforts in creating relevant datasets and new approaches, as well as the challenges that we organized to promote research in multimodal question answering.
4. Disclaimer
The views and opinions expressed do not necessarily state or reflect those of the U.S. Government, and they may not be used for advertising or product endorsement purposes.
5. Plan
Introduction
Question Answering from Trusted Sources
1. Answering Questions about Medications
2. Summarization of Consumer Health Questions
3. Recognizing Question Entailment
VQA from Radiology Images
1. Visual Question Answering (VQA)
2. Visual Question Generation (VQG)
Discussion
6. INTRODUCTION
Information Retrieval & Question Answering (QA) in the Medical Domain
https://www.pewresearch.org/internet/2013/01/15/health-online-2013/
10. Early History of Question Answering
BASEBALL (Green et al., 1961): built to answer questions about American baseball games.
LUNAR (Woods, 1973): built to answer scientists’ questions about the Apollo 11 moon rocks.
QA @ TREC-8 (Voorhees, 1999).
QA @ CLEF: starting from 2003.
11. Early History of Medical QA
BioASQ challenges on biomedical semantic indexing and question answering (started in 2012).
CLEF QA4MRE Alzheimer’s task (2012).
Pierre Zweigenbaum. Question Answering in Biomedicine. Workshop on NLP for QA, EACL 2003.
Dina Demner-Fushman & Jimmy Lin. Answer Extraction, Semantic Clustering, and Extractive Summarization for Clinical Question Answering. COLING/ACL 2006.
A summary of multimodal QA datasets and systems for the medical domain:
https://github.com/abachaa/Existing-Medical-QA-Datasets
12. Part I: QA from Trusted Answer Sources
1) Answering Questions about Medications
13. Medical Question Answering about Medications
• Manual creation of the first gold standard corpus of consumer health questions about medications, with associated answers and annotations.
• Deep learning experiments on:
• question type identification (CNN),
• focus recognition (Bi-LSTM-CRF),
• question answering (CHiQA).
“Bridging the Gap between Consumers’ Medication Questions and Trusted Answers”. Ben Abacha et al. MEDINFO 2019.
14. 1) Questions: The corpus contains 674 consumer questions submitted to MedlinePlus.
Question Type | # | Example
Information | 112 | what type of drug is amphetamine?
Dose | 70 | what is a daily amount of prednisolone eye drops to take?
Usage | 61 | how to self inject enoxaparin sodium?
Side Effects | 60 | does benazepril aggravate hepatitis?
Indication | 55 | why is pyridostigmine prescribed?
Interaction | 51 | can i drink cataflam when i drink medrol?
15. 2) Answer Sources: 43% of the questions were answered from DailyMed and 19% from MedlinePlus.
[Pie chart: DailyMed 43%, MedlinePlus 19%, Other Websites 38%]
• Other websites were used to answer 38% of the questions (e.g. cdc.gov, mayoclinic.org, PubMed).
• External resources (e.g. eHealthMe) were needed for specific types of questions such as Interactions.
16. Results
CNN Results for Question Type Identification:
Accuracy: 75.7%
Bi-LSTM-CRF Results for Focus Recognition:
Results (%) | F1 | P | R
Exact entity match | 74.07 | 78.12 | 70.42
Partial entity match | 90.37 | 95.31 | 85.92
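The exact vs. partial entity-match scores above are standard span-level metrics. As a rough illustration (a hypothetical sketch, not the paper's actual evaluation code), precision, recall, and F1 over predicted and gold focus spans might be computed like this:

```python
# Illustrative span-level scorer for focus recognition.
# "Exact" requires identical (start, end) spans; "partial" only
# requires character overlap. This is a simplified sketch that
# assumes at most one gold span matches each prediction.

def prf1(n_correct, n_pred, n_gold):
    """Precision, recall, F1 from match counts."""
    p = n_correct / n_pred if n_pred else 0.0
    r = n_correct / n_gold if n_gold else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

def overlaps(a, b):
    """True if two (start, end) character spans overlap."""
    return a[0] < b[1] and b[0] < a[1]

def score(pred_spans, gold_spans, partial=False):
    """Count a predicted span as correct if it matches a gold span
    exactly, or merely overlaps one when partial=True."""
    match = overlaps if partial else (lambda a, b: a == b)
    correct = sum(any(match(p, g) for g in gold_spans) for p in pred_spans)
    return prf1(correct, len(pred_spans), len(gold_spans))
```

For example, a prediction that is off by one character counts under the partial criterion but not the exact one, which is why the partial-match F1 on the slide is markedly higher.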
17. Results
Answer Retrieval:
CHiQA found the correct answer in the top four results in 35% of the cases, only related answers for 35% of them, and irrelevant answers for the remaining 30%.
Our independent observations also hint that classical QA systems may not be the best fit for medication questions.
18. Several Challenges towards Automatic QA
Question Analysis:
• Linguistic ambiguity due to misspellings, wrong grammar, etc.
• The use of too many words or general terms to express a medical term.
• The requested medical information was not formalized or written online.
Answer Retrieval:
• Conditional answers (e.g. depending on the manufacturer or on the disease).
• Distributed answers, formed only by combining different text snippets from different sources.
• Answers that could not be found without expert inference/knowledge.
19. Solutions for the Automation of Medication QA
• Medical text translation, simplification, and inference are often needed to find relevant answers and/or make the answers readable for non-expert users.
• More data and resources are needed to cover information about drug interactions and usage guidelines (e.g. scientific literature).
• Conditional answers require different solutions, such as providing a list of answers or interacting with the user in a dialogue-based approach.
“Bridging the Gap between Consumers’ Medication Questions and Trusted Answers”. Ben Abacha et al. MEDINFO 2019.
20. Part I: QA from Trusted Answer Sources
2) Summarization of Consumer Health Questions
23. QA Results w/(o) Manual Question Summarization
Measures | Original Questions* | Question Summaries
AvgScore (0-3) | 0.711 | 1.125
Succ@2+ | 0.442 | 0.663
Succ@3+ | 0.192 | 0.317
Succ@4+ | 0.077 | 0.144
Prec@2+ | 0.46 | 0.663
Prec@3+ | 0.2 | 0.317
Prec@4+ | 0.08 | 0.144
* Benchmark from the Medical Question Answering Competition @ LiveQA 2017 (Ben Abacha et al., TREC 2017).
Summarizing the questions leads to a substantial improvement.
“On the Role of Question Summarization and Information Source Restriction in Consumer Health Question Answering”. Ben Abacha & Demner-Fushman. AMIA 2019 Informatics Summit.
24. Automatic Question Summarization: New Datasets

#1 MeQSum Dataset (Consumer Health Question):
"I suffered a massive stroke on [DATE] with paralysis on my left side of my body, I'm home and conduct searches on the internet to find help with recovery, and always this product called neuroaid appears claiming to restore function. to my knowledge it isn't approved by the FDA, but it sounds so promising. do you know anything about it and id there anything approved by our FDA, that does help?"
Summary: What are treatments for stroke paralysis, including neuroaid?

#2 Augmentation with Clinical Data (Clinical Question):
"55-year-old woman. This lady has epigastric pain and gallbladder symptoms. How do you assess her gallbladder function when you don't see stones on the ultrasound? Can a nonfunctioning gallbladder cause symptoms or do you only get symptoms if you have stones?"
Summary: Can a nonfunctioning gallbladder cause symptoms or do you only get symptoms if you have stones?

#3 Augmentation with Semantic Selection (Medical Question):
"Is it healthy to ingest 500 mg of vitamin c a day? Should I be taking more or less?"
Summary: How much vitamin C should I take a day?

“On the Summarization of Consumer Health Questions”. Ben Abacha & Demner-Fushman. ACL 2019.
25. Automatic Question Summarization: Results

Method                    | Training Set | ROUGE-1 | ROUGE-2 | ROUGE-L
Seq2seq Attentional Model | #1           | 24.80   | 13.84   | 24.27
                          | #2           | 28.97   | 18.34   | 28.74
                          | #3           | 27.62   | 15.70   | 27.11
Pointer Generator (PG)    | #1           | 35.80   | 20.19   | 34.79
                          | #2           | 42.77   | 25.00   | 40.97
                          | #3           | 44.16   | 27.64   | 42.78
PG + Coverage             | #1           | 39.57   | 23.05   | 38.45
                          | #2           | 40.00   | 24.13   | 38.56
                          | #3           | 41.76   | 24.80   | 40.50

Training sets: #1 MeQSum Dataset; #2 Augmentation with Clinical Data; #3 Augmentation with Semantic Selection.

Data augmentation from question datasets improves the overall performance.
Models: See et al., ACL 2017.
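The ROUGE-N scores in the table are F1 measures over n-gram overlap between a generated summary and a reference summary. A minimal ROUGE-1 sketch for intuition only; the published ROUGE metric also handles stemming, multiple references, and other details, and the example sentences are made up:

```python
# Simplified ROUGE-N as F1 over n-gram overlap (illustrative only).
from collections import Counter

def rouge_n(candidate, reference, n=1):
    """ROUGE-N F1 between two token lists."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand, ref = ngrams(candidate), ngrams(reference)
    overlap = sum((cand & ref).values())  # clipped n-gram match counts
    if not cand or not ref or not overlap:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

summary = "what are treatments for stroke paralysis".split()
reference = "what treatments exist for stroke paralysis".split()
print(round(rouge_n(summary, reference, n=1), 2))  # 0.83
```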
26. Automatic Question Summarization: Results
The best performance of 44.16% is comparable to the state-of-the-art results in open domain, using 8K training pairs (2.5% of the size of the CNN-DailyMail dataset).
Promising results considering the low-frequency nature of most medical entities.
These results underline the importance of data selection and augmentation.
“On the Summarization of Consumer Health Questions”. Ben Abacha & Demner-Fushman. ACL 2019.
27. Part I: QA from Trusted Answer Sources
3) Recognizing Question Entailment
28. Our Goal
Use existing question & answer pairs from QA websites.
Answering new questions by retrieving entailed questions with existing answers.
29. Recognizing Question Entailment (RQE)
• “A Question-Entailment Approach to Question Answering”. Ben Abacha & Demner-Fushman. BMC Bioinformatics 2019.
30. The RQE-based QA system is a component of the CHiQA system
https://chiqa.nlm.nih.gov
31. The RQE-based QA System: Data
MedQuAD dataset of 47k QA pairs:
Created from trusted resources (12 NIH websites).
Covers 36 question types about diseases and drugs.
https://github.com/abachaa/MedQuAD
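MedQuAD is distributed as XML files on GitHub. A sketch of loading QA pairs with the Python standard library; the element names (QAPair, Question, Answer) and the inline sample document are assumptions for illustration, so consult the repository for the actual schema:

```python
# Sketch: extracting QA pairs from a MedQuAD-style XML document.
# Element names and the sample content below are illustrative assumptions.
import xml.etree.ElementTree as ET

xml_doc = """<Document>
  <QAPairs>
    <QAPair>
      <Question>What are the symptoms of glaucoma?</Question>
      <Answer>Early open-angle glaucoma often has no symptoms.</Answer>
    </QAPair>
  </QAPairs>
</Document>"""

root = ET.fromstring(xml_doc)
pairs = [(qa.findtext("Question"), qa.findtext("Answer"))
         for qa in root.iter("QAPair")]
print(len(pairs), pairs[0][0])  # 1 What are the symptoms of glaucoma?
```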
32. The RQE-based QA System: Results

Metrics       | RQE-based QA System | LiveQA-Med Best | LiveQA-Med Median
Average Score | 0.827               | 0.637           | 0.431
MAP@10        | 0.311               | --              | --
MRR@10        | 0.333               | --              | --

Approach: Relying on the retrieval of entailed questions is a viable strategy for medical QA.
Data: Limiting the number of answer sources could enhance QA performance.
• “A Question-Entailment Approach to Question Answering”. Ben Abacha & Demner-Fushman. BMC Bioinformatics 2019.
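MRR@10 in the table averages, over all questions, the reciprocal rank of the first relevant answer within each question's top 10 results (0 when none appears there). A minimal sketch, not the system's actual evaluation script, with made-up relevance judgments:

```python
# Sketch: Mean Reciprocal Rank at k (MRR@10 with the default k).
# rankings holds one list of 0/1 relevance flags per question, best rank first.

def mrr_at_k(rankings, k=10):
    """Average 1/rank of the first relevant result in each top-k list."""
    total = 0.0
    for flags in rankings:
        for rank, relevant in enumerate(flags[:k], start=1):
            if relevant:
                total += 1.0 / rank
                break
    return total / len(rankings)

# Three questions: first relevant answer at rank 1, at rank 3, and absent.
print(round(mrr_at_k([[1, 0, 0], [0, 0, 1], [0, 0, 0]]), 3))  # 0.444
```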
33. RQE Models
o Logistic regression outperformed neural networks when trained with traditional word embeddings such as GloVe and word2vec.
o Relying on recent language models such as BERT for pre-training led to better performance (MEDIQA-RQE 2019).
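For intuition about the task, question entailment can be approximated with a toy lexical-overlap baseline, far weaker than the logistic regression and BERT models above; the stopword list, threshold, and example questions are illustrative choices only:

```python
# Toy lexical-overlap baseline for Recognizing Question Entailment (RQE):
# question A "entails" question B here when A covers most of B's content words.
# Stopwords, threshold, and examples are illustrative, not tuned values.

STOPWORDS = {"a", "an", "the", "is", "are", "do", "does", "of", "for", "to",
             "in", "on", "what", "how", "can", "i", "you", "my", "it"}

def content_words(question):
    """Lowercased tokens with punctuation stripped and stopwords removed."""
    tokens = (t.strip(",.?!") for t in question.lower().split())
    return {t for t in tokens if t and t not in STOPWORDS}

def entails(question_a, question_b, threshold=0.8):
    """True if A covers at least `threshold` of B's content words."""
    a, b = content_words(question_a), content_words(question_b)
    return bool(b) and len(a & b) / len(b) >= threshold

new_question = "What are the treatments for stroke paralysis, including neuroaid?"
answered_question = "What are treatments for stroke paralysis?"
print(entails(new_question, answered_question))  # True
```

Real RQE models learn this relation from labeled question pairs instead of relying on surface overlap, which fails on paraphrases and consumer-language variants.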
34. “Overview of the MEDIQA 2019 Shared Task on Textual Inference, Question Entailment and Question Answering”. Ben Abacha, Shivade & Demner-Fushman. ACL-BioNLP 2019.
Three tasks: NLI, RQE & QA
72 participating teams
20 published papers
35. MEDIQA – Post Challenge Round
https://www.aicrowd.com/organizers/mediqa-acl-bionlp
Submission open.
36. To Sum up:
Relying on the retrieval of entailed questions is a viable strategy to answer consumer health questions.
Limiting the number of answer sources by using only trusted sources could enhance QA performance.
Transfer learning & multi-task learning improve the performance of RQE, NLI, and QA models in the medical domain.
Summarizing the questions leads to a substantial improvement.
More efforts are needed for building large datasets (for training and testing) and efficient methods (e.g. medical text understanding and simplification).
38. Visual Question Answering in the Medical Domain
• VQA poses a challenging problem involving NLP and Computer Vision.
• Automatic understanding of radiology images and answering related questions could support clinical education, clinical decision making, and patient education.
39. VQA-Med @ ImageCLEF 2019: Data
Four categories of questions: Modality, Plane, Abnormality & Organ.
Training, validation, and test sets created automatically.
Training set: 3,200 radiology images and 12,792 question-answer pairs.
40. VQA-Med @ ImageCLEF 2019: Results
The best team achieved 0.624 accuracy and a 0.644 BLEU score.
Methods: transfer learning, multi-task learning, ensemble methods, and hybrid approaches combining classification models and answer generation methods.
“VQA-Med: Overview of the Medical Visual Question Answering Task at ImageCLEF 2019”. Ben Abacha et al. CLEF 2019.
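The accuracy figure counts a VQA answer as correct only when it matches the reference exactly after simple normalization. A minimal sketch of that kind of strict scoring; the example answers are made up, and the official evaluation also reported BLEU for partial credit on longer answers:

```python
# Sketch: strict answer accuracy for VQA, counting a prediction as correct
# only on an exact string match after lowercasing and trimming whitespace.
# Example predictions/references are invented for illustration.

def accuracy(predictions, references):
    norm = lambda s: s.strip().lower()
    hits = sum(norm(p) == norm(r) for p, r in zip(predictions, references))
    return hits / len(references)

predictions = ["CT", "axial", "lung"]
references = ["ct", "axial", "liver"]
print(round(accuracy(predictions, references), 3))  # 0.667
```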
41. VQA-Med @ ImageCLEF 2020
Two Tasks:
1. Visual Question Answering (VQA)
2. Visual Question Generation (VQG)
• Datasets:
• VQA training set: 4,000 images with 4,000 QA pairs.
• VQG training set: 780 images with 2,156 questions.
• Submissions opened on April 22, 2020.
• Run submission deadline extended to June 5, 2020.
43. Neuroimaging Collection (Ongoing)
11,500 radiology images of the brain and the head.
Several tasks: classification, VQA, VQG, and caption generation.
Annotators: neuroradiologists and radiologists. Contact me if you are interested in participating.
44. AI in the Medical Domain
Ethics: the Hippocratic Oath.
Evaluation: we cannot advance research without “good” gold standard corpora and efficient evaluation metrics.
Interdisciplinarity: we need more collaborations and exchanges between medical experts, AI, NLP & CV scientists.
Data & Applications.
45. Thank you for your Attention!
Email: asma.benabacha@nih.gov
Twitter: @AsmaBenAbacha
GitHub: abachaa

References
1. Asma Ben Abacha & Dina Demner-Fushman. On the Summarization of Consumer Health Questions. ACL 2019.
2. Asma Ben Abacha, Chaitanya Shivade & Dina Demner-Fushman. Overview of the MEDIQA 2019 Shared Task on Textual Inference, Question Entailment and Question Answering. ACL-BioNLP 2019.
3. Asma Ben Abacha & Dina Demner-Fushman. On the Role of Question Summarization and Information Source Restriction in Consumer Health Question Answering. AMIA 2019 Informatics Summit.
4. Asma Ben Abacha, Eugene Agichtein, Yuval Pinter & Dina Demner-Fushman. Overview of the Medical QA Task @ TREC 2017 LiveQA Track. TREC 2017.
5. Asma Ben Abacha & Dina Demner-Fushman. Recognizing Question Entailment for Medical Question Answering. AMIA 2016.
6. Asma Ben Abacha & Dina Demner-Fushman. A Question-Entailment Approach to Question Answering. BMC Bioinformatics 2019.
7. Asma Ben Abacha, Yassine Mrabet, Mark Sharp, Travis Goodwin, Sonya E. Shooshan & Dina Demner-Fushman. Bridging the Gap between Consumers’ Medication Questions and Trusted Answers. MEDINFO 2019.
8. Jason J. Lau, Soumya Gayen, Asma Ben Abacha & Dina Demner-Fushman. A dataset of clinically generated visual questions and answers about radiology images. Scientific Data, Nature, 2018.
9. Asma Ben Abacha, Sadid A. Hasan, Vivek V. Datla, Joey Liu, Dina Demner-Fushman & Henning Müller. VQA-Med: Overview of the Medical Visual Question Answering Task at ImageCLEF 2019. CLEF 2019.
10. Mourad Sarrouti, Asma Ben Abacha & Dina Demner-Fushman. Visual Question Generation from Radiology Images. ACL-ALVR 2020.