Insights from the Organization of International Challenges on Artificial Intelligence in Medical Question Answering
Dr. Asma Ben Abacha
AKBC 2020 SciNLP Workshop, Invited Talk, June 25, 2020
@AsmaBenAbacha
DISCLAIMER
The views and opinions expressed do not necessarily state or reflect those of the U.S. Government, and they may not be used for advertising or product endorsement purposes.
Challenges on NLP & QA
1. Recognizing Question Entailment (MEDIQA 2019)
2. Medical Question Answering (TREC 2017 & MEDIQA 2019)
Challenges on NLP & Computer Vision
1. Visual Question Answering (VQA-Med 2019 & 2020)
2. Visual Question Generation (VQA-Med 2020)
Plan
I) Targeted Tasks and Created Datasets
II) Discussion on Evaluation Methods and Shared Tasks/Challenges
EVALUATION OF MEDICAL QUESTION ANSWERING (QA) SYSTEMS
Until recent years, only one relevant medical dataset was available: the Corpus for Evidence Based Medicine Summarization.
D. Mollá, M.E. Santiago-Martinez (2011) https://sourceforge.net/projects/ebmsumcorpus/
ORGANIZED CHALLENGES & SHARED TASKS IN THE MEDICAL DOMAIN
• “VQA-Med: Overview of the Medical Visual Question
Answering Task at ImageCLEF 2019”. Ben Abacha et
al. CLEF 2019 & 2020.
• “Overview of the MEDIQA 2019 Shared Task on Textual
Inference, Question Entailment and Question
Answering”. Ben Abacha, Shivade & Demner-Fushman.
ACL-BioNLP 2019.
• “Overview of the Medical Question Answering Task at TREC 2017 LiveQA”.
Ben Abacha, Agichtein, Pinter, Demner-Fushman. TREC 2017
Targeted Tasks & Created Datasets
1) MEDICAL QA TRACK @ TREC LIVEQA 2017
Question 1:
• Subject: ingredients in Kapvay. Message: Is there any sufites sulfates sulfa in Kapvay? I am allergic.
Question 2:
• Subject: abetalipoproteimemia. Message: hi, I would like to know if there is any support for those
suffering with abetalipoproteinemia? I am not diagnosed but have had many test that indicate I am
suffering with this, keen to learn how to get it diagnosed and how to manage, many thanks.
Overview of the Medical QA Task @ TREC 2017 LiveQA Track. Asma Ben Abacha,
Eugene Agichtein, Yuval Pinter & Dina Demner-Fushman. TREC 2017.
RESULTS
1) Recognizing Entailment/Inference in the Medical Domain
2) Entailment/Inference for Question Answering (QA)
 2006:
 2016:
Methods for Using Textual Entailment in Open-Domain Question
Answering. Sanda Harabagiu & Andrew Hickl
Recognizing Question Entailment for Medical Question Answering.
Asma Ben Abacha & Dina Demner-Fushman
2) MEDIQA @ ACL-BioNLP 2019
[Figure: related work cited on the slide] Dagan et al. (2005); Bowman et al. (2015); Williams et al. (2018); Shivade et al. (2015); Ben Abacha et al. (2015); Adler et al. (2012); Romanov & Shivade (2018); Ben Abacha & Demner-Fushman (2016/2019)
Thousands of papers in open domain
THE RQE-BASED QA SYSTEM
Metrics         RQE-based QA System   LiveQA-Med Best   LiveQA-Med Median
Average Score   0.827                 0.637             0.431
MAP@10          0.311                 --                --
MRR@10          0.333                 --                --
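For readers less familiar with the ranking metrics in the table, the sketch below shows one common way to compute MRR@10 and MAP@10 from per-question relevance judgments. It is an illustration only: the function names and the toy judgments are hypothetical, and the average-precision variant used here (normalizing by the number of relevant answers found in the top k) is an assumption, not necessarily the exact definition used in the evaluation reported above.

```python
from typing import Callable, Sequence


def reciprocal_rank_at_k(relevance: Sequence[bool], k: int = 10) -> float:
    """Return 1/rank of the first relevant answer within the top k, else 0."""
    for rank, is_relevant in enumerate(relevance[:k], start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0


def average_precision_at_k(relevance: Sequence[bool], k: int = 10) -> float:
    """Average precision@i over the ranks i <= k that hold a relevant answer.

    Assumption: normalized by the number of relevant answers retrieved in
    the top k (other variants normalize by all known relevant answers).
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance[:k], start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_over_questions(
    metric: Callable[[Sequence[bool], int], float],
    runs: Sequence[Sequence[bool]],
    k: int = 10,
) -> float:
    """Average a per-question metric over all test questions."""
    return sum(metric(run, k) for run in runs) / len(runs)


# Hypothetical judgments: one ranked answer list per question,
# True where the answer at that rank was judged relevant.
runs = [
    [False, True, False, True],
    [True, False, False],
    [False, False, False],
]

mrr_at_10 = mean_over_questions(reciprocal_rank_at_k, runs)
map_at_10 = mean_over_questions(average_precision_at_k, runs)
print(f"MRR@10 = {mrr_at_10:.3f}, MAP@10 = {map_at_10:.3f}")
```

Both metrics reward systems that place relevant answers near the top of the list; MRR looks only at the first relevant answer, while MAP accounts for every relevant answer in the top 10.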
“Overview of the MEDIQA 2019 Shared Task on Textual Inference, Question Entailment and Question Answering”. Asma
Ben Abacha, Chaitanya Shivade & Dina Demner-Fushman. ACL-BioNLP 2019.
Three tasks: NLI, RQE & QA
72 participating teams
20 published papers
MEDIQA 2019:
• Confirmed the added value of using textual
inference and question entailment in QA.
• Highlighted the strength of multi-task
learning, transfer learning, and data
augmentation methods.
• Showed/documented the power of new
architectures and models such as MT-DNN
in medical QA.
Submission open
MEDIQA – Post Challenge Round
3) VQA-Med @ ImageCLEF 2019
• Four categories of questions: Modality, Plane, Abnormality & Organ
• Training, validation, and test sets created automatically:
  - Training set: 3,200 radiology images and 12,792 question-answer pairs.
“VQA-Med: Overview of the Medical Visual Question Answering Task at ImageCLEF 2019”. Ben
Abacha, Hasan, Datla, Liu, Demner-Fushman & Müller. CLEF 2019.
• The best team achieved 0.624 accuracy and 0.644 BLEU score.
• Best methods: transfer learning, multi-task learning, ensemble methods, and hybrid approaches combining classification models and answer generation methods.
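As a rough illustration of the two metrics reported above, the sketch below computes exact-match accuracy and a simplified, single-reference BLEU over short answers. It is not the official evaluation script: the normalization steps, the function names, and the example answers are all assumptions made for this sketch, and the BLEU variant here caps the n-gram order at the length of the shorter answer so that one-word answers can still score.

```python
import math
import string
from collections import Counter


def normalize(text):
    """Lowercase, strip ASCII punctuation, and split on whitespace."""
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def exact_match_accuracy(predictions, references):
    """Fraction of predictions identical to the reference after normalization."""
    matches = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return matches / len(references)


def ngram_counts(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(prediction, reference, max_n=4):
    """Simplified BLEU: geometric mean of clipped n-gram precisions
    multiplied by the brevity penalty (no smoothing)."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        return 0.0
    max_n = min(max_n, len(pred), len(ref))
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        pred_counts, ref_counts = ngram_counts(pred, n), ngram_counts(ref, n)
        # Counter intersection keeps the minimum count, i.e. clipped matches.
        overlap = sum((pred_counts & ref_counts).values())
        if overlap == 0:
            return 0.0
        log_precision_sum += math.log(overlap / sum(pred_counts.values()))
    brevity = 1.0 if len(pred) >= len(ref) else math.exp(1 - len(ref) / len(pred))
    return brevity * math.exp(log_precision_sum / max_n)


# Hypothetical system answers and gold answers.
predictions = ["MRI", "axial plane", "pulmonary embolism in the left lung"]
references = ["mri", "axial", "pulmonary embolism"]

accuracy = exact_match_accuracy(predictions, references)
bleu_scores = [simple_bleu(p, r) for p, r in zip(predictions, references)]
print(f"accuracy = {accuracy:.3f}")
print("BLEU per answer:", [round(score, 3) for score in bleu_scores])
```

In this toy example the second and third answers count as wrong under exact match but still receive partial BLEU credit, which is why the two numbers can diverge for the same run.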
4) VQA-MED @ IMAGECLEF 2020
- Two Tasks:
1. Visual Question Answering (VQA)
2. Visual Question Generation (VQG)
- Datasets:
• VQA training set: 4,000 images and 4,000 QA pairs.
• VQG training set: 780 images and 2,156 questions.
- 11 teams submitted runs (the highest participation in
ImageCLEF 2020).
Overview of the VQA-Med Task at ImageCLEF 2020: Visual Question Answering and Generation in the Medical Domain.
Asma Ben Abacha, Sadid A. Hasan, Vivek V. Datla, Dina Demner-Fushman & Henning Müller. CLEF 2020.
• State-of-the-art models in NLP and Computer Vision were applied, along with new approaches focusing on abnormality questions and images.
• Exciting results, given the potential applications of the VQA and VQG tasks in the medical domain.
Discussion
Evaluation Metrics
VQA: Manual evaluation of the accuracy of automatic answers vs. BLEU score
A dataset of clinically generated visual questions and answers about radiology images.
Jason J. Lau, Soumya Gayen, Asma Ben Abacha & Dina Demner-Fushman. Scientific Data,
Nature, 2018.
Closed-ended Questions:
Open-ended Questions:
“On the Summarization of Consumer Health Questions”.
Ben Abacha & Demner-Fushman. ACL 2019.
Manual vs. Automatic Evaluation of Summarization Methods
• A major issue remains the less reliable ranking provided by existing evaluation measures for text generation tasks.
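One way to quantify how far an automatic measure's ranking of systems departs from the manual ranking is a rank correlation such as Kendall's tau. The sketch below is a minimal illustration under stated assumptions: the scores are invented, the function name is hypothetical, and the talk does not say which agreement statistic, if any, was used.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall tau-a between two score lists over the same systems.

    Each pair of systems is concordant if both score lists order it the
    same way, discordant if they disagree; tied pairs count as neither.
    """
    concordant = discordant = 0
    for i, j in combinations(range(len(scores_a)), 2):
        product = (scores_a[i] - scores_a[j]) * (scores_b[i] - scores_b[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(scores_a) * (len(scores_a) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical scores for four systems (same order in both lists).
manual_scores = [0.80, 0.65, 0.60, 0.40]     # e.g. expert judgments
automatic_scores = [0.30, 0.45, 0.25, 0.20]  # e.g. an n-gram overlap metric

tau = kendall_tau(manual_scores, automatic_scores)
print(f"Kendall tau = {tau:.3f}")
```

A value of 1 means the automatic measure ranks the systems exactly as the human judges do; here the automatic metric swaps the top two systems, so the correlation drops below 1.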
CHALLENGES & SHARED TASKS/EVALUATIONS
• Several Benefits:
  - Discovering the range of potentially hidden obstacles and putting systems’ performance in context.
  - Creating useful gold standard corpora, training data, pilot studies, and use cases (with the collaboration of medical experts in our case).
  - Building strong research communities.
• Potential future research investigations:
  - Efficient/suitable evaluation methods and metrics
  - Important, newly identified, subtasks that need to be resolved first
  - “Better” training and testing datasets
  - Need for more multi-disciplinary efforts and more support and participation in workshops and conferences such as TREC and CLEF.
Thank you for your attention!
References
1. Asma Ben Abacha & Pierre Zweigenbaum. MEANS: a Medical Question-Answering System Combining NLP
Techniques and Semantic Web Technologies. Information Processing & Management Journal, Elsevier, 2015.
2. Asma Ben Abacha & Dina Demner-Fushman. A Question-Entailment Approach to Question Answering. BMC
Bioinformatics 2019.
3. Asma Ben Abacha, Chaitanya Shivade & Dina Demner-Fushman. Overview of the MEDIQA 2019 Shared Task on
Textual Inference, Question Entailment and Question Answering. ACL-BioNLP 2019.
4. Asma Ben Abacha & Dina Demner-Fushman. On the Summarization of Consumer Health Questions. ACL 2019.
5. Asma Ben Abacha & Dina Demner-Fushman. On the Role of Question Summarization and Information Source
Restriction in Consumer Health Question Answering. AMIA Informatics Summit 2019.
6. Asma Ben Abacha, Eugene Agichtein, Yuval Pinter & Dina Demner-Fushman. Overview of the Medical QA Task
@ TREC 2017 LiveQA Track. TREC 2017.
7. Asma Ben Abacha & Dina Demner-Fushman. Recognizing Question Entailment for Medical Question
Answering. AMIA 2016.
8. Asma Ben Abacha, Yassine Mrabet, Mark Sharp, Travis Goodwin, Sonya E. Shooshan & Dina Demner-Fushman.
Bridging the Gap between Consumers’ Medication Questions and Trusted Answers. MEDINFO 2019.
9. Jason J. Lau, Soumya Gayen, Asma Ben Abacha & Dina Demner-Fushman. A dataset of clinically generated
visual questions and answers about radiology images. Scientific Data, Nature, 2018.
10. Asma Ben Abacha, Sadid A. Hasan, Vivek V. Datla, Joey Liu, Dina Demner-Fushman & Henning Müller. VQA-
Med: Overview of the Medical Visual Question Answering Task at ImageCLEF 2019. CLEF 2019.
11. Asma Ben Abacha, Sadid A. Hasan, Vivek V. Datla, Dina Demner-Fushman & Henning Müller. Overview of the
VQA-Med Task at ImageCLEF 2020: Visual Question Answering and Generation in the Medical Domain. CLEF
2020.
12. Visual Question Generation from Radiology Images. Mourad Sarrouti, Asma Ben Abacha & Dina Demner-
Fushman. ACL-ALVR 2020.
asma.benabacha@nih.gov
asma.benabacha@gmail.com
@AsmaBenAbacha
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 

Insights from the Organization of International Challenges on Artificial Intelligence in Medical Question Answering (SciNLP 2020) | Dr. Asma Ben Abacha

  • 1. Insights from the Organization of International Challenges on Artificial Intelligence in Medical Question Answering. Dr. Asma Ben Abacha. AKBC 2020 SciNLP Workshop, Invited Talk, June 25, 2020. @AsmaBenAbacha
  • 2. DISCLAIMER: The views and opinions expressed do not necessarily state or reflect those of the U.S. Government, and they may not be used for advertising or product endorsement purposes.
  • 3. Plan. Challenges on NLP & QA: 1. Recognizing Question Entailment (MEDIQA 2019); 2. Medical Question Answering (TREC 2017 & MEDIQA 2019). Challenges on NLP & Computer Vision: 1. Visual Question Answering (VQA-Med 2019 & 2020); 2. Visual Question Generation (VQA-Med 2020). I) Targeted Tasks and Created Datasets. II) Discussion on Evaluation Methods and Shared Tasks/Challenges.
  • 4. EVALUATION OF MEDICAL QUESTION ANSWERING (QA) SYSTEMS. Until recent years, there was only one relevant medical dataset: the Corpus for Evidence Based Medicine Summarization. D. Mollá, M.E. Santiago-Martinez (2011) https://sourceforge.net/projects/ebmsumcorpus/
  • 5. ORGANIZED CHALLENGES & SHARED TASKS IN THE MEDICAL DOMAIN. “VQA-Med: Overview of the Medical Visual Question Answering Task at ImageCLEF 2019”. Ben Abacha et al. CLEF 2019 & 2020. “Overview of the MEDIQA 2019 Shared Task on Textual Inference, Question Entailment and Question Answering”. Ben Abacha, Shivade & Demner-Fushman. ACL-BioNLP 2019. “Overview of the Medical Question Answering Task at TREC 2017 LiveQA”. Ben Abacha, Agichtein, Pinter, Demner-Fushman. TREC 2017.
  • 6. Targeted Tasks & Created Datasets
  • 7. 1) MEDICAL QA TRACK @ TREC LIVEQA 2017. Question 1: Subject: ingredients in Kapvay. Message: Is there any sufites sulfates sulfa in Kapvay? I am allergic. Question 2: Subject: abetalipoproteimemia. Message: hi, I would like to know if there is any support for those suffering with abetalipoproteinemia? I am not diagnosed but have had many test that indicate I am suffering with this, keen to learn how to get it diagnosed and how to manage, many thanks. Overview of the Medical QA Task @ TREC 2017 LiveQA Track. Asma Ben Abacha, Eugene Agichtein, Yuval Pinter & Dina Demner-Fushman. TREC 2017.
  • 9. 2) MEDIQA @ ACL-BioNLP 2019. 1) Recognizing Entailment/Inference in the Medical Domain: Dagan et al. (2005), Bowman et al. (2015), Williams et al. (2018), Shivade et al. (2015), Ben Abacha et al. (2015), Adler et al. (2012), Romanov & Shivade (2018). Thousands of papers in open domain. 2) Entailment/Inference for Question Answering (QA): 2006: Methods for Using Textual Entailment in Open-Domain Question Answering. Sanda Harabagiu & Andrew Hickl. 2016: Recognizing Question Entailment for Medical Question Answering. Asma Ben Abacha & Dina Demner-Fushman. See also Ben Abacha & Demner-Fushman (2016/2019).
  • 10. THE RQE-BASED QA SYSTEM

      Metrics       | RQE-based QA System | LiveQA-Med Best | LiveQA-Med Median
      Average Score | 0.827               | 0.637           | 0.431
      MAP@10        | 0.311               | --              | --
      MRR@10        | 0.333               | --              | --
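
The MAP@10 and MRR@10 figures on this slide are ranking metrics. As a rough illustration only, the sketch below computes both from per-question lists of relevance judgments. It assumes binary relevance and normalizes average precision by the number of relevant answers found in the top k; the official LiveQA-Med evaluation used graded judgments and may differ in these details. The function names and sample data are hypothetical.

```python
from typing import Callable, List, Sequence


def reciprocal_rank_at_k(relevance: Sequence[bool], k: int = 10) -> float:
    """Return 1/rank of the first relevant answer in the top k, else 0."""
    for rank, is_relevant in enumerate(relevance[:k], start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0


def average_precision_at_k(relevance: Sequence[bool], k: int = 10) -> float:
    """Average the precision values at each relevant rank within the top k.

    Assumption: normalize by the number of relevant answers retrieved in
    the top k (other definitions normalize by all known relevant answers).
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance[:k], start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_over_questions(
    per_question: List[Sequence[bool]],
    metric: Callable[[Sequence[bool], int], float],
    k: int = 10,
) -> float:
    """Average a per-question metric over all questions (MRR@k or MAP@k)."""
    if not per_question:
        return 0.0
    return sum(metric(r, k) for r in per_question) / len(per_question)


# Hypothetical judgments: one list per question, ordered by system rank.
rankings = [
    [False, True, False, True],
    [True, False],
    [False, False],
]
mrr_at_10 = mean_over_questions(rankings, reciprocal_rank_at_k)
map_at_10 = mean_over_questions(rankings, average_precision_at_k)
print(f"MRR@10={mrr_at_10:.3f} MAP@10={map_at_10:.3f}")
```

Questions with no relevant answer in the top k contribute zero to both averages, which is one reason such scores can look low even for a strong system.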
  • 11. “Overview of the MEDIQA 2019 Shared Task on Textual Inference, Question Entailment and Question Answering”. Asma Ben Abacha, Chaitanya Shivade & Dina Demner-Fushman. ACL-BioNLP 2019. Three tasks: NLI, RQE & QA. 72 participating teams. 20 published papers.
  • 12. MEDIQA 2019: (1) Confirmed the added value of using textual inference and question entailment in QA. (2) Highlighted the strength of multi-task learning, transfer learning, and data augmentation methods. (3) Showed/documented the power of new architectures and models such as MT-DNN in medical QA. MEDIQA post-challenge round: submission open.
  • 13. 3) VQA-Med @ ImageCLEF 2019. Four categories of questions: Modality, Plane, Abnormality & Organ. Training, validation, and test sets created automatically. Training set: 3,200 radiology images and 12,792 question-answer pairs. “VQA-Med: Overview of the Medical Visual Question Answering Task at ImageCLEF 2019”. Ben Abacha, Hasan, Datla, Liu, Demner-Fushman & Müller. CLEF 2019.
  • 14. VQA-Med @ ImageCLEF 2019. The best team achieved 0.624 accuracy and 0.644 BLEU score. Best methods: transfer learning, multi-task learning, ensemble methods, and hybrid approaches combining classification models and answer generation methods. “VQA-Med: Overview of the Medical Visual Question Answering Task at ImageCLEF 2019”. Ben Abacha, Hasan, Datla, Liu, Demner-Fushman & Müller. CLEF 2019.
  • 15. 2) VQA-MED @ IMAGECLEF 2020. Two tasks: 1. Visual Question Answering (VQA); 2. Visual Question Generation (VQG). Datasets: VQA training set: 4,000 images and 4,000 QA pairs; VQG training set: 780 images and 2,156 questions. 11 teams submitted runs (the highest participation in ImageCLEF 2020). Overview of the VQA-Med Task at ImageCLEF 2020: Visual Question Answering and Generation in the Medical Domain. Asma Ben Abacha, Sadid A. Hasan, Vivek V. Datla, Dina Demner-Fushman & Henning Müller. CLEF 2020.
  • 16. VQA-MED @ IMAGECLEF 2020. State-of-the-art models in NLP and Computer Vision have been applied, along with new approaches focusing on abnormality questions and images. The results are exciting given the potential applications of the VQA and VQG tasks in the medical domain.
  • 18. Evaluation Metrics. VQA: Manual evaluation of the accuracy of automatic answers vs. BLEU score. A dataset of clinically generated visual questions and answers about radiology images. Jason J. Lau, Soumya Gayen, Asma Ben Abacha & Dina Demner-Fushman. Scientific Data, Nature, 2018. Closed-ended Questions:
  • 20. Manual vs. Automatic Evaluation of Summarization Methods. A major issue remains the less reliable ranking provided by existing evaluation measures for text generation tasks. “On the Summarization of Consumer Health Questions”. Ben Abacha & Demner-Fushman. ACL 2019.
  • 21. CHALLENGES & SHARED TASKS/EVALUATIONS
      Several benefits:
      - Discovering the range of potentially hidden obstacles and putting systems’ performance in context.
      - Creating useful gold standard corpora, training data, pilot studies, and use cases (with the collaboration of medical experts in our case).
      - Building strong research communities.
      Potential future research investigations:
      - Efficient/suitable evaluation methods and metrics.
      - Important, newly identified subtasks that need to be resolved first.
      - “Better” training and testing datasets.
      - Need for more multi-disciplinary efforts and more support and participation in workshops and conferences such as TREC and CLEF.
  • 22. Thank you for your Attention! asma.benabacha@nih.gov asma.benabacha@gmail.com @AsmaBenAbacha
      References
      1. Asma Ben Abacha & Pierre Zweigenbaum. MEANS: a Medical Question-Answering System Combining NLP Techniques and Semantic Web Technologies. Information Processing & Management Journal, Elsevier, 2015.
      2. Asma Ben Abacha & Dina Demner-Fushman. A Question-Entailment Approach to Question Answering. BMC Bioinformatics 2019.
      3. Asma Ben Abacha, Chaitanya Shivade & Dina Demner-Fushman. Overview of the MEDIQA 2019 Shared Task on Textual Inference, Question Entailment and Question Answering. ACL-BioNLP 2019.
      4. Asma Ben Abacha & Dina Demner-Fushman. On the Summarization of Consumer Health Questions. ACL 2019.
      5. Asma Ben Abacha & Dina Demner-Fushman. On the Role of Question Summarization and Information Source Restriction in Consumer Health Question Answering. AMIA Informatics Summit 2019.
      6. Asma Ben Abacha, Eugene Agichtein, Yuval Pinter & Dina Demner-Fushman. Overview of the Medical QA Task @ TREC 2017 LiveQA Track. TREC 2017.
      7. Asma Ben Abacha & Dina Demner-Fushman. Recognizing Question Entailment for Medical Question Answering. AMIA 2016.
      8. Asma Ben Abacha, Yassine Mrabet, Mark Sharp, Travis Goodwin, Sonya E. Shooshan & Dina Demner-Fushman. Bridging the Gap between Consumers’ Medication Questions and Trusted Answers. MEDINFO 2019.
      9. Jason J. Lau, Soumya Gayen, Asma Ben Abacha & Dina Demner-Fushman. A dataset of clinically generated visual questions and answers about radiology images. Scientific Data, Nature, 2018.
      10. Asma Ben Abacha, Sadid A. Hasan, Vivek V. Datla, Joey Liu, Dina Demner-Fushman & Henning Müller. VQA-Med: Overview of the Medical Visual Question Answering Task at ImageCLEF 2019. CLEF 2019.
      11. Asma Ben Abacha, Sadid A. Hasan, Vivek V. Datla, Dina Demner-Fushman & Henning Müller. Overview of the VQA-Med Task at ImageCLEF 2020: Visual Question Answering and Generation in the Medical Domain. CLEF 2020.
      12. Mourad Sarrouti, Asma Ben Abacha & Dina Demner-Fushman. Visual Question Generation from Radiology Images. ACL-ALVR 2020.