Yifan Peng1, Xiaosong Wang2, Le Lu2, Mohammadhadi Bagheri2,
Ronald Summers2, Zhiyong Lu1
1 National Center for Biotechnology Information, NLM, NIH
2 Clinical Center, NIH
Twitter: #AMIA2018
NegBio: a high-performance tool for
negation and uncertainty detection in
radiology reports
Oral Presentations – Imaging, S41
All Start with Data
• The availability of well-labeled data is key to large-scale machine learning, e.g. deep learning
• Hospitals have accumulated large numbers of raw radiology images and reports
• Conventional ways of collecting image labels are NOT applicable:
  • security and privacy issues
  • labeling requires comprehension of domain-specific medical knowledge
[Figure: large-scale natural image datasets vs. a large-scale medical image dataset]
AMIA 2018 | amia.org
Overview
Mining image labels via NLP for multi-label pathology classification
[Figure: overview. Text branch: NE recognizer (MetaMap) -> negative/equivocal detection -> labels. Image branch: one of the ImageNet pre-trained models (AlexNet, GoogLeNet, VggNet, ResNet) -> transition layer -> pooling layer (MAX, LSE, or AVE) -> prediction layer.]
A Sample Entry
Image: [chest X-ray image]
Report:
Findings: pa and lateral views of the chest demonstrate significantly improved bilateral lower lung field interstitial markings compatible with linear atelectasis. unchanged right 9th rib fracture peripherally. unchanged ossification left coracoacromial ligament. the cardiac and mediastinal contours are stable.
Impression: improved bilateral lower lung field linear atelectasis.
Label: Atelectasis
14 Common Thorax Diseases
• Atelectasis
• Cardiomegaly
• Consolidation
• Edema
• Effusion
• Emphysema
• Fibrosis
• Hernia
• Infiltration
• Mass
• Nodule
• Pleural Thickening
• Pneumonia
• Pneumothorax
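Because one report can mention several of these diseases, each image carries a set of labels rather than a single class. A minimal sketch of one common way to represent such a set is a 14-element 0/1 vector. The names `LABELS`, `INDEX`, and `encode` are hypothetical and chosen only for illustration; the slides do not say how the labels are stored.

```python
# The 14 disease labels, in the order listed on the slide.
LABELS = [
    "Atelectasis", "Cardiomegaly", "Consolidation", "Edema", "Effusion",
    "Emphysema", "Fibrosis", "Hernia", "Infiltration", "Mass", "Nodule",
    "Pleural Thickening", "Pneumonia", "Pneumothorax",
]
INDEX = {name: i for i, name in enumerate(LABELS)}


def encode(findings):
    """Return a 0/1 vector with a 1 at the position of each finding.

    Hypothetical helper for illustration only.
    """
    vector = [0] * len(LABELS)
    for name in findings:
        vector[INDEX[name]] = 1
    return vector


print(encode(["Atelectasis"]))
print(encode(["Mass", "Effusion"]))
```

An empty list encodes to an all-zero vector, which would correspond to a report with no positive finding among the 14.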
Challenges
Findings mentioned in a radiology report may be negative or equivocal, so a
mention alone does not establish that the finding is present
Findings: right internal jugular catheter remains in place. Large metastatic lung mass
in the lateral left upper lobe is again noted. No infiltrate or effusion. Extensive
surgical clips again noted left axilla.
Impression: no significant change.
Reason for exam (entered by ordering clinician into cris): bilateral pneumonia no
change in the tracheostomy tube or right internal jugular venous catheter. Unchanged
bilateral alveolar infiltrates, fluid in the right minor fissure, lucency at the right
costophrenic angle suggesting pneumonia. Overall, no significant change
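To see why this matters, the sketch below applies plain keyword matching to the first example report. The keyword table and the function `naive_labels` are hypothetical and only illustrate the failure mode; they are not part of NegBio.

```python
import re

# First example report from the slide (Findings section).
REPORT = (
    "right internal jugular catheter remains in place. Large metastatic lung mass "
    "in the lateral left upper lobe is again noted. No infiltrate or effusion. "
    "Extensive surgical clips again noted left axilla."
)

# Hypothetical keyword-to-label table, for illustration only.
KEYWORDS = {"mass": "Mass", "infiltrate": "Infiltration", "effusion": "Effusion"}


def naive_labels(text):
    """Label a report by keyword presence alone, ignoring negation."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(label for keyword, label in KEYWORDS.items() if keyword in words)


# Infiltration and Effusion are returned even though the report negates both.
print(naive_labels(REPORT))  # ['Effusion', 'Infiltration', 'Mass']
```

Only "Mass" is actually affirmed in the report; the other two labels are false positives caused by the sentence "No infiltrate or effusion."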
Related Work
Chapman W, et al. A simple algorithm for identifying negated findings and diseases in discharge summaries. Journal of Biomedical Informatics. 2001;34:301-310.
Harkema H, et al. ConText: an algorithm for determining negation, experiencer, and temporal status from clinical reports. Journal of Biomedical Informatics. 2009;42:839-851.
Mutalik P, et al. Use of general-purpose negation detection to augment concept indexing of medical documents: a quantitative study using the UMLS. Journal of the American Medical Informatics Association. 2001;8:598-609.
Sohn S, Wu S, Chute C. Dependency parser-based negation detection in clinical narratives. AMIA Summits on Translational Science Proceedings. 2012;2012:1-8.
Mehrabi S, et al. DEEPEN: A negation detection system for clinical text incorporating dependency relation into NegEx. Journal of Biomedical Informatics. 2015;54:213-219.
Ogren P, et al. Constructing evaluation corpora for automated clinical named entity recognition. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08). 2008;28-30.
Uzuner Ö, South B, et al. 2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text. Journal of the American Medical Informatics Association. 2011;18:552-556.
Suominen H, et al. Overview of the ShARe/CLEF eHealth evaluation lab 2013. In International Conference of the Cross-Language Evaluation Forum for European Languages. 2013;212-231.
Albright D, et al. Towards comprehensive syntactic and semantic annotations of the clinical narrative. Journal of the American Medical Informatics Association. 2013;20:922-930.
etc.
Our overall method
1. Use MetaMap (Aronson et al. 2010) to map every keyword mention in a report to a unique concept ID in the Systematized Nomenclature of Medicine Clinical Terms (SNOMED-CT)
2. Remove negative and equivocal findings from the radiology report
   a. Tokenize (NLTK)
   b. Dependency parse (Bllip/Stanford)
   c. Apply rules
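The control flow of these steps can be sketched as below. This is a skeleton under loud assumptions, not NegBio's code: `TOY_CONCEPTS` is a toy lookup standing in for MetaMap, `split_sentences` is a regular expression standing in for NLTK, and `is_negated_or_uncertain` is a crude cue-word test standing in for the dependency parse and rules. All names are hypothetical.

```python
import re

# Stand-in for MetaMap: a toy lookup table (hypothetical, not SNOMED-CT IDs).
TOY_CONCEPTS = {
    "atelectasis": "Atelectasis",
    "effusion": "Effusion",
    "pneumothorax": "Pneumothorax",
}


def split_sentences(report):
    """Stand-in for NLTK sentence splitting."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", report) if s.strip()]


def recognize(sentence):
    """Step 1: find finding mentions in one sentence."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return [(tok, TOY_CONCEPTS[tok]) for tok in tokens if tok in TOY_CONCEPTS]


def is_negated_or_uncertain(sentence, mention):
    """Step 2 placeholder.

    The slides parse the sentence and apply graph rules here; this crude
    sentence-level cue test ignores `mention` and only keeps the sketch runnable.
    """
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return any(cue in tokens for cue in ("no", "without", "possible"))


def label_report(report):
    """Keep only findings that are neither negated nor uncertain."""
    labels = set()
    for sentence in split_sentences(report):
        for mention, concept in recognize(sentence):
            if not is_negated_or_uncertain(sentence, mention):
                labels.add(concept)
    return sorted(labels)


print(label_report(
    "Improved bilateral lower lung field linear atelectasis. "
    "No pneumothorax or effusion."
))  # ['Atelectasis']
```

Keeping the negation test behind a single function makes it easy to replace the placeholder with a parser-based check without touching the rest of the loop.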
Negation and Uncertainty detection
Utilize the universal dependency graph to define patterns
• a directed graph
• vertices are words or phrases labeled with information such as part-of-speech and lemma
• edges represent typed dependencies from the governor to its dependent and are labeled with the dependency type
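A minimal sketch of such a graph as a data structure follows. The classes `Vertex` and `DependencyGraph` are hypothetical, and the example graph for "No effusion is seen" is built by hand under an assumed analysis; it is not the output of the Bllip or Stanford tools.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    text: str   # surface form
    lemma: str  # base form
    pos: str    # part-of-speech tag


@dataclass
class DependencyGraph:
    vertices: list = field(default_factory=list)
    # Each edge is (governor index, dependent index, dependency type).
    edges: list = field(default_factory=list)

    def add_vertex(self, text, lemma, pos):
        self.vertices.append(Vertex(text, lemma, pos))
        return len(self.vertices) - 1

    def add_edge(self, governor, dependent, dep_type):
        self.edges.append((governor, dependent, dep_type))

    def dependents(self, governor):
        """Return (dependent index, dependency type) pairs for a governor."""
        return [(d, t) for g, d, t in self.edges if g == governor]


# Hand-built graph for "No effusion is seen" (assumed analysis, not parser output).
graph = DependencyGraph()
no = graph.add_vertex("No", "no", "DT")
effusion = graph.add_vertex("effusion", "effusion", "NN")
is_ = graph.add_vertex("is", "be", "VBZ")
seen = graph.add_vertex("seen", "see", "VBN")
graph.add_edge(effusion, no, "neg")
graph.add_edge(seen, effusion, "nsubjpass")
graph.add_edge(seen, is_, "auxpass")

for dependent, dep_type in graph.dependents(seen):
    print(dep_type, graph.vertices[dependent].text)
```

Storing edges as governor-to-dependent triples keeps both the label and the direction available, which is what pattern matching on the graph needs.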
Sample rules
• Rules are defined on the dependency graph using the dependency labels and edge directions
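The sketch below shows what a rule that uses both the label and the direction of an edge might look like. Rule A and Rule B are hypothetical examples written for this illustration, not NegBio's actual rules, and the edge lists are hand-written rather than produced by a parser.

```python
# Each edge is (governor lemma, dependency type, dependent lemma).
def classify(finding, edges):
    """Return 'negative', 'uncertain', or 'positive' for a finding lemma."""
    for governor, dep_type, dependent in edges:
        # Rule A (hypothetical): the finding governs a negation word.
        if governor == finding and dep_type == "neg" and dependent in {"no", "not"}:
            return "negative"
        # Rule B (hypothetical): the finding is the object of a hedging verb.
        if dependent == finding and dep_type == "dobj" and governor in {"suggest", "suspect"}:
            return "uncertain"
    return "positive"


# "No effusion is seen"
no_effusion = [("effusion", "neg", "no"), ("see", "nsubjpass", "effusion")]
# "lucency ... suggesting pneumonia"
suggesting = [("suggest", "dobj", "pneumonia"), ("lucency", "acl", "suggest")]
# "Large mass is noted"
mass_noted = [("note", "nsubjpass", "mass"), ("mass", "amod", "large")]

print(classify("effusion", no_effusion))  # negative
print(classify("pneumonia", suggesting))  # uncertain
print(classify("mass", mass_noted))       # positive
```

Direction matters in both rules: Rule A requires the finding to be the governor of the edge, while Rule B requires it to be the dependent.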
Experiments
• Experiments on corpora with positive findings annotated
  • OpenI: 3,851 reports, 1,354 findings
  • ChestX-ray: 900 reports, 2,131 findings
• Experiments on corpora with negative findings annotated
  • BioScope: 977 reports, 466 negative scopes
  • PK: 116 reports, 491 negative phrases
Results
P = precision, R = recall, F = F-score (%)

Corpora with positive findings annotated
                     OpenI                ChestX-ray
                   P     R     F        P     R     F
MetaMap+NegEx    77.2  84.6  80.7     82.8  95.5  88.7
MetaMap+NegBio   89.8  85.0  87.3     94.4  94.4  94.4

Corpora with negative findings annotated
                     BioScope             PK
                   P     R     F        P     R     F
NegEx            70.6  98.7  82.3     95.1  91.2  93.1
NegBio           96.1  95.7  95.9     98.4  88.6  93.3
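As a sanity check on the F columns, the sketch below recomputes F from the reported P and R. It assumes F is the balanced F1 score (the harmonic mean of precision and recall); the slides do not state which F-measure is used. The function `f_score` and the dictionary `ROWS` are names introduced here for illustration.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F) copied from the results tables.
ROWS = {
    "OpenI MetaMap+NegEx": (77.2, 84.6, 80.7),
    "OpenI MetaMap+NegBio": (89.8, 85.0, 87.3),
    "ChestX-ray MetaMap+NegEx": (82.8, 95.5, 88.7),
    "ChestX-ray MetaMap+NegBio": (94.4, 94.4, 94.4),
    "BioScope NegEx": (70.6, 98.7, 82.3),
    "BioScope NegBio": (96.1, 95.7, 95.9),
    "PK NegEx": (95.1, 91.2, 93.1),
    "PK NegBio": (98.4, 88.6, 93.3),
}

for name, (p, r, reported) in ROWS.items():
    print(f"{name}: recomputed {f_score(p, r):.2f}, reported {reported}")
```

Because P and R are themselves rounded to one decimal place, a recomputed value can differ from the reported F in the last digit.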
NIH Chest X-ray Dataset
One of the largest chest X-ray datasets publicly available to the scientific community
• 112,120 frontal-view X-ray images
• 30,805 unique patients
https://nihcc.app.box.com/v/ChestXray-NIHCC
NegBio is an open source tool
https://github.com/ncbi-nlp/NegBio
Overview
Mining image labels via NLP for multi-label pathology classification
Multi-label Classification and Localization
Wang X, Peng Y, Lu L, Bagheri M, Lu Z, Summers R. ChestX-ray8: Hospital-scale Chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2017;2097-2106.
Wang X*, Peng Y*, Lu L, Lu Z, Summers R. TieNet: Text-Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-rays. IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2018.
Conclusion and Future work
• We propose an algorithm, NegBio, to determine negative and uncertain
findings in radiology reports.
• We evaluated NegBio on three publicly available corpora and a newly
constructed corpus. We showed that NegBio achieved a significant
improvement on all datasets over the state of the art.
• We made NegBio an open source tool.
Future work
• To explore NegBio’s applicability in clinical texts beyond radiology reports.
Acknowledgment
This work was supported by the Intramural Research Programs of the National
Institutes of Health, National Library of Medicine and Clinical Center.
We thank the authors of NegEx and MetaMap for making their software tools publicly
available, and Drs. Dina Demner-Fushman and Willie J Rogers for helpful discussions.
Thank you!
yifan.peng@nih.gov
https://github.com/ncbi-nlp/NegBio

ML Based Model for NIDS MSc Updated Presentation.v2.pptx
 
Casting-Defect-inSlab continuous casting.pdf
Casting-Defect-inSlab continuous casting.pdfCasting-Defect-inSlab continuous casting.pdf
Casting-Defect-inSlab continuous casting.pdf
 
The Python for beginners. This is an advance computer language.
The Python for beginners. This is an advance computer language.The Python for beginners. This is an advance computer language.
The Python for beginners. This is an advance computer language.
 
Heat Resistant Concrete Presentation ppt
Heat Resistant Concrete Presentation pptHeat Resistant Concrete Presentation ppt
Heat Resistant Concrete Presentation ppt
 
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
 
132/33KV substation case study Presentation
132/33KV substation case study Presentation132/33KV substation case study Presentation
132/33KV substation case study Presentation
 
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
5214-1693458878915-Unit 6 2023 to 2024 academic year assignment (AutoRecovere...
 
Embedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoringEmbedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoring
 

NegBio: a high-performance tool for negation and uncertainty detection in radiology reports

  • 1. NegBio: a high-performance tool for negation and uncertainty detection in radiology reports. Yifan Peng (1), Xiaosong Wang (2), Le Lu (2), Mohammadhadi Bagheri (2), Ronald Summers (2), Zhiyong Lu (1). (1) National Center for Biotechnology Information, NLM, NIH; (2) Clinical Center, NIH. Oral Presentations – Imaging, S41. Twitter: #AMIA2018
  • 2. All Start with Data. The availability of well-labeled data is the key to large-scale machine learning, e.g. deep learning. Hospitals have accumulated a large number of raw radiology images and reports. Conventional ways of collecting image labels are NOT applicable: there are security and privacy issues, and labeling requires comprehension of domain-specific medical knowledge. [Figure: large-scale natural image datasets and a large-scale medical image dataset] AMIA 2018 | amia.org
  • 3. Overview: Mining image labels via NLP for multi-label pathology classification. [Figure components: reports, an NE recognizer (MetaMap), negative/equivocal detection, and labels on the text side; images, one of the ImageNet pre-trained models (AlexNet, GoogLeNet, VggNet, ResNet), a transition layer, a pooling layer (MAX, LSE, AVE), and weights from the prediction layer on the image side]
  • 4. A Sample Entry (image, report, label). Report: Findings: pa and lateral views of the chest demonstrate significantly improved bilateral lower lung field interstitial markings compatible with linear atelectasis. unchanged right 9th rib fracture peripherally. unchanged ossification left coracoacromial ligament. the cardiac and mediastinal contours are stable. Impression: improved bilateral lower lung field linear atelectasis. Label: Atelectasis.
  • 5. 14 Common Thorax Diseases: Atelectasis, Cardiomegaly, Consolidation, Edema, Effusion, Emphysema, Fibrosis, Hernia, Infiltration, Mass, Nodule, Pleural Thickening, Pneumonia, Pneumothorax.
  • 6. Challenges: Negative and equivocal findings may indicate the absence of findings mentioned within the radiology report. Sample report text: Findings: right internal jugular catheter remains in place. Large metastatic lung mass in the lateral left upper lobe is again noted. No infiltrate or effusion. Extensive surgical clips again noted left axilla. Impression: no significant change. Reason for exam (entered by ordering clinician into cris): bilateral pneumonia no change in the tracheostomy tube or right internal jugular venous catheter. Unchanged bilateral alveolar infiltrates, fluid in the right minor fissure, lucency at the right costophrenic angle suggesting pneumonia. Overall, no significant change.
  • 7. Related Work: Chapman W, et al. A simple algorithm for identifying negated findings and diseases in discharge summaries. Journal of Biomedical Informatics. 2001;34:301-310. Harkema H, et al. ConText: an algorithm for determining negation, experiencer, and temporal status from clinical reports. Journal of Biomedical Informatics. 2009;42:839-851. Mutalik P, et al. Use of general-purpose negation detection to augment concept indexing of medical documents: a quantitative study using the UMLS. Journal of the American Medical Informatics Association. 2001;8:598-609. Sohn S, Wu S, Chute C. Dependency parser-based negation detection in clinical narratives. AMIA Summits on Translational Science Proceedings. 2012;2012:1-8. Mehrabi S, et al. DEEPEN: A negation detection system for clinical text incorporating dependency relation into NegEx. Journal of Biomedical Informatics. 2015;54:213-219.
  • 8. Related Work: Ogren P, et al. Constructing evaluation corpora for automated clinical named entity recognition. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08). 2008;28-30. Uzuner Ö, South B, et al. 2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text. Journal of the American Medical Informatics Association. 2011;18:552-556. Suominen H, et al. Overview of the ShARe/CLEF eHealth evaluation lab 2013. In International Conference of the Cross-Language Evaluation Forum for European Languages. 2013;212-231. Albright D, et al. Towards comprehensive syntactic and semantic annotations of the clinical narrative. Journal of the American Medical Informatics Association. 2013;20:922-930. etc.
  • 9. Our overall method: (1) MetaMap (Aronson et al. 2010) was used to map every mention of keywords in a report to a unique concept ID in the Systematized Nomenclature of Medicine Clinical Terms (SNOMED-CT). (2) Negative and equivocal findings were removed from the radiology report in three steps: tokenize (NLTK), dependency parse (Bllip/Stanford), and apply rules. [Figure: NE recognizer (MetaMap), tokenize (NLTK), dependency parse (Bllip/Stanford), apply rules, labels]
  • 10. Negation and Uncertainty detection: Utilize the universal dependency graph to define patterns. The graph is a directed graph; vertices are words or phrases labeled with information such as the part-of-speech and the lemma; edges represent typed dependencies from the governor to its dependent and are labeled with the dependency type.
  • 11. Sample rules: Rules are defined on the dependency graphs by utilizing the dependency label and direction information.
  • 12. Experiments. Corpora with positive findings annotated: OpenI (3,851 reports, 1,354 findings) and ChestX-ray (900 reports, 2,131 findings). Corpora with negative findings annotated: BioScope (977 reports, 466 negative scopes) and PK (116 reports, 491 negative phrases).
  • 13. Results (P = precision, R = recall, F = F-score)
                        OpenI               ChestX-ray
                        P     R     F       P     R     F
      MetaMap+NegEx     77.2  84.6  80.7    82.8  95.5  88.7
      MetaMap+NegBio    89.8  85.0  87.3    94.4  94.4  94.4

                        BioScope            PK
                        P     R     F       P     R     F
      NegEx             70.6  98.7  82.3    95.1  91.2  93.1
      NegBio            96.1  95.7  95.9    98.4  88.6  93.3
  • 14. NIH Chest X-ray Dataset: One of the largest chest X-ray datasets publicly available to the scientific community. 112,120 frontal-view X-ray images; 30,805 unique patients. https://nihcc.app.box.com/v/ChestXray-NIHCC
  • 15. NegBio is an open source tool: https://github.com/ncbi-nlp/NegBio
  • 16. Overview: Mining image labels via NLP for multi-label pathology classification. [Figure components: reports, an NE recognizer (MetaMap), negative/equivocal detection, and labels on the text side; images, one of the ImageNet pre-trained models (AlexNet, GoogLeNet, VggNet, ResNet), a transition layer, a pooling layer (MAX, LSE, AVE), and weights from the prediction layer on the image side]
  • 17. Multi-label Classification and Localization. Wang X, Peng Y, Lu L, Bagheri M, Lu Z, Summers R. ChestX-ray8: Hospital-scale Chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2017, 2097-2106. Wang X*, Peng Y*, Lu L, Lu Z, Summers R. TieNet: Text-Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-rays. IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2018.
  • 18. Conclusion and Future work. Conclusion: We propose an algorithm, NegBio, to determine negative and uncertain findings in radiology reports. We evaluated NegBio on three publicly available corpora and a newly constructed corpus. We showed that NegBio achieved a significant improvement on all datasets over the state of the art. We made NegBio an open source tool. Future work: To explore NegBio’s applicability in clinical texts beyond radiology reports.
  • 19. Acknowledgment: This work was supported by the Intramural Research Programs of the National Institutes of Health, National Library of Medicine and Clinical Center. We thank the authors of NegEx and MetaMap for making their software tools publicly available, and Drs. Dina Demner-Fushman and Willie J Rogers for the helpful discussion.
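
The dependency-graph idea from slides 10 and 11 can be illustrated with a small sketch. This is not NegBio's implementation: the `Token` and `DepGraph` classes, the hand-built parses, the trigger-word sets, and the three rules are all assumptions made here for illustration. In the actual pipeline the graph would come from a dependency parser rather than being typed in by hand.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Token:
    index: int
    word: str
    lemma: str
    pos: str


@dataclass
class DepGraph:
    """Directed graph: vertices are tokens, edges are (governor, dependent, label)."""
    tokens: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def dependents(self, governor, labels=None):
        return [self.tokens[d] for g, d, lab in self.edges
                if g == governor and (labels is None or lab in labels)]

    def governors(self, dependent, labels=None):
        return [self.tokens[g] for g, d, lab in self.edges
                if d == dependent and (labels is None or lab in labels)]


def build_graph(tokens, edges):
    """tokens: list of (word, lemma, pos), numbered from 1; edges: (gov, dep, label)."""
    graph = DepGraph()
    for index, (word, lemma, pos) in enumerate(tokens, start=1):
        graph.tokens[index] = Token(index, word, lemma, pos)
    graph.edges.extend(edges)
    return graph


# Hypothetical trigger words, chosen only for this example.
NEGATION_LEMMAS = {"no"}
UNCERTAINTY_LEMMAS = {"suggest", "suspect"}


def classify(graph, finding):
    """Return 'negative', 'uncertain', or 'positive' for the token at index `finding`."""
    # Rule A (assumed): a negation word hangs off the finding via det/neg.
    if any(t.lemma in NEGATION_LEMMAS
           for t in graph.dependents(finding, {"det", "neg"})):
        return "negative"
    # Rule B (assumed): a conjunct inherits negation from its governing conjunct.
    for gov in graph.governors(finding, {"conj"}):
        if classify(graph, gov.index) == "negative":
            return "negative"
    # Rule C (assumed): the finding is the object of an uncertainty verb.
    if any(t.lemma in UNCERTAINTY_LEMMAS
           for t in graph.governors(finding, {"obj", "dobj"})):
        return "uncertain"
    return "positive"


# "No infiltrate or effusion." (hand-written parse)
neg_graph = build_graph(
    [("No", "no", "DET"), ("infiltrate", "infiltrate", "NOUN"),
     ("or", "or", "CCONJ"), ("effusion", "effusion", "NOUN")],
    [(2, 1, "det"), (2, 4, "conj"), (4, 3, "cc")],
)

# "lucency ... suggesting pneumonia" (shortened, hand-written parse)
unc_graph = build_graph(
    [("lucency", "lucency", "NOUN"), ("suggesting", "suggest", "VERB"),
     ("pneumonia", "pneumonia", "NOUN")],
    [(1, 2, "acl"), (2, 3, "obj")],
)

for graph, index in [(neg_graph, 2), (neg_graph, 4), (unc_graph, 3), (unc_graph, 1)]:
    print(graph.tokens[index].word, "->", classify(graph, index))
```

Because each rule looks at edge labels and direction rather than word order, Rule B covers "effusion" in the first example without a separate pattern for every coordinated phrase.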

Editor's Notes

  1. The motivation of this project is straightforward. In general computer vision, we have seen extensive use of neural networks and deep learning techniques on different image processing tasks, such as image classification, object detection and caption generation. But we rarely see computer vision applications of deep learning in the clinical domain. The reason is probably that we do not have a large-scale medical image dataset to satisfy the data needs of deep learning. For natural images, we can use crowdsourcing, but this is not applicable to X-ray images because of security and privacy issues. Labeling X-rays also usually requires domain knowledge. Although hospitals have accumulated a large number of raw radiology images and reports, generating labels for a large-scale dataset remains challenging. In this project, we provide a text-mining method to automatically generate labels from radiology reports, and we show that we can successfully train deep learning models using this dataset.
  2. The figure shows the overview of our approach. We have raw images and reports from Picture Archiving and Communication Systems. We mined the labels from the reports. We used the labeled images to train deep learning models for multi-label classification. In this talk, I will focus on the first step of how we constructed the labels.
  3. So the goal on my side is to find diseases and findings in the clinical report.
  4. Including atelectasis, we mainly focus on 14 diseases, such as mass, nodule, and effusion. These 14 finding types are the most common in our institute and were selected by radiologists from a clinical perspective.
  5. Unlike other text, clinical text contains many negative or equivocal findings. Negative findings are findings that were ruled out by the radiologist, such as “no XXX”. Equivocal findings are findings that the radiologist is suspicious of, such as “suggesting obstructive lung disease”. Since they may indicate the absence of findings mentioned within the radiology report, identifying them is as important as identifying positive findings. Otherwise, information extraction algorithms that do not distinguish negative and equivocal findings from positive ones may return many irrelevant results. Even though many natural language processing applications developed in recent years successfully extract findings mentioned in medical reports, discriminating between positive, negative, and equivocal findings remains challenging.
  6. We use a two-pass approach to achieve this. In the first pass, we use named-entity recognition tools to detect the findings in the report and normalize them to a unique ID in SNOMED. MetaMap is a knowledge-intensive, rule-based approach to mapping biomedical text to the UMLS Metathesaurus. DNorm is a machine learning method, developed by our group, for disease recognition and normalization. Then, in the second pass, we remove negative and equivocal findings from the reports.
  7. The motivation for using the dependency graph is that fewer rules can capture more text variants.
  8. These are several rules that are frequently matched in the text.
  9. These corpora were used to test the performance of NegBio. OpenI is one of the largest corpora in which positive findings are annotated.
  10. We can detect and remove more negative cases. As a result, the precision for positive finding detection increases.
  11. The NIH Clinical Center recently released over 100,000 anonymized chest X-ray images and their corresponding data to the scientific community. We hope the release will allow researchers across the country and around the world to freely access the datasets and increase their ability to teach computers how to detect and diagnose disease.
  12. The NIH Clinical Center recently released over 100,000 anonymized chest X-ray images and their corresponding data to the scientific community. We hope the release will allow researchers across the country and around the world to freely access the datasets and increase their ability to teach computers how to detect and diagnose disease.
  13. The figure shows the overview of our approach. We have raw images and reports from Picture Archiving and Communication Systems. We mined the labels from the reports. We used the labeled images to train deep learning models for multi-label classification. In this talk, I will focus on the first step of how we constructed the labels.
  14. We hope this work can serve as a baseline. We hope the release will allow researchers across the country and around the world to freely access the datasets and will increase their ability to teach computers how to detect and diagnose disease.
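
The F columns on the Results slide appear consistent with the standard F1 score, the harmonic mean of precision and recall. The sketch below recomputes F from the reported P and R as a sanity check. The function name `f1_score`, the `reported` dictionary, and the 0.1 tolerance (to allow for P and R having been rounded on the slide) are choices made here, not part of the original talk.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (corpus, system) -> (P, R, F) as listed on the Results slide.
reported = {
    ("OpenI", "MetaMap+NegEx"): (77.2, 84.6, 80.7),
    ("OpenI", "MetaMap+NegBio"): (89.8, 85.0, 87.3),
    ("ChestX-ray", "MetaMap+NegEx"): (82.8, 95.5, 88.7),
    ("ChestX-ray", "MetaMap+NegBio"): (94.4, 94.4, 94.4),
    ("BioScope", "NegEx"): (70.6, 98.7, 82.3),
    ("BioScope", "NegBio"): (96.1, 95.7, 95.9),
    ("PK", "NegEx"): (95.1, 91.2, 93.1),
    ("PK", "NegBio"): (98.4, 88.6, 93.3),
}


def max_deviation(rows):
    """Largest absolute gap between the reported F and the recomputed F1."""
    return max(abs(f1_score(p, r) - f) for p, r, f in rows.values())


for (corpus, system), (p, r, f) in reported.items():
    print(f"{corpus:11s} {system:15s} reported F={f:.1f} recomputed F={f1_score(p, r):.2f}")

print("within tolerance:", max_deviation(reported) <= 0.1)
```

Read this way, the gain on OpenI comes almost entirely from precision (77.2 to 89.8) while recall stays nearly flat (84.6 to 85.0), which matches the note that removing more negative cases raises precision for positive findings.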