Medical Text Classification using Convolutional Neural Network


This document discusses using convolutional neural networks for medical text classification. It presents an approach using CNNs to classify sentences from clinical notes into categories. The model is trained on word embeddings from clinical papers and evaluated on labeled data from the Merck Manual. The CNN approach achieves better accuracy than baseline methods using doc2vec, mean word embeddings, and bag-of-words features with an SVM. Future work could include expanding the training data and applying the models to tasks like patient note classification and learning patient representations from their records.


Medical Text Classification using Convolutional Neural Networks
Mark Hughes, Irene Li, Spyros Kotoulas and Toyotaro Suzumura
26 April 2017
Informatics for Health
IBM Research Ireland
Japan Science and Technology Agency, Tokyo, Japan
IBM TJ Watson Research Center, New York, USA

Motivation: Medical Text Classification
IBM Watson Smart Notes Project
Example clinical note: (A 75-y-o woman) with sudden onset back pain last night while lifting turkey from oven. The pain is worse with movement or deep breath, better with rest. No symptoms in legs, no fever or chills. No chest pain, cough, wheezing, abdominal pain, headache… Married. Two children. No smoking.
Unstructured clinical notes:
• Various topics
• Messy
• Irrelevant
Search for information related to particular illnesses: sentence-level classification

State-of-the-Art Representations in NLP
Distributed Representations: dense vectors
• Embedding Models: Word2vec [1], Doc2vec [2,3]
• Visualization Example:
– Semantically clustered
– Unsupervised learning
– Large training corpus
[1] Mikolov et al. "Distributed Representations of Words and Phrases and their Compositionality." (2013).
[2] Le, Quoc V., and Tomas Mikolov. "Distributed Representations of Sentences and Documents." (2014).
[3] Gensim: https://radimrehurek.com/gensim/models/doc2vec.html
[4] Dai, Andrew M., Christopher Olah, and Quoc V. Le. "Document embedding with paragraph vectors." (2015).
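As a rough illustration of what "dense vectors" that are "semantically clustered" means, the sketch below compares toy word vectors by cosine similarity. The three-dimensional vectors and the names `toy_embeddings`, `cosine_similarity` and `most_similar` are invented for this example; real Word2vec embeddings are learned from a large corpus and have far more dimensions.

```python
import math

# Toy 3-dimensional embeddings; the values are invented for illustration only.
toy_embeddings = {
    "fever": [0.9, 0.1, 0.0],
    "chills": [0.8, 0.2, 0.1],
    "oven": [0.0, 0.1, 0.9],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, embeddings):
    """Return the other vocabulary word whose vector is closest to `word`."""
    return max(
        (w for w in embeddings if w != word),
        key=lambda w: cosine_similarity(embeddings[word], embeddings[w]),
    )


print(most_similar("fever", toy_embeddings))  # prints "chills"
```

With these toy values, "fever" and "chills" point in nearly the same direction while "oven" does not, which is the kind of grouping the visualization example refers to.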

Convolutional Neural Network Modeling Sentences
Figure from Kim, Yoon. "Convolutional neural networks for sentence classification." arXiv preprint arXiv:1408.5882 (2014).
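The figure itself is not recoverable here, so the following is only a minimal sketch of the core operation in the cited Kim (2014) architecture: filters slide over windows of consecutive word vectors, and each filter's activations are max-pooled over the sentence. The function name `conv_max_pool`, the toy sentence and the filter weights are assumptions made for this example; the ReLU nonlinearity is one common choice, and this is not necessarily the exact model proposed in these slides.

```python
def conv_max_pool(sentence_vectors, filters, bias=0.0):
    """Apply each filter over all word windows, then max-pool over time.

    sentence_vectors: list of word vectors (n_words x dim)
    filters: list of filters, each a list of h weight vectors of size dim
    Returns one pooled feature per filter.
    """
    features = []
    for filt in filters:
        h = len(filt)  # window height in words
        activations = []
        for start in range(len(sentence_vectors) - h + 1):
            window = sentence_vectors[start:start + h]
            total = bias + sum(
                w * x
                for w_row, x_row in zip(filt, window)
                for w, x in zip(w_row, x_row)
            )
            activations.append(max(0.0, total))  # ReLU nonlinearity
        # Max-over-time pooling; a sentence shorter than the window yields 0.0.
        features.append(max(activations) if activations else 0.0)
    return features


# Toy sentence of four 2-dimensional word vectors (hypothetical values).
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
filters = [
    [[1.0, 0.0], [0.0, 1.0]],              # window of 2 words
    [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]],  # window of 3 words
]
print(conv_max_pool(sentence, filters))  # prints [2.0, 4.0]
```

In Kim's model the pooled features from many filters of different window sizes are concatenated and passed to a fully connected softmax layer that predicts the sentence category.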

Proposed Model: Convolutional Neural Network
features…

Datasets
Pre-trained Word2vec: 15,000 clinical research papers from PubMed [1].
Experiments: 26 categories, 4000 sentences each, 1000 sentences for validation, from the Merck Manual [2].
[1] US National Library of Medicine, National Institutes of Health, PubMed search database: http://www.ncbi.nlm.nih.gov/pubmed
[2] Merck Manual dataset: http://www.merckmanuals.com/

Evaluation: Baselines
Sentence embeddings + SVM
▪ Doc2vec, the distributed memory (PV-DM) model: represent each sentence as a vector;
▪ Sentence vectors as inputs, supervised learning by SVM.
Mean word embeddings + SVM
▪ Pair-wise mean sentence embeddings: each sentence is a vector; unseen words are replaced by zero vectors or dropped;
▪ Sentence vectors as inputs, supervised learning by SVM.
Word embeddings with BOW (Bag-of-Words) features
▪ K-means: cluster word embeddings into 1000 clusters;
▪ BOW histogram: each sentence represented by a 1000-d vector;
▪ Sentence vectors as inputs, supervised learning by SVM.
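The feature extraction for two of these baselines can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the helper names, the toy 2-dimensional vectors and the use of 2 clusters instead of 1000 are all invented for the example, "pair-wise mean" is read here as an element-wise average, and the SVM step is left out (the resulting vectors would be passed to any SVM implementation).

```python
import random


def mean_embedding(tokens, embeddings, dim, skip_unseen=True):
    """Average the word vectors of a sentence.

    Unseen words are either skipped or counted as zero vectors,
    mirroring the "add zero or eliminate" choice on the slide.
    """
    vectors = []
    for tok in tokens:
        if tok in embeddings:
            vectors.append(embeddings[tok])
        elif not skip_unseen:
            vectors.append([0.0] * dim)
    if not vectors:
        return [0.0] * dim
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest(vector, centroids):
    """Index of the centroid closest to `vector`."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(vector, centroids[i]))


def kmeans(points, k, iterations=10, seed=0):
    """Plain Lloyd's algorithm; returns k centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def bow_histogram(tokens, embeddings, centroids):
    """Count how many known words of the sentence fall in each cluster."""
    hist = [0] * len(centroids)
    for tok in tokens:
        if tok in embeddings:
            hist[nearest(embeddings[tok], centroids)] += 1
    return hist


# Toy vocabulary (hypothetical values); the slides use 1000 clusters, not 2.
toy = {"pain": [1.0, 0.0], "ache": [0.9, 0.1],
       "cough": [0.0, 1.0], "wheezing": [0.1, 0.9]}
centroids = kmeans(list(toy.values()), k=2)
print(mean_embedding(["no", "pain", "cough"], toy, dim=2))
print(bow_histogram(["pain", "ache", "cough", "fever"], toy, centroids))
```

In both cases every sentence becomes a fixed-length vector regardless of its word count, which is what allows a standard SVM to be trained on top.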

Results: Accuracy

Conclusions & Discussions
Convolutional Neural Nets
• sentence-level classification in the clinical domain;
• can potentially be scaled up to paragraph/document level;
• better classification ability than shallow learning methods.
Representation Learning
• the ability to represent text in a distributed way;
• pre-trained embeddings are useful for text comparison/retrieval tasks.

Future Work
Dataset
• Extend in-domain knowledge: papers, books, relevant topics in Wikipedia, etc.;
• Test on a fine-grained set of clinical datasets.
Potential Applications
• Notes classification;
• Patient2vec (use case on next page): representation learning on individual patients, giving a high-level semantic representation of each patient.

Patient2Vec: Every patient is a vector
Feature extraction from everything: gender, age, body conditions, history treatments, …

Thanks!
Q&A
Acknowledgement: This project is partially funded by CREST, Japan Science and Technology Agency, Tokyo, Japan (grant number JPMJCR1303).


