Representation Learning in Medical Documents

Computer as a Doctor?
Representation Learning in Medical Documents
Irene Li¹ and Mark Hughes²
¹ Dublin Institute of Technology, Ireland
² IBM Watson Health, Ireland

Motivation
▪ Medical-domain datasets: limited, costly
▪ Dependency on domain experts
▪ Application requirements (use case on the next slide):
• Predictions
• Classification
• Summarization

Use case: Sentence-Level Note Classification
(A 75-y-o woman) with sudden onset back pain last
night while lifting turkey from oven. The pain is worse
with movement or deep breath, better with rest. No
symptoms in legs, no fever or chills. No chest pain,
cough, wheezing, abdominal pain, headache…
Married. Two children. No smoking.
Sentence-Level Categorization
Watson Smart Notes
Free-written texts/chats: various topics, messy, irrelevant content

Representation Learning
▪ Falls under the heading of “Deep Learning” or “Feature Learning”
• DL algorithms attempt to learn more complex features: multiple levels of representation
▪ Why?
• Get rid of “hand-designed” features and representations.
• Unsupervised feature learning.
• Everything is mapped into the same space.
Example: lengths of sentences.
Source: Representation Learning Tutorial, Yoshua Bengio, 2012 http://www.iro.umontreal.ca/~bengioy/talks/icml2012-YB-tutorial.pdf

Related Work: Representation Learning in NLP
Distributed representations for words:
• Word2vec [1]: neural word embeddings (each word is a vector)
• Doc2vec [2,3]: neural document/paragraph/sentence embeddings (each sentence is a vector)
[1] Distributed Representations of Words and Phrases and their Compositionality, Mikolov et al., 2013
[2] Distributed Representations of Sentences and Documents, Quoc V. Le and Tomas Mikolov, 2014
[3] Gensim: https://radimrehurek.com/gensim/models/doc2vec.html
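The idea that each word or sentence is a vector can be illustrated with a toy sketch. The three-dimensional vectors below are made up by hand for illustration; real Word2vec or Doc2vec vectors are learned from a corpus. The sketch only shows how closeness in the vector space can be measured with cosine similarity, and the helper names are hypothetical.

```python
import math

# Hypothetical, hand-made 3-d "embeddings"; real vectors are learned from data.
toy_embeddings = {
    "pain": [0.9, 0.1, 0.0],
    "ache": [0.8, 0.2, 0.1],
    "oven": [0.0, 0.1, 0.9],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, embeddings):
    """Return the other word whose vector is closest to the vector of `word`."""
    return max(
        (w for w in embeddings if w != word),
        key=lambda w: cosine_similarity(embeddings[word], embeddings[w]),
    )


nearest = most_similar("pain", toy_embeddings)
```

With these toy values, words with related meanings sit close together, which is the property the cluster visualizations rely on.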

Word Clusters: capture semantic meanings
Visualization using t-SNE.

Document Clusters
Visualization using t-SNE.
Picture from Dai, Andrew M., Christopher Olah, and Quoc V. Le. "Document embedding with paragraph vectors." (2015).
● 4,490,000 English Wikipedia articles
● 915,715 unique words

Approach (1): Sentence to Image
Sentence: “Conducted to examine different features associated with NPEV ...”
Sentence -> Word Embeddings -> 2-D Image
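The sentence-to-image step can be sketched as stacking one embedding per word into a fixed-size matrix. This is a minimal sketch under assumptions the slides do not confirm: one row per word, zero rows for padding and for unknown words, and a fixed maximum length. The function name, the 4-d toy vectors, and `max_len` are hypothetical.

```python
def sentence_to_image(tokens, embeddings, max_len, dim):
    """Stack one embedding per token into a max_len x dim matrix (list of rows).

    Assumptions: unknown words and padding positions become all-zero rows,
    and sentences longer than max_len are truncated.
    """
    zero = [0.0] * dim
    rows = [list(embeddings.get(tok.lower(), zero)) for tok in tokens[:max_len]]
    rows += [list(zero) for _ in range(max_len - len(rows))]
    return rows


# Hypothetical 4-d embeddings for a few words of the example sentence.
toy = {
    "conducted": [0.1, 0.3, 0.0, 0.5],
    "to": [0.0, 0.1, 0.1, 0.0],
    "examine": [0.4, 0.2, 0.6, 0.1],
}

image = sentence_to_image("Conducted to examine".split(), toy, max_len=5, dim=4)
```

Each row of `image` is one word and each column one embedding dimension, so the result can be fed to a 2-D convolution like a single-channel picture.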

Approach (2): Model
Conv Layers: 64 filters; 5x5
Pooling Layers: 2x2;
Hidden Layer: 128 units
Output: 13 units
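To get a feel for the sizes involved, the sketch below walks an input through the layer settings listed above and counts trainable parameters. Several things are assumptions, not stated on the slide: a single convolution followed by a single pooling layer, stride 1 with no padding, one input channel, and a hypothetical 50 x 100 input (50 words by 100 embedding dimensions).

```python
def conv_output(h, w, k=5):
    """Spatial size after a k x k convolution with stride 1 and no padding."""
    return h - k + 1, w - k + 1


def pool_output(h, w, p=2):
    """Spatial size after non-overlapping p x p pooling (floor division)."""
    return h // p, w // p


def describe_model(height, width, filters=64, kernel=5, hidden=128, classes=13):
    """Return feature-map size and parameter counts for the assumed stack."""
    h, w = conv_output(height, width, kernel)
    conv_params = filters * (kernel * kernel * 1 + 1)  # one input channel + bias
    h, w = pool_output(h, w, 2)
    flat = h * w * filters                              # flattened pooled maps
    hidden_params = flat * hidden + hidden              # weights + biases
    output_params = hidden * classes + classes
    return {
        "pooled_map": (h, w),
        "flattened": flat,
        "conv_params": conv_params,
        "hidden_params": hidden_params,
        "output_params": output_params,
    }


# Hypothetical input size: 50 words x 100 embedding dimensions.
summary = describe_model(50, 100)
```

Under these assumptions most of the parameters sit in the hidden layer, not in the convolution filters, so the input size chosen for the sentence image dominates the model size.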

Results (1): Dataset
Corpus:
• 3,879 publications from PubMed [1]
• 27.4 million raw words
• 181,550 words in vocabulary
• 13 classes by topic/journal
[1] US National Library of Medicine, National Institutes of Health search database: http://www.ncbi.nlm.nih.gov/pubmed

Word-occurrence distribution over the 27.4 million raw words
Plot by https://tagul.com/cloud/2
13 classes by topic/journal

Results (2): R-Square Scores in Classification
100-d

Discussions
▪ CNNs: able to learn distributed representations.
▪ Pre-processing (stop-word removal, stemming, etc.): accuracy drops because information is lost.
Example: “studying”, “studies” -> “studi”
▪ Training set:
• Arbitrarily chosen by journal: classes overlap
• Noisy content: irrelevant sentences
Example: “We examined a patient who had salad...”
• No best-case reference or baselines for the system
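The information loss from stemming mentioned in the discussion can be seen with a crude suffix-stripping rule. This is a toy written for illustration, not the stemmer used in the work and not a full Porter stemmer; the function name and rule list are hypothetical.

```python
def toy_stem(word):
    """Crude suffix stripper for illustration only (not a real stemmer)."""
    word = word.lower()
    # (suffix, replacement) pairs, checked in order; first match wins.
    rules = (("ying", "i"), ("ies", "i"), ("ing", ""), ("s", ""))
    for suffix, replacement in rules:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)] + replacement
    return word


forms = ["studying", "studies"]
stems = {form: toy_stem(form) for form in forms}
distinct_stems = set(stems.values())
```

Both surface forms collapse to the same token, so any difference between them is no longer visible to the embedding model.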

Future Work
▪ Dataset
• In-domain knowledge: papers, books, etc.
• For specific tasks: well-labeled data
▪ Representation
• CNN model: more complex (more layers)
• Other models: Long Short-Term Memory (LSTM), etc.
▪ Potential applications
• Notes classification
• Patient2vec (use case on the next slide): representation learning on individual patients

Patient2Vec:
Every patient is a vector
Feature extraction from everything:
gender, age, body conditions, history
treatments, …

Special thanks to Spyros Kotoulas¹ and Toyotaro Suzumura² for support and help.
¹ IBM Watson Health, Dublin, Ireland
² IBM T.J. Watson Research Center, New York, USA
Thanks!
Q&A
ireneli.eu
