Leveraging Text Classification Strategies for Clinical and Public Health Applications

Karin M. Verspoor
@karinv
karin.verspoor@unimelb.edu.au
The University of Melbourne
Melbourne, Victoria, Australia
January 2016, Qatar Computing Research Institute

(clinical) Data everywhere
• Electronic health records
– Patient demographics and biometrics
– Laboratory test results
– Clinical notes
• Radiology and pathology
– Images: X-ray, MRI and PET Scans
– (Synoptic) Reports
• Databases
– Health Service reporting
– National Prescribing Service
– Registry data, Births and Deaths
– Medicare/insurance claim data, etc.

Don’t forget unstructured data!
• About 80% of clinical information is in textual form
– ED triage notes
– Clinical progress notes
– Radiology and Pathology reports
– GP and specialist letters
– Discharge summaries
• Published Literature
– Clinical Trials
– Molecular-level studies
• and … social media text!

How is text used in medicine?
• Direct analysis of clinical records
– Information retrieval for clinical trials
– Syndromic surveillance
– Hospital Services Research
– Clinical Decision Support
– Pharmacovigilance
• Literature mining
– Evidence-based medicine
– Systematic Reviews

Evidence from EHRs
Mining electronic health records: towards better research applications and clinical care
Peter B. Jensen, Lars J. Jensen & Søren Brunak, Nature Reviews Genetics 13, 395-405 (2012)
doi:10.1038/nrg3208

Pharmacovigilance from EHRs
Mining of clinical records to identify adverse drug events
Estimated >90% of adverse events do not appear in coded data
LePendu et al. (2013) “Pharmacovigilance Using Clinical Notes” Clinical Pharmacology &
Therapeutics 93(6), 547–555; doi: 10.1038/clpt.2013.47

… from social media
Pacific Symposium on Biocomputing Shared Task on
Social Media Mining
Classification of tweets: mention an Adverse Drug Reaction?
ADR classified
@NAME Q makes me hungry.
Olanzapine made me want to
eat my own arm!
Non-ADR classified
I couldnt be a chef without
nicotine and caffeine

Outline
Problem setting
Approach and Results
EHR disease classification
Kocbek et al (2015) Evaluating classification power of linked admission data
sources with text mining; Proceedings of the Scientific Stream at Big Data in Health
Analytics 2015 (BigData 2015).

ICD classification of EHR data
• We address the task of detecting clinical records
in a large record system corresponding to a given
diagnosis of interest, based on text analysis
• We focus on lung cancer records for a pilot study
• We developed a system that classifies each
admission as positive or negative for lung cancer
• Not as simple as looking for “lung cancer” or
synonyms in the EHRs!
Kocbek et al (2015) HISA Big Data conference.
http://ceur-ws.org/Vol-1468/bd2015_kocbek.pdf

Alfred REASON platform
Kocbek et al. Big Data 2015, Sydney
• 15+ years of data from.
• 171,000+ updates each day.
• 62.4 million updates per annum.

Radiology question
50yo complaining of left shoulder
pain. Tender generally. Difficulty
abducting the shoulder past 45
degrees. Home on HITH
tomorrow - either inpatient or
outpatient please
Task
Radiology report
Mobile Chest performed on 02-JUN-2012
at 08:27 AM: The nasogastric tube has
its tip in the stomach. The tracheostomy is
seen at T2 level. ….
Pathology report
Urine Culture
Acc No: 12-183-0731 Source: Urine
------------ URINE MICROSCOPY (PHASE
CONTRAST) ------------- Leucocytes
x10^6/L (Ref <10).... <10
Erythrocytes x10^6/L (Ref <10).. <10.......
Additional data
Age: 50
Date of admission: Jun/12
Gender: F
Country: …
Admission
ICD-10 code

Data Characteristics
• Extracted data for 2 financial years, 2012-2014:
– 150,521 admissions,
– 40,800 radiology reports with associated question,
– 20,872 pathology reports,
– 121,700 additional data entries (demographics, hospital
admission info).
• Admissions are associated with ICD-10 codes:
– Used as ground truth
– ICD-10 code C34.*: positive cases for lung cancer
– 496 such positive admissions
– an additional 496 non-lung cancer admissions
randomly subsampled as negatives
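A minimal sketch of how the ground-truth labelling and negative subsampling described above could be expressed. The record layout (one `icd10` string per admission), the function names and the sample codes are assumptions for illustration, not the REASON schema.

```python
import random

def is_lung_cancer(icd10_code: str) -> bool:
    """True when the code falls under C34.* (the positive class on this slide)."""
    return icd10_code.strip().upper().startswith("C34")

def balanced_sample(admissions, seed=0):
    """Keep every positive admission and an equal-sized random subsample of negatives."""
    positives = [a for a in admissions if is_lung_cancer(a["icd10"])]
    negatives = [a for a in admissions if not is_lung_cancer(a["icd10"])]
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    sampled = rng.sample(negatives, k=min(len(positives), len(negatives)))
    return positives, sampled

# Hypothetical admissions; ids and codes are illustrative only.
admissions = [
    {"id": 1, "icd10": "C34.1"},
    {"id": 2, "icd10": "J18.9"},
    {"id": 3, "icd10": "C34.9"},
    {"id": 4, "icd10": "I21.0"},
    {"id": 5, "icd10": "K35.8"},
]
positives, negatives = balanced_sample(admissions)
print(len(positives), len(negatives))
```

In the study this step would reduce the non-lung cancer admissions to 496, matching the 496 positives.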

Research Question
• Most previous TM applications use a single
textual data source from the EHR despite a
diversity of potential data
• What is the impact of using more than one textual
data source for the EHR classification task?
– Considering different text sources;
– and including patient (structured) meta-data?

Methods
[Figure: classification pipeline. REASON sources (radiology reports, radiology questions, pathology reports, additional data) feed language processing supported by biomedical knowledge sources; the resulting textual and other features go to a machine learning algorithm (SVM), which produces the classification model.]

Text Processing
• Medical terminology recognition and normalisation
using MetaMap
• NegEx to detect negation and negation scope
The nasogastric tube has its tip in the stomach.
Meta Candidates (Total=9; Excluded=0; Pruned=0; Remaining=9)
1000 C0085678:Nasogastric tube [Medical Device]
1000 C0812428:Nasogastric tube (Nasogastric tube procedures) [Therapeutic Procedure]
861 C0175730:Tube (biomedical tube device) [Medical Device]
861 C0694637:Nasogastric (Nasogastric Route of Drug Administration) [Functional Concept]
861 C1547937:Tube NOS (Specimen Source Codes - Tube) [Intellectual Product]
861 C1561954:tube [Conceptual Entity]
861 C1704730:TUBE (Packaging Tube) [Medical Device]
861 C1704731:Tube (Tube Device Component) [Medical Device]
861 C3282907:Nasogastric [Body Location or Region]
Meta Mapping (1000):
1000 C0085678:Nasogastric tube [Medical Device]
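The negation step can be illustrated with a heavily simplified, NegEx-style sketch: a trigger word opens a scope of a few tokens, and a terminator word closes it early. This is not the NegEx implementation; the trigger list, terminator list and the window of 5 tokens are assumptions, and the first example sentence is partly made up.

```python
import re

# Illustrative lists only; the real NegEx resources are much larger.
TRIGGERS = (("negative", "for"), ("no",), ("not",), ("without",), ("denies",))
TERMINATORS = {"but", "however", "although"}
WINDOW = 5  # assumed scope length in tokens

def trigger_length(tokens, i):
    """Return the length of the trigger starting at position i, or 0 if none."""
    for trigger in TRIGGERS:
        if tuple(tokens[i:i + len(trigger)]) == trigger:
            return len(trigger)
    return 0

def negated_words(sentence):
    """Return the set of words that fall inside a negation scope."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    negated = set()
    i = 0
    while i < len(tokens):
        length = trigger_length(tokens, i)
        if length == 0:
            i += 1
            continue
        start = i + length
        for j in range(start, min(start + WINDOW, len(tokens))):
            if tokens[j] in TERMINATORS:
                break  # a terminator ends the scope early
            negated.add(tokens[j])
        i = start
    return negated

example = "No evidence of pneumothorax but the nasogastric tube has its tip in the stomach."
print(negated_words(example))
```

Concepts recognised inside the returned scope would be recorded as negative-context mentions, the rest as positive-context mentions.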

Features
Texts
• bag of (MetaMap) phrases
– separate feature for Positive/Negative context
– experimented with keeping phrases separated
according to source, or merging across sources
Patient meta-data
• demographic data (gender, age, ethnic origin, country,
language, marital status, religion, and death date)
• hospital-related admission data (hospital code,
admission date and time, discharge date and time, length
of stay, reason for admission, admission unit, discharge
unit, admission type, source, destination and criteria)
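A sketch of the bag-of-phrases features under stated assumptions: each mention is a `(source, concept, negated)` triple, the feature-name format is invented here, and the concept identifiers are the two CUIs shown on the Text Processing slide. The `keep_source` flag switches between keeping phrases separated by source and merging them across sources.

```python
from collections import Counter

def phrase_features(mentions, keep_source=True):
    """Count concept mentions, with a separate feature for positive and negative context."""
    features = Counter()
    for source, concept, negated in mentions:
        polarity = "NEG" if negated else "POS"
        prefix = f"{source}:" if keep_source else ""
        features[f"{prefix}{concept}|{polarity}"] += 1
    return features

# Hypothetical mentions for one admission.
mentions = [
    ("radiology_report", "C0085678", False),
    ("radiology_question", "C0085678", False),
    ("radiology_report", "C0175730", True),
]
separated = phrase_features(mentions, keep_source=True)
merged = phrase_features(mentions, keep_source=False)
print(separated)
print(merged)
```

With `keep_source=False` the two positive mentions of the same concept collapse into a single feature with count 2, which is the "merging across sources" variant.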

Experimental setting
• Heavily skewed data: undersampling of negatives
• 10-fold cross validation
• Support Vector Machine (Weka)
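The 10-fold split can be sketched in plain Python as below. The classifier itself (the Weka SVM) is not reproduced; the function name and the fixed seed are assumptions, and 992 is simply the 496 positives plus 496 sampled negatives from the data description.

```python
import random

def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal folds
    for held_out, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != held_out for j in fold]
        yield train_idx, test_idx

splits = list(k_fold_indices(992, k=10))
for train_idx, test_idx in splits[:2]:
    print(len(train_idx), len(test_idx))
```

Each admission appears in exactly one test fold, so every record is scored once by a model that never saw it during training.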

Results: Lung Cancer
[Figure: bar charts of F-score (axis 0.870 to 0.930) against the number of data sources used, starting from radiology reports and adding sources from: radiology question, pathology reports, additional data]
Data sources   F-score
1              0.873
2              0.901
3              0.917
4              0.930

Discussion
• More data sources lead to better performance
• The classifier with the highest performance was
built using features from all four data sources
• Merging sources into aggregate features works better
• Not all improvements are significant:
– Radiology question and metadata add clear value
– Pathology reports do not
• Not all admissions had a pathology report associated with them.

Case study 1: Conclusions
• We built a text mining system for detecting lung
cancer admissions using machine learning
methods.
• Our results show more effective systems can
generally be built by including multiple linked data
sources.
• Work in progress:
– Other diseases
– Imbalanced datasets
– Feature engineering and selection
[Figure: Breast cancer, F-score by number of data sources (1 to 4), axis 0.820 to 0.920; labelled value 0.893]

Outline
DOD with Twitter
Emotion classification
DOD signal 1: Tweet emotion shift
DOD signal 2: Tweet lexical shift
Disease Outbreak Detection
Ofoghi et al (2016) Towards early discovery of salient health threats: A social media
emotion classification technique; Pacific Symposium on Biocomputing.

Twitter for Outbreak Detection
Assumptions
• People tweet about diseases in the context of
emerging outbreaks
• Twitter can provide an “early warning” of an
outbreak
“Tweets started to rise in Nigeria 3-7 days prior
to the official announcement of the first probable
Ebola case. The topics discussed in tweets include
risk factors, prevention education, disease
trends, and compassion.”
Amer J Infection Control (2015)

Twitter for Outbreak Detection
Strategy
• Trends: counting of (hashtag, term) frequencies
• Coupled with geographic origin of tweets
• Sentiment or content analysis
Challenges
• High volume of (mostly irrelevant) tweets
• Hashtags alone may not be adequate
• A mention of a disease does not necessarily
indicate an active case
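The trend-counting strategy can be sketched as a per-day frequency count of tracked terms, treating a hashtag and the bare term as the same item. The tweets, dates and function name below are made up for illustration.

```python
import re
from collections import Counter, defaultdict

def daily_term_counts(tweets, terms):
    """Count occurrences of tracked terms (with or without '#') per day."""
    counts = defaultdict(Counter)
    for day, text in tweets:
        for token in re.findall(r"#?\w+", text.lower()):
            term = token.lstrip("#")
            if term in terms:
                counts[day][term] += 1
    return counts

# Hypothetical tweets as (day, text) pairs.
tweets = [
    ("2014-12-28", "Worried about #Ebola"),
    ("2014-12-29", "ebola ebola news"),
    ("2014-12-29", "nothing to see here"),
]
counts = daily_term_counts(tweets, {"ebola"})
print(dict(counts))
```

As the challenges above note, a raw count like this cannot tell an active case from a passing mention, which motivates the emotion and lexical analysis that follows.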

DOD with Twitter | Previous Work

Is there a local emergent threat?
Can we use shifts in
emotional and lexical content of tweets
to detect a disease outbreak?

Ebola event/background data
Dataset            Date (±7)   pre-corpus          post-corpus
                               #tweets   |vocab|   #tweets   |vocab|
ebola-event-1      29-Dec-14   73        204       337       906
ebola-event-2      31-Jan-15   165       700       90        417
ebola-background   16-Dec-14   429       1453      340       1208

Emotion classes
• ECs: Ekman’s six basic emotions plus …
– News-related
– Criticism
– Sarcasm
https://www.behance.net/gallery/6-Basic-Emotions/930168
Sarcastic
atsign atsign think I got Ebola
there two minutes ago
News-related
atsign Another 4 American
Ebola workers flown back to
USA for monitoring..

Emotion classifier data
• Data: collection
– Twitter API
– Second half of March 2015
– Total of 12,101 tweets
– Contained “ebola” or “#ebola”
– 4,405 tweets remained after some filtering…
– Amazon’s Mechanical Turk was used to label tweets

Lexicon-Based Classification
• Created an emotion vocabulary
– Profile of Mood States (POMS)
– FrameNet
– Existing “feelings list”
– Wikipedia
• Vector space model
– Binary vector per emotion
– Binary vector per tweet
– Cosine Similarity emotion vs tweet
[Figure: emotion vocabulary as an indexed list of 499 entries, e.g. 3: anxious, 7: affronted, 499: :-|]
https://bitbucket.org/readbiomed/socialsurveillance
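A minimal sketch of the vector space model described above: one binary vector per emotion, one per tweet, and cosine similarity between them. The tiny vocabulary, the assignment of words to emotions and the tweet are assumptions for illustration; the actual lexicon lives in the repository linked above.

```python
import math

def binary_vector(tokens, vocabulary):
    """1 where the vocabulary term occurs in the tokens, else 0."""
    present = set(tokens)
    return [1 if term in present else 0 for term in vocabulary]

def cosine(u, v):
    """Cosine similarity; 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def classify(tweet_tokens, emotion_lexicons, vocabulary):
    """Score the tweet against every emotion and return the best match."""
    tweet_vec = binary_vector(tweet_tokens, vocabulary)
    scores = {
        emotion: cosine(tweet_vec, binary_vector(words, vocabulary))
        for emotion, words in emotion_lexicons.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical four-term vocabulary and two emotion lexicons.
vocabulary = ["anxious", "scared", "affronted", "furious"]
lexicons = {"fear": ["anxious", "scared"], "anger": ["affronted", "furious"]}
best, scores = classify("so anxious and scared about ebola".split(), lexicons, vocabulary)
print(best, scores)
```

A tweet with no vocabulary terms scores 0.0 against every emotion, so a real system would need a fallback or a minimum-similarity cut-off.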

Emotion class distribution
Classes                Dataset         p-value
6 emotions             ebola-event-1   0.004*
                       ebola-event-2   0.002*
                       ebola-backgr.   0.259
6 emotions + 3 add’l   ebola-event-1   0.009*
                       ebola-event-2   0.007*
                       ebola-backgr.   0.079
Paired t-test, pre- and post-event windows; * statistically significant at the 5% level
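For reference, the paired t statistic can be computed as in the sketch below. It assumes the paired values are per-class figures for the pre- and post-event windows, which the slide does not spell out, and the numbers are made up. Turning the statistic into a p-value needs the t distribution, for example via `scipy.stats.ttest_rel`, which is not used here to keep the sketch dependency-free.

```python
import math
from statistics import mean, stdev

def paired_t_statistic(pre, post):
    """Return (t, degrees of freedom) for paired samples of equal length."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)  # sample standard deviation
    return mean(diffs) / standard_error, n - 1

# Hypothetical paired values, one pair per emotion class.
pre = [0.10, 0.20, 0.30, 0.40]
post = [0.15, 0.30, 0.45, 0.60]
t_value, dof = paired_t_statistic(pre, post)
print(round(t_value, 3), dof)
```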

Jensen-Shannon divergence
Class       ebola-event-1   ebola-event-2   ebola-bg.
Sarcasm     0.0227          0.0032          0.1365
News-rel.   0.0226          0.0001          0.0074
Anger       0.0572          0.0382          0.0169
Criticism   0.0180          0.0056          0.0060
Surprise    0.1161          0.0220          0.0023
Fear        0.0768          0.0813          0.0913
Happiness   0.0444          0.0415          0.0064
Disgust     0.0604          0.0025          0.0044
Sadness     0.0023          0.0322          0.0060
AVERAGE     0.0467          0.0252          0.0308
Callout on slide: big differences compared with background, in both e1 and e2
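A short sketch of Jensen-Shannon divergence between two discrete distributions given as aligned probability lists. The base-2 logarithm is an assumption (it bounds the value between 0 and 1); the slides do not state the base or exactly how the per-class distributions were formed.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Symmetric divergence: average KL of p and q to their midpoint m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

identical = js_divergence([0.5, 0.5], [0.5, 0.5])
disjoint = js_divergence([1.0, 0.0], [0.0, 1.0])
print(identical, disjoint)
```

Because the midpoint `m` is non-zero wherever `p` or `q` is, the measure stays finite even when one window contains a class the other lacks.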

Lexical shift analysis
Within-corpus analysis:
Cross-corpus analysis:

Case study 2: Conclusions
• We introduced an Ebola tweet-based emotion
classifier.
• There are statistically significant differences in the
distribution of emotion classes and lexical items in
tweets preceding and following a salient emergent
health threat.
• This effect does not occur in a neutral background
collection.
Proposal:
• Disease outbreak detection can be supported with
monitoring of tweets using a sliding window model
that tests for such distributional changes
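The proposed sliding window model might look like the sketch below, which compares the emotion-class distribution in the window before each position with the window after it. A fixed Jensen-Shannon threshold stands in for the statistical test, and the window size, threshold and labelled stream are all assumptions.

```python
import math
from collections import Counter

def distribution(labels, classes):
    """Relative frequency of each class in a window of labels."""
    counts = Counter(labels)
    return [counts[c] / len(labels) for c in classes]

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits between two aligned distributions."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def detect_shifts(stream, classes, window=3, threshold=0.1):
    """Return (position, score) wherever adjacent windows differ by more than threshold."""
    alerts = []
    for i in range(window, len(stream) - window + 1):
        before = distribution(stream[i - window:i], classes)
        after = distribution(stream[i:i + window], classes)
        score = js_divergence(before, after)
        if score > threshold:
            alerts.append((i, score))
    return alerts

# Hypothetical stream of per-tweet emotion labels with a shift at position 6.
stream = ["news"] * 6 + ["fear"] * 6
alerts = detect_shifts(stream, classes=["news", "fear"])
strongest = max(alerts, key=lambda item: item[1])
print(strongest)
```

In practice the windows would span days of tweets rather than a handful of labels, and the threshold would be calibrated against a background collection.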

Conclusions
• There are myriad problems in the clinical context
where unstructured data can be leveraged to
good effect
• Text classification is one tool that can be drawn
on to make use of this unstructured data
• Heterogeneous data integration is also important
• Challenges exist in
– Terminology
– Skewed data
– Missing data

Acknowledgements
• Amazon Mechanical Turkers
• James McCaw, Melbourne School of Population and
Global Health
• Bahador Ofoghi, Lawrence Cavedon, Simon Kocbek

ML-Based Classification
• MALLET Naïve Bayes
• Features
– bag of words[+lem,-lem]
– Lexicon-based similarity
– emotion vocabulary
– emoticons
– punctuation
– (Stanford) sentiment

KL-Divergence, full vocabulary

Emotion-level distribution
KL-divergence (pre- vs. post-event, post- vs. pre-)
P(x) and Q(x) represent probability of positive and negative emotion classes
in the respective corpora
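For reference, the standard definition of KL-divergence for discrete distributions P and Q is given below. It is not symmetric, which is presumably why both directions (pre- vs. post-event and post- vs. pre-) are reported; the logarithm base is not stated on the slide.

```latex
D_{\mathrm{KL}}(P \,\|\, Q) = \sum_{x} P(x) \log \frac{P(x)}{Q(x)}
```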
