Computer	as	a	Doctor?	
Representation	Learning	in	
Medical	Documents
Irene	Li1 and	Mark	Hughes2
1Dublin Institute of Technology, Ireland
2IBM	Watson	Health,	Ireland
▪ Medical-domain datasets: limited and costly
▪ Domain experts: heavy dependency
▪ Application	Requirements	(Use	Case	next	page):	
•Predictions
•Classification
•Summarization
Motivation
Use	case:	Sentence-Level	Note	Classification
( A 75-y-o woman) with sudden onset back pain last
night while lifting turkey from oven. The pain is worse
with movement or deep breath, better with rest. No
symptoms in legs, no fever or chills. No chest pain,
cough, wheezing, abdominal pain, headache…
Married. Two children. No smoking.
Sentence Level Categorization
Watson Smart Notes
Free-written
texts/chats:
Various Topics
Messy
Irrelevant
▪ Falls under the heading of “Deep Learning” or “Feature Learning”
•DL	algorithms	attempt	to	learn	more	complex	features:	
multiple	levels	of	representation
▪ Why?
•Get	rid	of	“hand-designed”	features	and	representations.
•Unsupervised	feature	learning.
•Everything	into	the	same	space.
Example:	Lengths	of	sentences.	
Representation	Learning
Representation Learning Tutorial, Yoshua Bengio, 2012 http://www.iro.umontreal.ca/~bengioy/talks/icml2012-YB-tutorial.pdf
Distributed	Representations	for	words:
•Word2vec[1]:	neural	word	embeddings
(Each	word	is	a	vector)
•Doc2vec[2,3]:	neural	
document/paragraph/sentence	
embeddings
(Each	sentence	is	a	vector)	
Related	Work:	RL	in	NLP
[1] Distributed Representations of Words and Phrases and their Compositionality, Mikolov et al., 2013
[2] Distributed Representations of Sentences and Documents, Le and Mikolov, 2014
[3] Gensim: https://radimrehurek.com/gensim/models/doc2vec.html
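A minimal sketch of what "each word is a vector" makes possible: once words live in a shared space, their relatedness can be scored with cosine similarity. The three-dimensional vectors below are made-up toy values, not output from Word2vec or Gensim, and `cosine` and `most_similar` are hypothetical helper names used only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(word, vectors):
    """Return the other word whose vector is closest to `word`."""
    candidates = (w for w in vectors if w != word)
    return max(candidates, key=lambda w: cosine(vectors[word], vectors[w]))


# Toy embeddings (assumed values, for illustration only).
toy_vectors = {
    "fever": [0.9, 0.1, 0.0],
    "chills": [0.8, 0.2, 0.1],
    "oven": [0.0, 0.1, 0.9],
}

print(most_similar("fever", toy_vectors))
```

With real embeddings the same nearest-neighbour lookup is what produces the semantic word clusters shown in the t-SNE plots.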
Word	Clusters:
Captures	
Semantic	
Meanings
Visualization using t-SNE.
Document	Clusters
Visualization using t-SNE.
Picture from Dai, Andrew M., Christopher Olah, and Quoc V. Le. "Document embedding with paragraph vectors." (2015).
● 4,490,000 Wikipedia
English articles
● 915,715 unique words
Approach (1): Sentence to Image
Sentence: "Conducted to examine different features associated with NPEV ..."
Sentence → Word Embeddings → 2-D Image
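The sentence-to-image step can be sketched as stacking one embedding per word into a fixed-size matrix. This is only an illustration under stated assumptions: the slides do not say how sentences are padded or how unknown words are handled, so zero rows are used for both here, and `sentence_to_image`, `toy_embeddings`, and the 4-dimensional vectors are hypothetical.

```python
def sentence_to_image(tokens, embeddings, dim, max_len):
    """Stack word vectors into a max_len x dim matrix (a 2-D "image").

    Assumptions: unknown words and padding positions become zero rows,
    and sentences longer than max_len are truncated.
    """
    zero = [0.0] * dim
    rows = [list(embeddings.get(tok.lower(), zero)) for tok in tokens[:max_len]]
    rows += [list(zero) for _ in range(max_len - len(rows))]
    return rows


# Toy 4-d embeddings (assumed values, for illustration only).
toy_embeddings = {
    "conducted": [0.1, 0.3, -0.2, 0.5],
    "to": [0.0, 0.1, 0.0, -0.1],
    "examine": [0.4, -0.3, 0.2, 0.1],
    "features": [-0.2, 0.6, 0.1, 0.0],
}

tokens = "Conducted to examine different features".split()
image = sentence_to_image(tokens, toy_embeddings, dim=4, max_len=6)

for row in image:
    print(row)
```

Each row is one word and each column one embedding dimension, so a convolution over this matrix sees neighbouring words together.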
Approach	(2):	Model
Conv Layers: 64 filters; 5x5
Pooling Layers: 2x2;
Hidden Layer: 128 units
Output: 13 units
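To make the layer sizes concrete, the following sketch traces tensor shapes through one convolution and pooling block with the listed settings. The slide does not give the input size, stride, padding, or number of blocks, so a 50 x 100 input, stride 1, no padding, and a single block are assumptions; `conv_out`, `pool_out`, and `trace_shapes` are hypothetical helpers.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window):
    """Output length of non-overlapping pooling along one axis."""
    return size // window


def trace_shapes(height, width, n_blocks=1):
    """List (layer name, shape) pairs for the assumed architecture."""
    shapes = [("input", (height, width, 1))]
    for i in range(n_blocks):
        # 5x5 convolution with 64 filters (stride 1, no padding assumed).
        height, width = conv_out(height, 5), conv_out(width, 5)
        shapes.append((f"conv{i + 1}", (height, width, 64)))
        # 2x2 pooling halves each spatial axis.
        height, width = pool_out(height, 2), pool_out(width, 2)
        shapes.append((f"pool{i + 1}", (height, width, 64)))
    shapes.append(("hidden", (128,)))
    shapes.append(("output", (13,)))
    return shapes


# Assumed input: 50 words x 100-d embeddings.
shapes = trace_shapes(50, 100)
for name, shape in shapes:
    print(name, shape)
```

The 13 output units match the 13 classes in the corpus.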
Corpus:	
•3879 publications	from	PubMed[1]
•27.4 million raw words
•181,550 words in vocabulary
•13 classes	by	topic/journal
Results	(1)	:	Dataset
[1]: US National Library of Medicine National Institutes of Health Search database http://www.ncbi.nlm.nih.gov/pubmed
27.4	million word	occurrence	distribution
Plot by https://tagul.com/cloud/2
13	classes	by	topic/journal
Results	(2)	:	R-Square	Scores	in	Classification
100-d
▪CNNs:	ability	to	learn	distributed	representations.
▪ Pre-processing (stop-word removal, stemming, etc.):
Accuracy drops because information is lost.
Example: “studying”, “studies” -> “studi”
▪Training	set:
•Arbitrarily	chosen	by	journals:	overlaps
•Noisy	contents:	irrelevant	sentences
Example:		“We	examined	a	patient	who	had	salad...”
•No “best case” or baseline to compare the system against
Discussions
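The information loss from stemming can be illustrated with a toy suffix stripper. This is not the Porter stemmer or any library implementation; `toy_stem` and its three suffix rules are assumptions chosen only to reproduce the "studi" example from the slide.

```python
def toy_stem(word):
    """Very rough suffix stripping (illustrative only, not Porter)."""
    w = word.lower()
    for suffix in ("ing", "es", "s"):
        # Strip the first matching suffix, keeping at least 3 characters.
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            w = w[: -len(suffix)]
            break
    # Map a trailing "y" to "i" so "study" and "studies" share a stem.
    if w.endswith("y") and len(w) > 2:
        w = w[:-1] + "i"
    return w


for word in ("studying", "studies", "study"):
    print(word, "->", toy_stem(word))
```

After stemming, the verb form and the plural noun are indistinguishable, which is one way pre-processing can discard signal a classifier might have used.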
▪ Dataset
•In-domain	knowledge:	papers,	books,	etc
•For	specific	tasks:	well-labeled	
▪Representation
•CNN	model:	more	complex	(layers)
•Other models: Long Short-Term Memory (LSTM), etc.
▪Potential	Applications
•Notes	classification
•Patient2vec	(Use	Case	next	page):	representation	
learning	on	individual	patient
Future Work
Patient2Vec:	
Every	patient	is	a	vector
Feature extraction from everything:
gender, age, body conditions, treatment history, …
Special	thanks	to	Spyros	Kotoulas1 and	Toyotaro	Suzumura2 for	support	and	help.
1IBM	Watson	Health,	Dublin,	Ireland
2IBM	T.J.	Watson	Research	Center,	New	York,	USA
Thanks!
Q&A
ireneli.eu
