Finding Different Types of Medical Conditions:
From Data Generation to Automatic Classification
Nathan Artz¹, Stephen Doogan¹, Jinho D. Choi, PhD²
¹Real Life Sciences, New York, NY; ²Emory University, Atlanta, GA
rlsciences.com
info@rlsciences.com
quantitative.emory.edu
choi@mathcs.emory.edu
INTRODUCTION  RESULTS  APPROACH  IMPLICATIONS
• A key challenge to real world outcomes research lies in
analyzing medical conditions in unstructured data. Central to
this is determining whether conditions are reported as
symptoms of disease, treatment indications or side effects.
• In analyzing texts from patient reporting systems (e.g., FAERS,
social media), researchers often use medical dictionaries such
as MedDRA to produce ranked frequency counts of medical
conditions [1]; however, this approach fails to differentiate
treatment indications and side effect conditions, which is a
critical part of assessing treatment outcomes.
• Automated classification of medical conditions can provide a
deeper understanding of treatment outcomes and promote
efficient real world data investigations across a variety of patient
level data.
• In our study, we collected 19,313 spontaneous patient reports
about antibiotic treatment experiences from medical forums. We
annotated 760 sentences containing different types of medical
conditions, and classified 5,179 unique conditions into
indications and side effects across the set of reports using
natural language processing and machine learning techniques.
• Each of the 760 annotated sentences was given a binary label
indicating whether the condition played the role of an “indication”
or “side effect” of the given treatment. Specific treatment and
condition mentions in the text were replaced by generic labels
(i.e., _TREATMENT, _CONDITION) to prevent overfitting to the
antibiotic drug class.
• We used an SVM classifier with 5-fold cross validation and
averaged the outcomes of the folds to determine the F1 scores
(see Figure 2).
• We manually reviewed some of the most impactful features of
the SVM to see which are most important when differentiating
the two classes of conditions.
• For conditions playing the role of indication, we see interesting
features such as Bigram(treatment, for) and
SyntacticParent(condition, for), as in “on steroids for the
inflammation”. For side effects, we observe features such as
SyntacticParent(condition, make) or SyntacticParent(condition,
give), as in “treatment made me sick” or “the drug gave me an
allergic reaction”, which heavily influence the decision boundary
of the SVM.
• We believe this is a feasible approach to differentiating the roles
of medical conditions. This approach is suggested for use in
spontaneous reporting systems where many of the conditions
can be assumed to play side effect or indication roles.
• Our study can enable researchers, regulators, health insurers
and pharmaceutical companies to better monitor treatment risk
by filtering out instances of indications and automatically identifying
side effects in spontaneous reports.
• Our approach can provide significant time savings in
reviewing spontaneous reports for the purpose of side effect
identification, enabling untapped data to be leveraged for
real world treatment outcomes research.
• Our follow-on work includes applying these models to
doctors’ notes in electronic health records (EHRs), case
record forms within pharmaceutical companies, and signal
detection across social media and published literature.
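The placeholder substitution described above (replacing specific mentions with _TREATMENT and _CONDITION) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `mask_mentions` and the idea of passing in plain term lists are assumptions, and the poster does not say how mentions were actually detected.

```python
import re


def mask_mentions(sentence, treatments, conditions):
    """Replace known treatment/condition mentions with generic labels.

    `treatments` and `conditions` are hypothetical term lists; a real
    system would more likely rely on a medical dictionary lookup.
    """
    def mask(text, terms, label):
        # Longest terms first so multi-word mentions win over substrings.
        for term in sorted(terms, key=len, reverse=True):
            pattern = r"\b" + re.escape(term) + r"\b"
            text = re.sub(pattern, label, text, flags=re.IGNORECASE)
        return text

    masked = mask(sentence, treatments, "_TREATMENT")
    return mask(masked, conditions, "_CONDITION")


masked = mask_mentions("I am taking cipro for my infection",
                       ["cipro"], ["infection"])
print(masked)  # I am taking _TREATMENT for my _CONDITION
```

Masking before feature extraction means the classifier sees the same tokens regardless of which drug or condition is named, which is the stated reason for the substitution.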
Can a machine differentiate between side effects and indications in spontaneous patient reports?
EXPERIMENT   FEATURES                                                F1 SCORE
1            Bag of Lemmas                                           78.23
2            Bag of Lemmas + Syntactic Features                      84.26
3            Bag of Lemmas + Syntactic Features + Syntactic Roles    83.30
Figure 2. F1 scores for three experiments using different feature sets
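The evaluation protocol (5-fold cross validation with F1 averaged over the folds) can be sketched as below. This is a hedged illustration only: the poster does not state how folds were drawn or which class was treated as positive, so the round-robin split, the `positive="side_effect"` default, and the `train_and_predict` callback are all assumptions made for the example.

```python
def f1_score(gold, predicted, positive="side_effect"):
    """F1 for one class; `positive` is an assumed choice of target class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n, k=5):
    """Yield (train, test) index lists; item i is tested in fold i % k."""
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        yield train, test


def cross_validated_f1(examples, labels, train_and_predict, k=5):
    """Average the per-fold F1 scores.

    `train_and_predict(train_x, train_y, test_x)` is a hypothetical
    callback that fits any classifier and returns predicted labels.
    """
    scores = []
    for train, test in k_fold_indices(len(examples), k):
        predicted = train_and_predict(
            [examples[i] for i in train],
            [labels[i] for i in train],
            [examples[i] for i in test],
        )
        scores.append(f1_score([labels[i] for i in test], predicted))
    return sum(scores) / len(scores)
```

Any classifier can be plugged in through the callback, so the same loop could be reused to compare the three feature sets in Figure 2.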
REFERENCES
1. Gurulingappa H, Toldo L, Rajput AM, Kors JA, Taweel A, Tayrouz Y. Automatic detection of adverse events to predict drug label
changes using text and data mining techniques. Phar. Drug Saf. 2013;22(11):1189-94.
2. Choi JD, McCallum A. Transition-based Dependency Parsing with Selectional Branching. ACL’13. 2013:1052-1062.
Figure 1. An overview of the classification approach
[Figure: pipeline stages RAW TEXT, DEPENDENCY GRAPHS, STATISTICAL MODEL, CLASSIFICATION OUTPUT. Example sentences: Indication, “I am taking cipro for my infection”; Side Effect, “I got another infection after I was put on cipro”. The dependency graphs show lemma nodes (take, have) with role labels (agent, theme, cause, time).]
• Various combinations of syntactic and semantic features were
extracted from dependency graphs generated by an NLP toolkit,
ClearNLP [2], and were used to train a statistical model
with support vector machines (SVM). This statistical model is
used for classifying medical conditions in raw text. Figure 1 shows
an overview of our approach.
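To make the feature types concrete, the sketch below derives bag-of-lemmas, bigram, and syntactic-parent features from a parsed sentence. It is only an illustration: the `Token` structure, the hand-written head indices, and the feature-string format are assumptions, and the parse is typed in by hand instead of being produced by ClearNLP. Training the SVM on these features is not shown.

```python
from dataclasses import dataclass


@dataclass
class Token:
    lemma: str
    head: int  # index of the syntactic parent, -1 for the root


def display_name(lemma):
    """Render placeholders as in the poster: _CONDITION -> condition."""
    return lemma.strip("_").lower() if lemma.startswith("_") else lemma


def extract_features(tokens):
    """Collect lemma, bigram, and syntactic-parent features as strings."""
    features = set()
    for i, tok in enumerate(tokens):
        name = display_name(tok.lemma)
        features.add(f"Lemma({name})")
        if i + 1 < len(tokens):
            following = display_name(tokens[i + 1].lemma)
            features.add(f"Bigram({name}, {following})")
        # Record the parent lemma only for the masked condition mention.
        if tok.lemma == "_CONDITION" and tok.head >= 0:
            parent = display_name(tokens[tok.head].lemma)
            features.add(f"SyntacticParent(condition, {parent})")
    return features


# "on steroids for the inflammation" after masking; heads are assumed.
tokens = [
    Token("on", -1),
    Token("_TREATMENT", 0),
    Token("for", 0),
    Token("the", 4),
    Token("_CONDITION", 2),
]
features = extract_features(tokens)
print(sorted(features))
```

With this hand-built parse, the feature set contains both Bigram(treatment, for) and SyntacticParent(condition, for), the two indication cues named in the results.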
Fwdays
 
Principle of conventional tomography-Bibash Shahi ppt..pptx
Principle of conventional tomography-Bibash Shahi ppt..pptxPrinciple of conventional tomography-Bibash Shahi ppt..pptx
Principle of conventional tomography-Bibash Shahi ppt..pptx
BibashShahi
 
AI in the Workplace Reskilling, Upskilling, and Future Work.pptx
AI in the Workplace Reskilling, Upskilling, and Future Work.pptxAI in the Workplace Reskilling, Upskilling, and Future Work.pptx
AI in the Workplace Reskilling, Upskilling, and Future Work.pptx
Sunil Jagani
 
Northern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving | Modern Metal Trim, Nameplates and Appliance PanelsNorthern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving
 
Harnessing the Power of NLP and Knowledge Graphs for Opioid Research
Harnessing the Power of NLP and Knowledge Graphs for Opioid ResearchHarnessing the Power of NLP and Knowledge Graphs for Opioid Research
Harnessing the Power of NLP and Knowledge Graphs for Opioid Research
Neo4j
 

Recently uploaded (20)

Containers & AI - Beauty and the Beast!?!
Containers & AI - Beauty and the Beast!?!Containers & AI - Beauty and the Beast!?!
Containers & AI - Beauty and the Beast!?!
 
Session 1 - Intro to Robotic Process Automation.pdf
Session 1 - Intro to Robotic Process Automation.pdfSession 1 - Intro to Robotic Process Automation.pdf
Session 1 - Intro to Robotic Process Automation.pdf
 
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdfLee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf
 
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...
 
Introducing BoxLang : A new JVM language for productivity and modularity!
Introducing BoxLang : A new JVM language for productivity and modularity!Introducing BoxLang : A new JVM language for productivity and modularity!
Introducing BoxLang : A new JVM language for productivity and modularity!
 
What is an RPA CoE? Session 2 – CoE Roles
What is an RPA CoE?  Session 2 – CoE RolesWhat is an RPA CoE?  Session 2 – CoE Roles
What is an RPA CoE? Session 2 – CoE Roles
 
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...
 
JavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green MasterplanJavaLand 2024: Application Development Green Masterplan
JavaLand 2024: Application Development Green Masterplan
 
ScyllaDB Tablets: Rethinking Replication
ScyllaDB Tablets: Rethinking ReplicationScyllaDB Tablets: Rethinking Replication
ScyllaDB Tablets: Rethinking Replication
 
Dandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity serverDandelion Hashtable: beyond billion requests per second on a commodity server
Dandelion Hashtable: beyond billion requests per second on a commodity server
 
Getting the Most Out of ScyllaDB Monitoring: ShareChat's Tips
Getting the Most Out of ScyllaDB Monitoring: ShareChat's TipsGetting the Most Out of ScyllaDB Monitoring: ShareChat's Tips
Getting the Most Out of ScyllaDB Monitoring: ShareChat's Tips
 
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptxPRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
PRODUCT LISTING OPTIMIZATION PRESENTATION.pptx
 
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge GraphGraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
 
"NATO Hackathon Winner: AI-Powered Drug Search", Taras Kloba
"NATO Hackathon Winner: AI-Powered Drug Search",  Taras Kloba"NATO Hackathon Winner: AI-Powered Drug Search",  Taras Kloba
"NATO Hackathon Winner: AI-Powered Drug Search", Taras Kloba
 
Demystifying Knowledge Management through Storytelling
Demystifying Knowledge Management through StorytellingDemystifying Knowledge Management through Storytelling
Demystifying Knowledge Management through Storytelling
 
"What does it really mean for your system to be available, or how to define w...
"What does it really mean for your system to be available, or how to define w..."What does it really mean for your system to be available, or how to define w...
"What does it really mean for your system to be available, or how to define w...
 
Principle of conventional tomography-Bibash Shahi ppt..pptx
Principle of conventional tomography-Bibash Shahi ppt..pptxPrinciple of conventional tomography-Bibash Shahi ppt..pptx
Principle of conventional tomography-Bibash Shahi ppt..pptx
 
AI in the Workplace Reskilling, Upskilling, and Future Work.pptx
AI in the Workplace Reskilling, Upskilling, and Future Work.pptxAI in the Workplace Reskilling, Upskilling, and Future Work.pptx
AI in the Workplace Reskilling, Upskilling, and Future Work.pptx
 
Northern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving | Modern Metal Trim, Nameplates and Appliance PanelsNorthern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
Northern Engraving | Modern Metal Trim, Nameplates and Appliance Panels
 
Harnessing the Power of NLP and Knowledge Graphs for Opioid Research
Harnessing the Power of NLP and Knowledge Graphs for Opioid ResearchHarnessing the Power of NLP and Knowledge Graphs for Opioid Research
Harnessing the Power of NLP and Knowledge Graphs for Opioid Research
 

Finding Different Types of Medical Conditions: From Data Generation to Automatic Classification

Nathan Artz1, Stephen Doogan1, Jinho D. Choi, PhD2
1Real Life Sciences, New York, NY; 2Emory University, Atlanta, GA
rlsciences.com, info@rlsciences.com
quantitative.emory.edu, choi@mathcs.emory.edu

Can a machine differentiate between side effects and indications in spontaneous patient reports?

• A key challenge to real-world outcomes research lies in analyzing medical conditions in unstructured data. Central to this is determining whether conditions are reported as symptoms of disease, treatment indications or side effects.
• In analyzing texts from patient reporting systems (e.g., FAERS, social media), researchers often use medical dictionaries such as MedDRA to produce ranked frequency counts of medical conditions [1]; however, this approach fails to differentiate treatment indications from side effect conditions, which is a critical part of assessing treatment outcomes.
• Automated classification of medical conditions can provide a deeper understanding of treatment outcomes and promote efficient real-world data investigations across a variety of patient-level data.
• In our study, we collected 19,313 spontaneous patient reports about antibiotic treatment experiences from medical forums. We annotated 760 sentences containing different types of medical conditions, and classified 5,179 unique conditions into indications and side effects across the set of reports using natural language processing and machine learning techniques.
• Each of the 760 annotated sentences was given a binary label indicating whether the condition played the role of an “indication” or a “side effect” of the given treatment. Specific treatment and condition mentions in the text were replaced by generic text labels (i.e., _TREATMENT, _CONDITION) to prevent overfitting to the antibiotic drug class.
• We used an SVM classifier with 5-fold cross-validation and averaged the outcomes of the folds to determine the F1 scores (see Figure 2).
• We manually reviewed some of the most impactful features of the SVM to see which are most important when differentiating the two classes of conditions.
• For conditions playing the role of indication, we see interesting features such as Bigram(treatment, for) and SyntacticParent(condition, for), as in “on steroids for the inflammation”. For side effects, we observe features such as SyntacticParent(condition, make) or SyntacticParent(condition, give), as in “treatment made me sick” or “the drug gave me an allergic reaction”, which heavily influence the decision boundary of the SVM.
• We believe this is a feasible approach to differentiating the roles of medical conditions. This approach is suggested for use in spontaneous reporting systems where many of the conditions can be assumed to play side effect or indication roles.
• Our study can enable researchers, regulators, health insurers and pharmaceutical companies to better monitor treatment risk by filtering out instances of indications and automatically identifying side effects in spontaneous reports.
• Our approach can provide significant time savings in reviewing spontaneous reports for the purpose of side effect identification, enabling untapped data to be leveraged for real-world treatment outcomes research.
• Our follow-on work includes applying these models across doctors’ notes in electronic health records (EHRs), case record forms within pharmaceutical companies and signal detection across social media and published literature.
• Various combinations of syntactic and semantic features were extracted from dependency graphs generated by an NLP toolkit, ClearNLP [2], and were used for generating a statistical model trained by support vector machines (SVM). This statistical model is used for classifying medical conditions in raw text. Figure 1 shows the overview of our approach.

Figure 1. An overview of the classification approach. Stages: raw text, condition dependency graphs, statistical model, classification output. Example sentences: “I am taking cipro for my infection” (indication); “I got another infection after I was put on cipro” (side effect).

Figure 2. F1 scores for three experiments using different feature sets.
Experiments: 1, 2, 3
F1 scores, in the order extracted: 78.23, 84.26, 83.30
Feature sets, in the order extracted: Bag of Lemmas + Syntactic Features + Syntactic Roles; Bag of Lemmas; Bag of Lemmas + Syntactic Features

REFERENCES
1. Gurulingappa H, Toldo L, Rajput AM, Kors JA, Taweel A, Tayrouz Y. Automatic detection of adverse events to predict drug label changes using text and data mining techniques. Pharmacoepidemiol Drug Saf. 2013;22(11):1189-94.
2. Choi JD, McCallum A. Transition-based Dependency Parsing with Selectional Branching. ACL’13. 2013;1052-1062.
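The mention masking and fold-averaged F1 evaluation described in the poster can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names (mask_mentions, bigrams, f1_score, mean_f1_over_folds), the whitespace tokenizer, the label strings, and the choice of "side_effect" as the positive class are all hypothetical, and the poster's dependency-parse features from ClearNLP and the SVM itself are not reproduced here.

```python
import re
from statistics import mean


def mask_mentions(sentence, treatments, conditions):
    """Replace known treatment/condition mentions with generic labels."""
    def substitute(text, terms, label):
        # Longest terms first so multi-word mentions win over their substrings.
        for term in sorted(terms, key=len, reverse=True):
            pattern = r"\b" + re.escape(term) + r"\b"
            text = re.sub(pattern, label, text, flags=re.IGNORECASE)
        return text

    masked = substitute(sentence, treatments, "_TREATMENT")
    return substitute(masked, conditions, "_CONDITION")


def bigrams(sentence):
    """Lowercased word bigrams; a crude stand-in for lemma-based features."""
    tokens = sentence.lower().split()
    return list(zip(tokens, tokens[1:]))


def f1_score(gold, predicted, positive="side_effect"):
    """F1 for one class (assumption: the poster does not say which F1 variant)."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mean_f1_over_folds(folds):
    """Average per-fold F1; each fold is a (gold_labels, predicted_labels) pair."""
    return mean(f1_score(gold, predicted) for gold, predicted in folds)


# Example sentence taken from Figure 1 of the poster.
masked = mask_mentions("I am taking cipro for my infection", ["cipro"], ["infection"])
features = bigrams(masked)
```

After masking, the bigram ("_treatment", "for") appears among the features, which loosely mirrors the Bigram(treatment, for) feature the poster reports as indicative of an indication. In a 5-fold setup, mean_f1_over_folds would receive five (gold, predicted) pairs, one per held-out fold.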