Finding Different Types of Medical Conditions:
From Data Generation to Automatic Classification
Nathan Artz (1), Stephen Doogan (1), Jinho D. Choi, PhD (2)
(1) Real Life Sciences, New York, NY; (2) Emory University, Atlanta, GA
rlsciences.com
info@rlsciences.com
quantitative.emory.edu
choi@mathcs.emory.edu
INTRODUCTION | APPROACH | RESULTS | IMPLICATIONS
• A key challenge to real-world outcomes research lies in
analyzing medical conditions in unstructured data. Central to
this is determining whether conditions are reported as
symptoms of disease, treatment indications or side effects.
• In analyzing texts from patient reporting systems (e.g., FAERS,
social media), researchers often use medical dictionaries such
as MedDRA to produce ranked frequency counts of medical
conditions [1]; however, this approach fails to differentiate
treatment indications from side effects, which is a
critical part of assessing treatment outcomes.
• Automated classification of medical conditions can provide a
deeper understanding of treatment outcomes and promote
efficient real-world data investigations across a variety of
patient-level data.
• In our study, we collected 19,313 spontaneous patient reports
about antibiotic treatment experiences from medical forums. We
annotated 760 sentences containing different types of medical
conditions, and classified 5,179 unique conditions into
indications and side effects across the set of reports using
natural language processing and machine learning techniques.
• Each of the 760 annotated sentences was given a binary label
indicating whether the condition played the role of an “indication”
or “side effect” of the given treatment. Specific treatment and
condition mentions in text were replaced by generic text labels
(i.e., _TREATMENT, _CONDITION) to prevent overfitting to the
antibiotic drug class.
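The placeholder substitution described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `mask_mentions` and the idea of passing in explicit term lists are assumptions made for the example, and the poster does not say how mentions were located in the text.

```python
import re

def mask_mentions(sentence, treatments, conditions):
    """Replace known treatment/condition mentions with generic labels.

    Hypothetical helper: the term lists are assumed to come from some
    upstream dictionary or annotation step that is not shown here.
    """
    def mask(text, terms, label):
        # Longest terms first so multi-word mentions win over their substrings.
        for term in sorted(terms, key=len, reverse=True):
            pattern = r"\b" + re.escape(term) + r"\b"
            text = re.sub(pattern, label, text, flags=re.IGNORECASE)
        return text

    sentence = mask(sentence, treatments, "_TREATMENT")
    return mask(sentence, conditions, "_CONDITION")

masked = mask_mentions(
    "I am taking cipro for my infection",
    treatments=["cipro"],
    conditions=["infection"],
)
print(masked)  # I am taking _TREATMENT for my _CONDITION
```

Because the classifier only ever sees the generic labels, a pattern learned from "cipro for my infection" should carry over to sentences about other drugs and conditions.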
• We used an SVM classifier with 5-fold cross-validation and
averaged the outcomes of the folds to determine the F1 scores
(see Figure 2).
• We manually reviewed some of the most impactful features of
the SVM to see which are most important when differentiating
the two classes of conditions.
• For conditions playing the role of indication, we see interesting
features such as Bigram(treatment, for) and
SyntacticParent(condition, for), as in “on steroids for the
inflammation”. For side effects, we observe features such as
SyntacticParent(condition, make) or SyntacticParent(condition,
give), as in “treatment made me sick” or “the drug gave me an
allergic reaction”, which heavily influence the decision boundary
of the SVM.
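The evaluation protocol (train on four folds, test on the fifth, average the five F1 scores) can be sketched in plain Python. Several things here are assumptions for illustration only: the fold assignment by index, the choice of "side_effect" as the positive class, the toy sentences, and above all `train_keyword_rule`, which is a trivial stand-in for SVM training and not the model used in the study.

```python
def f1_score(gold, predicted, positive):
    """F1 for one class from parallel lists of gold and predicted labels."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def k_fold_indices(n_items, k):
    """Yield (train, test) index lists; fold f tests items with index % k == f."""
    for fold in range(k):
        test = [i for i in range(n_items) if i % k == fold]
        train = [i for i in range(n_items) if i % k != fold]
        yield train, test

def cross_validated_f1(examples, labels, train_fn, k=5, positive="side_effect"):
    """Train on k-1 folds, score the held-out fold, and average the F1 scores."""
    scores = []
    for train, test in k_fold_indices(len(examples), k):
        model = train_fn([examples[i] for i in train], [labels[i] for i in train])
        predicted = [model(examples[i]) for i in test]
        scores.append(f1_score([labels[i] for i in test], predicted, positive))
    return sum(scores) / len(scores)

def train_keyword_rule(train_examples, train_labels):
    # Stand-in for SVM training: ignores the data and returns a fixed rule.
    return lambda s: "indication" if " for " in s else "side_effect"

# Toy masked sentences (invented for this sketch, not from the study data).
data = [
    ("on _TREATMENT for the _CONDITION", "indication"),
    ("_TREATMENT made me _CONDITION", "side_effect"),
    ("taking _TREATMENT for my _CONDITION", "indication"),
    ("_TREATMENT gave me a _CONDITION", "side_effect"),
    ("prescribed _TREATMENT for a _CONDITION", "indication"),
    ("got a _CONDITION after _TREATMENT", "side_effect"),
    ("I use _TREATMENT for _CONDITION", "indication"),
    ("_TREATMENT caused my _CONDITION", "side_effect"),
    ("started _TREATMENT for chronic _CONDITION", "indication"),
    ("had _CONDITION while on _TREATMENT", "side_effect"),
]
sentences = [s for s, _ in data]
labels = [y for _, y in data]
mean_f1 = cross_validated_f1(sentences, labels, train_keyword_rule, k=5)
print(f"mean F1 over 5 folds: {mean_f1:.2f}")
```

Swapping `train_keyword_rule` for a function that fits a real classifier on the training fold keeps the rest of the loop unchanged, which is the main reason for passing the trainer in as a parameter.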
• We believe this is a feasible approach to differentiating the roles
of medical conditions. This approach is suggested for use in
spontaneous reporting systems where many of the conditions
can be assumed to play side effect or indication roles.
• Our study can enable researchers, regulators, health insurers
and pharmaceutical companies to better monitor treatment risk
by filtering out instances of indications and automatically
identifying side effects in spontaneous reports.
• Our approach can provide significant time savings in
reviewing spontaneous reports for the purpose of side effect
identification, enabling untapped data to be leveraged for
real-world treatment outcomes research.
• Our follow-on work includes applying these models across
doctors’ notes in electronic health records (EHRs), case
record forms within pharmaceutical companies and signal
detection across social media and published literature.
Can a machine differentiate between side effects and indications in spontaneous patient reports?
EXPERIMENT  FEATURES                                               F1 SCORE
1           Bag of Lemmas                                          78.23
2           Bag of Lemmas + Syntactic Features                     84.26
3           Bag of Lemmas + Syntactic Features + Syntactic Roles   83.30
Figure 2. F1 scores for three experiments using different feature sets
REFERENCES
1. Gurulingappa H, Toldo L, Rajput AM, Kors JA, Taweel A, Tayrouz Y. Automatic detection of adverse events to predict drug label changes using text and data mining techniques. Pharmacoepidemiol Drug Saf. 2013;22(11):1189-94.
2. Choi JD, McCallum A. Transition-based Dependency Parsing with Selectional Branching. ACL’13. 2013:1052-1062.
Figure 1. An overview of the classification approach
[Figure: RAW TEXT -> DEPENDENCY GRAPHS -> STATISTICAL MODEL -> CLASSIFICATION OUTPUT. Example sentences: Indication, “I am taking cipro for my infection”; Side Effect, “I got another infection after I was put on cipro”. The dependency graphs are rooted at “take” and “have”, with arcs labeled agent, theme, cause and time.]
• Various combinations of syntactic and semantic features were
extracted from dependency graphs generated by an NLP toolkit,
ClearNLP [2], and were used for generating a statistical model
trained by support vector machines (SVM). This statistical model is
used for classifying medical conditions in raw text. Figure 1 shows
the overview of our approach.
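To make the feature extraction step more concrete, here is a small sketch that derives lemma, bigram and syntactic-parent features from a dependency parse. It rests on assumptions: the `Token` structure and the `extract_features` function are hypothetical, the two parses are written by hand rather than produced by ClearNLP, and the feature strings only imitate the naming used on this poster.

```python
from typing import NamedTuple

class Token(NamedTuple):
    lemma: str
    head: int  # index of the syntactic parent in the sentence, -1 for the root

def extract_features(tokens):
    """Collect bag-of-lemmas, adjacent-lemma bigram and parent features."""
    features = set()
    for i, tok in enumerate(tokens):
        features.add(f"Lemma({tok.lemma})")
        if i + 1 < len(tokens):
            features.add(f"Bigram({tok.lemma}, {tokens[i + 1].lemma})")
        if tok.head >= 0:
            parent = tokens[tok.head].lemma
            features.add(f"SyntacticParent({tok.lemma}, {parent})")
    return features

# Hand-written parse of "on _TREATMENT for the _CONDITION" (assumed structure).
indication = [
    Token("on", -1),
    Token("treatment", 0),
    Token("for", 0),
    Token("the", 4),
    Token("condition", 2),
]

# Hand-written parse of "_TREATMENT made me _CONDITION" (assumed structure).
side_effect = [
    Token("treatment", 1),
    Token("make", -1),
    Token("me", 1),
    Token("condition", 1),
]

for name, parse in [("indication", indication), ("side effect", side_effect)]:
    print(name, sorted(extract_features(parse)))
```

Each feature string would become one dimension of a sparse vector fed to the classifier. The parent features are what separate the two toy sentences: the condition hangs off "for" in one and off "make" in the other.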
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 

Finding Different Types of Medical Conditions: From Data Generation to Automatic Classification

Nathan Artz¹, Stephen Doogan¹, Jinho D. Choi, PhD²
¹Real Life Sciences, New York, NY; ²Emory University, Atlanta, GA
rlsciences.com
info@rlsciences.com
quantitative.emory.edu
choi@mathcs.emory.edu

Can a machine differentiate between side effects and indications in spontaneous patient reports?

INTRODUCTION
• A key challenge in real-world outcomes research lies in analyzing medical conditions in unstructured data. Central to this is determining whether conditions are reported as symptoms of disease, treatment indications, or side effects.
• In analyzing texts from patient reporting systems (e.g., FAERS, social media), researchers often use medical dictionaries such as MedDRA to produce ranked frequency counts of medical conditions [1]; however, this approach fails to differentiate treatment indications from side effects, which is a critical part of assessing treatment outcomes.
• Automated classification of medical conditions can provide a deeper understanding of treatment outcomes and promote efficient real-world data investigations across a variety of patient-level data.
• In our study, we collected 19,313 spontaneous patient reports about antibiotic treatment experiences from medical forums. We annotated 760 sentences containing different types of medical conditions, and classified 5,179 unique conditions into indications and side effects across the set of reports using natural language processing and machine learning techniques.

APPROACH
• Each of the 760 annotated sentences was given a binary label indicating whether the condition played the role of an “indication” or a “side effect” of the given treatment. Specific treatment and condition mentions in the text were replaced by generic text labels (i.e., _TREATMENT, _CONDITION) to prevent overfitting to the antibiotic drug class.
• Various combinations of syntactic and semantic features were extracted from dependency graphs generated by an NLP toolkit, ClearNLP [2], and were used to train a statistical model with support vector machines (SVM). This statistical model is used for classifying medical conditions in raw text. Figure 1 shows an overview of our approach.

[Figure 1. An overview of the classification approach. Panels: RAW TEXT, DEPENDENCY GRAPHS, STATISTICAL MODEL, CLASSIFICATION OUTPUT. Example conditions: Indication, “I am taking cipro for my infection”; Side Effect, “I got another infection after I was put on cipro”.]

RESULTS
• We used an SVM classifier with 5-fold cross validation and averaged the outcomes of the folds to determine the F1 scores (see Figure 2).
• We manually reviewed some of the most impactful features of the SVM to see which are most important when differentiating the two classes of conditions.
• For conditions playing the role of indication, we see interesting features such as Bigram(treatment, for) and SyntacticParent(condition, for), as in “on steroids for the inflammation”. For side effects, we observe features such as SyntacticParent(condition, make) or SyntacticParent(condition, give), as in “treatment made me sick” or “the drug gave me an allergic reaction”, which heavily influence the decision boundary of the SVM.

Figure 2. F1 scores for three experiments using different feature sets
EXPERIMENT    F1 SCORE
1             78.23
2             84.26
3             83.30
Feature sets used across the experiments: Bag of Lemmas; Bag of Lemmas + Syntactic Features; Bag of Lemmas + Syntactic Features + Syntactic Roles.

IMPLICATIONS
• We believe this is a feasible approach to differentiating the roles of medical conditions. We suggest it for spontaneous reporting systems, where many of the conditions can be assumed to play side effect or indication roles.
• Our study can enable researchers, regulators, health insurers, and pharmaceutical companies to better monitor treatment risk by filtering out instances of indications and automatically identifying side effects in spontaneous reports.
• Our approach can provide significant time savings in reviewing spontaneous reports for the purpose of side effect identification, enabling untapped data to be leveraged for real-world treatment outcomes research.
• Our follow-on work includes applying these models to doctors’ notes in electronic health records (EHRs), case record forms within pharmaceutical companies, and signal detection across social media and published literature.

REFERENCES
1. Gurulingappa H, Toldo L, Rajput AM, Kors JA, Taweel A, Tayrouz Y. Automatic detection of adverse events to predict drug label changes using text and data mining techniques. Pharmacoepidemiol Drug Saf. 2013;22(11):1189-94.
2. Choi JD, McCallum A. Transition-based Dependency Parsing with Selectional Branching. ACL’13. 2013:1052-1062.
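
The mention masking, the Bigram(treatment, for) style of feature, and the fold-averaged F1 score described on the poster can be illustrated with a minimal Python sketch. This is not the authors' implementation: the term lists, the function names (`mask_mentions`, `bigram_features`, `f1_score`, `mean_f1`), and the regex tokenizer are assumptions made for illustration, and the poster's syntactic features from ClearNLP dependency graphs and the SVM itself are not reproduced here.

```python
import re
from collections import Counter

# Hypothetical term lists; the poster does not publish its lexicons.
TREATMENTS = {"cipro", "steroids"}
CONDITIONS = {"infection", "inflammation", "allergic reaction"}


def mask_mentions(sentence, treatments=TREATMENTS, conditions=CONDITIONS):
    """Replace treatment and condition mentions with generic labels."""
    text = sentence.lower()
    # Longest terms first, so multi-word terms are matched before substrings.
    for term in sorted(conditions, key=len, reverse=True):
        text = re.sub(r"\b" + re.escape(term) + r"\b", "_CONDITION", text)
    for term in sorted(treatments, key=len, reverse=True):
        text = re.sub(r"\b" + re.escape(term) + r"\b", "_TREATMENT", text)
    return text


def bigram_features(masked_sentence):
    """Count adjacent token pairs, e.g. ('_TREATMENT', 'for')."""
    tokens = re.findall(r"[\w']+", masked_sentence)
    return Counter(zip(tokens, tokens[1:]))


def f1_score(gold, predicted, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mean_f1(folds, positive):
    """Average the per-fold F1 over (gold, predicted) label pairs."""
    return sum(f1_score(g, p, positive) for g, p in folds) / len(folds)
```

For example, `mask_mentions("I am taking cipro for my infection")` returns `"i am taking _TREATMENT for my _CONDITION"`, and the bigram counts for that string include `("_TREATMENT", "for")`, the kind of surface cue the poster associates with indications. In a full pipeline such counts would be one input to a classifier, with `mean_f1` applied to the predictions from each of the five held-out folds.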