Machine learning assisted citation screening for
Systematic Reviews
Anjani K. Dhrangadhariya et al.
MedGIFT group
University of Applied Sciences Western Switzerland (HES-SO)
Slides adapted from MIE2020 conference
Background
• Physicians moving away from anecdotal evidence for treatment
• Shift towards Evidence-based Medicine (EBM)
• Combining best available evidence with clinical judgement
Background
• Evidence aggregation?
• Systematic reviews
• Example questions:
1. Will acupuncture combined with the drug modafinil reduce fatigue in adult cancer patients?
2. Could Botox injections reduce abnormal muscle function in children aged 8-14 years?
3. Will absorbable sutures be better than metal sutures during abdominoplasty for older patients?
Background
1. Formulate clinical question
2. Search & collect scientific studies
• Google Scholar
• PubMed
• EMBASE
• CINAHL
• PsycINFO
• Cochrane CENTRAL
• ClinicalTrials.gov
• Grey literature
3. Manually screen the studies
• Relevant studies: Include → Further analysis
• Irrelevant studies: Exclude
Motivation
Pros:
• Systematic process
• Evidence from multiple sources
• Reliable results
Cons:
• Time consuming
  • 12–24 months
• Laborious
  • At least two doctors need to manually screen the studies
  • ~90,000 studies each
• High cost
  • Extra salary for doctors!
• And all of this for a single question!
Motivation and Objective
• Automatic screening could ease doctors’ burden.
• Step targeted for automation: manual screening of the studies into relevant (Include) and irrelevant (Exclude)
Data source
1. Generate word embeddings
• PubMed Central – Open Access: 2.09 million titles and abstracts to train word embeddings.
2. Explore screening automation PoC
• Dataset from Hilfiker et al. (2017)
• 31,279 titles and abstracts
• Manually screened into two classes (labels)
  • 4,066 – Include (I)
  • 27,213 – Exclude (E)
• Class imbalance: I < E
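The class counts above imply a strong imbalance. A minimal sketch of the arithmetic, using only the counts reported on this slide (the variable names are illustrative):

```python
# Counts reported on the slide for the Hilfiker et al. (2017) dataset.
n_include = 4066
n_exclude = 27213
n_total = n_include + n_exclude  # matches the 31,279 screened records

include_share = n_include / n_total      # fraction of records labelled Include
imbalance_ratio = n_exclude / n_include  # Exclude records per Include record

# Accuracy of a trivial model that labels every record Exclude.
majority_baseline_accuracy = n_exclude / n_total

print(f"total={n_total}")
print(f"include share={include_share:.1%}")
print(f"exclude:include ratio={imbalance_ratio:.1f}:1")
print(f"always-Exclude accuracy={majority_baseline_accuracy:.1%}")
```

Because a model that always answers Exclude already scores high on accuracy here, accuracy alone says little about how well relevant studies are found.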
Methodology
1. Generate word embeddings
• Input: 2.09 million titles and abstracts from PMC
• Output: word embeddings
• Further text preprocessing and hyperparameter information in the paper.
Pipeline:
• Text preprocessing
• Word tokenization (unigrams) and phrase generation with word2phrase (bigrams)
• Training with word2vec and fastText
• Result: word2vec embeddings and fastText embeddings
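The phrase-generation step can be illustrated with the bigram scoring rule described by Mikolov et al. for word2vec phrases, score(a, b) = (count(ab) - delta) / (count(a) * count(b)). This is only a sketch under assumptions: the toy corpus and the `delta` and `threshold` values are hypothetical, not the settings used in the paper, and the function names are made up for illustration.

```python
from collections import Counter

def find_phrases(sentences, delta=1.0, threshold=0.1):
    """Return bigrams whose score (count(ab) - delta) / (count(a) * count(b)) exceeds threshold."""
    unigrams = Counter()
    bigrams = Counter()
    for tokens in sentences:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    phrases = set()
    for (a, b), n_ab in bigrams.items():
        score = (n_ab - delta) / (unigrams[a] * unigrams[b])
        if score > threshold:
            phrases.add((a, b))
    return phrases

def merge_phrases(tokens, phrases):
    """Greedily join detected bigrams into single tokens, e.g. physical_therapy."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in phrases:
            merged.append(tokens[i] + "_" + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

# Hypothetical toy corpus, already lowercased and tokenized.
corpus = [
    "systematic review of physical therapy".split(),
    "a systematic review of exercise".split(),
    "physical therapy after surgery".split(),
]
phrases = find_phrases(corpus)
tokenized = [merge_phrases(sentence, phrases) for sentence in corpus]
print(tokenized)
```

The merged token lists (unigrams plus joined bigrams) are the kind of input that an embedding trainer such as word2vec or fastText would then consume.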
Methodology
2. Screening automation
• Input: 31,279 titles and abstracts
• Train-test split: 80-20%
Pipeline:
• Input corpus: titles and abstracts (I < E)
• Deduplication
• Text preprocessing
• No oversampling or random oversampling
• Feature extraction
• Classifier training and test
Classifiers:
• Logistic Regression
• Support Vector Machines
• K-Nearest Neighbor
• Decision Trees
• Random Forests
• Convolutional Neural Networks
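Random oversampling duplicates randomly chosen minority-class records until the classes are the same size. The sketch below uses only the Python standard library and hypothetical toy records. It applies oversampling to the training portion of an 80-20 split so that duplicated records cannot leak into the test set; the slide does not state whether the authors ordered the steps this way, so treat that ordering as an assumption.

```python
import random

def train_test_split(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of the records and split it into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def random_oversample(records, seed=0):
    """Duplicate randomly chosen records of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    by_label = {}
    for text, label in records:
        by_label.setdefault(label, []).append((text, label))
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        balanced.extend(group + extra)
    rng.shuffle(balanced)
    return balanced

# Hypothetical toy data: 1 in 8 records is labelled include, the rest exclude.
records = [(f"abstract {i}", "include" if i % 8 == 0 else "exclude") for i in range(200)]
train, test = train_test_split(records)
train_balanced = random_oversample(train)
print(len(train), len(test), len(train_balanced))
```

The "no oversampling" variant simply trains on `train` as it is, which keeps the original class ratio.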
Results
[Figure: classifier results with random oversampling and with no oversampling]
Discussion
• Exclude (E) vs. Include (I) class centroids
• Cosine similarity (centroid): 0.985814
• Cosine similarity (centroid): 0.985824
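A centroid comparison of this kind can be sketched as follows: average the document vectors of each class, then take the cosine of the angle between the two mean vectors. The vectors below are made-up toy values, not embeddings from the paper, and the helper names are illustrative.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def cosine_similarity(u, v):
    """Cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical 3-dimensional document vectors for each class.
include_vectors = [[0.9, 0.1, 0.3], [0.8, 0.2, 0.4]]
exclude_vectors = [[0.7, 0.3, 0.3], [0.9, 0.2, 0.5]]

similarity = cosine_similarity(centroid(include_vectors), centroid(exclude_vectors))
print(f"cosine similarity (centroid): {similarity:.6f}")
```

A value close to 1 means the two class centroids point in almost the same direction, which is consistent with the class overlap raised in the conclusion.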
Conclusion
• One of the first studies to explore citation screening for such a narrow topic (physiotherapy)
• Using domain-specific word embeddings for citation screening
• Exploring the topics of
  • Class imbalance
  • Class overlap
• This problem is disguised as a classification problem, but it is not one.
Further reading
• Dhrangadhariya A, Hilfiker R, Schaer R, Müller H. “Machine Learning Assisted Citation Screening for Systematic Reviews.” Stud Health Technol Inform. 2020;270:302-306. doi:10.3233/SHTI200171
• Summary of papers in BioNLP workshop from ACL2020: https://medium.com/@recurrent.pi/summarizing-papers-from-bionlp-workshop-in-acl2020-a6ba3d937705
Thank you for your attention
Email anjani.dhrangadhariya@hevs.ch
LinkedIn https://www.linkedin.com/in/anjani-dhrangadhariya/
Twitter https://twitter.com/AsciiRandom
Medium https://medium.com/@recurrent.pi
