Using machine learning to improve the user experience in online health care communities
Dr. Anja Pilz
June 25, 2018
Overview
1. Introduction
2. Content Based Recommendations
Latent Dirichlet Allocation
3. User Based Recommendations
Association Rule Learning
4. Ensemble Model
5. Conclusion & Outlook
Dr. Anja Pilz June 25, 2018 1 / 17
About DocCheck
Online medical community for health care professionals
• seek information in the medical wiki Flexikon
• read the bi-weekly newsletter
• share and discuss medical images and videos
• buy medical products and supplies in the online shop
• exchange with peers: seek help or discuss cases
Motivation
Diverse user groups with different intentions and interests
• a student might want to learn anatomical topics in some order
• a nephrologist has a different focus of interest than a cardiologist
• a pharmacist might prefer reading pharma-related news
Long term goal
• find the most relevant and interesting assets for each group to enable targeted mailing & feed personalization
DocCheck Recommendation Engine
Provide related content for every asset
• Flexikon articles, pictures, videos, shop products, and news
Diverse data types
• how is a text/video/picture/shop product relevant?
Hybrid Model: content & user driven
• thematic relevance from text
• user preference from click journeys
Ensemble of two ML techniques
• Latent Dirichlet Allocation
• Association Rule Learning
Content Based Recommendations
Why?
• Cold start problem: we also want to propose related content for new assets without observed interactions
How?
• Represent the textual content of an asset in a Bag-of-Words (BoW) model
• Find relevant assets using some similarity function (clustering)
But!
• Curse of dimensionality: at some point, high-dimensional BoW vectors "all look the same"
• The BoW model can't handle synonymy or polysemy
  • e.g. the vectors for Mammakarzinom and Brustkrebs (both German terms for breast cancer) have no similarity
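The synonymy problem above can be illustrated with a minimal sketch. It assumes whitespace tokenization, raw term counts, and cosine similarity; the slides do not say which similarity function was used, and the example documents are invented for illustration.

```python
import math
from collections import Counter

def bow(text):
    """Bag-of-words vector: lowercase whitespace tokens mapped to counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical two-word documents.
doc_a = bow("Mammakarzinom Therapie")
doc_b = bow("Brustkrebs Behandlung")   # same meaning, different tokens
doc_c = bow("Mammakarzinom Diagnose")  # shares one token with doc_a

sim_synonyms = cosine(doc_a, doc_b)  # no shared tokens, so the similarity is zero
sim_shared = cosine(doc_a, doc_c)    # one shared token out of two
```

Although doc_a and doc_b describe the same disease, they share no token, so a pure BoW comparison treats them as unrelated.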
Latent Dirichlet Allocation (LDA)
• LDA is a Bayesian probabilistic approach to topic modeling
• allows for a low-dimensional, continuous representation of documents
Generative model
• assumes a fixed number K of underlying (latent) topics in a document collection
• each document is a mixture of topics; it is generated by first picking a distribution over the latent topics
• given this mixture, the topic of each word is chosen and, given its topic, each word is generated
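The generative story above can be sketched in a few lines. This is only an illustration under stated assumptions: the two topics and their word probabilities are invented and held fixed (in full LDA the per-topic word distributions are themselves drawn from a Dirichlet prior), and `sample_dirichlet` and `generate_document` are hypothetical helper names.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(topics, alpha, n_words, rng):
    """Generate one document following the LDA generative story."""
    topic_names = list(topics)
    # 1. Pick the document's mixture over the K latent topics.
    theta = sample_dirichlet(alpha, rng)
    words = []
    for _ in range(n_words):
        # 2. Choose the topic of this word from the mixture.
        topic = rng.choices(topic_names, weights=theta, k=1)[0]
        # 3. Given the topic, draw the word from that topic's word distribution.
        vocab, probs = zip(*topics[topic].items())
        words.append(rng.choices(vocab, weights=probs, k=1)[0])
    return theta, words

# Toy topics with hypothetical word probabilities (K = 2).
TOPICS = {
    "sports": {"ball": 0.5, "goal": 0.3, "team": 0.2},
    "politics": {"vote": 0.5, "party": 0.3, "law": 0.2},
}

rng = random.Random(0)
theta, words = generate_document(TOPICS, alpha=[0.5, 0.5], n_words=8, rng=rng)
```

Training LDA runs this story in reverse: given only the observed words, it infers the topic mixtures and topic word distributions most likely to have produced them.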
Basic Idea of LDA
• you know stuff about 20 topics and want to write some text
• you decide on some of the topics you want to write about (a bit of sports, a bit of politics)
• you need words related to these topics to express yourself, e.g. a round object associated with sports
• you pick one, for instance "ball", and write it down
Example: Topics from Flexikon & News
Topics generated by LDA are clusters of words that often co-occur (English topic label, followed by the German top words)
• drugs: arzneimittel, medikament, präparat, apotheker, tablette, arzt, einnahme, verordnung, compliance
• enzymes: enzym, biochemie, aktivität, hemmung, substrat, reaktion, inhibitor, spaltung, protease
• eyes: auge, augenheilkunde, netzhaut, cornea, hornhaut, linse, glaukom, retina, iris
• heart: herz, kardiologie, herzinsuffizienz, ekg, kardiomyopathie, herzmuskelzelle, herzfrequenz
• vaccine: impfung, impfstoff, masern, immunisierung, vakzine, schutz, röteln, antikörper, polio
• diet: ernährung, fleisch, nahrungsmittel, gemüse, nahrung, diät, obst, zucker, lebensmittel
Example: Renal Failure
• prominent topics: urea excretion, kidneys (and more...)
  • urea excretion: urin, kreatinin, clearance, nierenfunktion, gfr, niere, harnstoff, niereninsuffizienz
  • kidneys: niere, dialyse, glomerulonephritis, nephrologie, proteinurie, nierenversagen
• inferred topics: topic probability distribution with peaks at the most prominent topics, e.g. p(urea excretion) = 0.4, p(kidneys) = 0.3, ...
LDA Workflow
Training
• fetch corpus: content of all Flexikon and News articles
• do some preprocessing
  • remove stopwords
  • keep only "medical terms" (MeSH), Named Entities, nouns, ...
• feed the documents into MALLET & train the model
• run inference on all documents & store the individual topic distribution of each asset
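The preprocessing step above might look roughly like the following sketch. The stopword list and the term whitelist are tiny hypothetical stand-ins (a real setup would use a full stopword list and the MeSH vocabulary, plus named-entity and part-of-speech filters), and `preprocess` is an illustrative name, not part of MALLET.

```python
import re

# Hypothetical stand-ins for a full stopword list and the MeSH vocabulary.
STOPWORDS = {"der", "die", "das", "und", "ist", "bei", "eine", "mit"}
MEDICAL_TERMS = {"niere", "dialyse", "kreatinin", "niereninsuffizienz", "harnstoff"}

def preprocess(text, stopwords=STOPWORDS, keep_terms=MEDICAL_TERMS):
    """Lowercase, tokenize, drop stopwords, and keep only whitelisted terms."""
    tokens = re.findall(r"\w+", text.lower())
    return [t for t in tokens if t not in stopwords and t in keep_terms]

tokens = preprocess("Die Dialyse ist eine Therapie bei Niereninsuffizienz.")
```

Restricting the vocabulary this way keeps the topics focused on medical content instead of general-language words.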
Finding Thematically Related Content
New assets
• first apply the trained model to infer and store the topic distribution
Determine relevant links
• fetch stored distribution
• find similar topic distributions using some similarity measure
  • e.g. Kullback-Leibler divergence of the topic distributions
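As a rough sketch of the lookup above, the following ranks stored assets by Kullback-Leibler divergence from a query asset. The asset names and distributions are invented, and the direction KL(query || candidate) is an assumption. KL divergence is asymmetric, so a symmetrized variant such as Jensen-Shannon divergence would be a reasonable alternative.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same topics."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

def most_similar(query_id, distributions, top_n=2):
    """Rank all other assets by ascending KL divergence from the query asset."""
    p = distributions[query_id]
    scored = [
        (kl_divergence(p, q), asset_id)
        for asset_id, q in distributions.items()
        if asset_id != query_id
    ]
    return [asset_id for _, asset_id in sorted(scored)[:top_n]]

# Hypothetical stored topic distributions (3 topics per asset).
stored = {
    "renal_failure": [0.4, 0.3, 0.3],
    "dialysis": [0.5, 0.3, 0.2],
    "glaucoma": [0.05, 0.05, 0.9],
}

related = most_similar("renal_failure", stored, top_n=1)
```

A smaller divergence means more similar topic distributions, so the kidney-related asset ranks ahead of the eye-related one.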
Association Rule Learning
Basket Analysis
• given the items in a basket, what other items is someone likely to buy?
DocCheck
• given the clicks in a session, which other links is a user likely to click?
Motivation
• clicks give direct feedback: a click on a link can be read as "this is relevant to me"
• no need to define a relatedness measure between pictures and texts
Association Rule Learning
Identify rules in a database using some measure of confidence
• database is the collection of all user journeys
• each rule X ⇒ Y is composed of two itemsets X and Y
• instead of items in a basket, we use the set of clicks in a session, i.e. {url1, url2, ...}, to learn rules
• confidence: the proportion of sessions containing X that also contain Y, i.e. conf(X ⇒ Y) = supp(X ∪ Y) / supp(X)
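The confidence measure can be computed directly from the sessions. The sketch below uses the standard definitions of support and confidence. The sessions are invented, and sessions are assumed to be plain sets of URLs.

```python
def support(itemset, sessions):
    """Fraction of sessions that contain every item of the itemset."""
    itemset = set(itemset)
    return sum(itemset <= session for session in sessions) / len(sessions)

def confidence(x, y, sessions):
    """conf(X => Y) = supp(X ∪ Y) / supp(X)."""
    supp_x = support(x, sessions)
    return support(set(x) | set(y), sessions) / supp_x if supp_x else 0.0

# Hypothetical click sessions, each a set of visited URLs.
sessions = [
    {"url1", "url2", "url3"},
    {"url1", "url2", "url3"},
    {"url1", "url2"},
    {"url2", "url4"},
]

# X = {url1, url2} occurs in 3 sessions, 2 of which also contain url3.
conf = confidence({"url1", "url2"}, {"url3"}, sessions)
```

Here two of the three sessions containing both url1 and url2 also contain url3, giving a confidence of 2/3.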
Association Rule Learning Workflow
Training
• split user journeys into sessions and form frequent itemsets
• learn association rules from these itemsets
• store learned rules together with their weight, e.g. X = {url1, url2}, Y = {url3}, conf(X ⇒ Y) = 0.9
Application
• new assets: ¯\_(ツ)_/¯ (no click data yet, hence no rules)
• known asset
  • fetch all rules containing the current asset (URL)
  • based on the associated confidence, combine their URLs into a set of recommended links
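The application step for a known asset could be sketched as follows. The rule store is a hypothetical in-memory list, and the combination strategy (keep the highest confidence seen for each candidate URL) is an assumption, since the slides do not specify how confidences are combined.

```python
from collections import defaultdict

# Hypothetical learned rules: (antecedent X, consequent Y, confidence).
RULES = [
    (frozenset({"url1", "url2"}), frozenset({"url3"}), 0.9),
    (frozenset({"url1"}), frozenset({"url4"}), 0.6),
    (frozenset({"url5"}), frozenset({"url1", "url3"}), 0.4),
]

def recommend(current_url, rules, top_n=3):
    """Collect URLs from all rules that mention current_url, ranked by confidence."""
    scores = defaultdict(float)
    for antecedent, consequent, conf in rules:
        items = antecedent | consequent
        if current_url not in items:
            continue
        for url in items - {current_url}:
            # Assumption: keep the highest confidence seen for each candidate URL.
            scores[url] = max(scores[url], conf)
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [url for url, _ in ranked[:top_n]]

links = recommend("url1", RULES)
unknown = recommend("url9", RULES)  # new asset: no rules, so no recommendations
```

The empty result for an unseen URL is exactly the cold-start gap that the content-based model is meant to cover.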
Ensemble Model
Why?
• avoid the cold-start problem: provide high quality recommendations for both new and known assets
• prioritize "labeled" data from user sessions
How?
• ask both models for a prediction
• combine the results in a weighted way, giving the user-driven model (AR) some boost
Reinforcement learning: which model returns better predictions?
• track which predictions are being clicked
• evaluate prevalence & update weights
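One way to picture the weighted combination and the weight update is the sketch below. Everything in it is an assumption: both models are taken to return relevance scores already normalized to [0, 1], the initial weights are invented, and the update is a simple moving average toward each model's share of clicks rather than a full reinforcement learning algorithm.

```python
def combine(lda_scores, ar_scores, weights, top_n=3):
    """Weighted sum of per-model scores; a missing score counts as 0."""
    urls = set(lda_scores) | set(ar_scores)
    combined = {
        url: weights["lda"] * lda_scores.get(url, 0.0)
        + weights["ar"] * ar_scores.get(url, 0.0)
        for url in urls
    }
    ranked = sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))
    return [url for url, _ in ranked[:top_n]]

def update_weights(weights, clicks, learning_rate=0.1):
    """Move each weight toward that model's share of the tracked clicks."""
    total = sum(clicks.values())
    if total == 0:
        return dict(weights)
    return {
        model: (1 - learning_rate) * w + learning_rate * clicks.get(model, 0) / total
        for model, w in weights.items()
    }

# Hypothetical scores; the AR model starts with a boost.
weights = {"lda": 0.4, "ar": 0.6}
lda_scores = {"url_a": 0.9, "url_b": 0.5}
ar_scores = {"url_b": 0.8, "url_c": 0.7}

ranking = combine(lda_scores, ar_scores, weights)
new_weights = update_weights(weights, clicks={"lda": 30, "ar": 70})
```

Because the click shares sum to one, the updated weights still sum to one, and a model whose predictions are clicked more often gradually gains influence.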
Conclusion & Outlook
Combine content based and user generated data
• avoid the cold-start problem through the content based model (LDA)
• adjust to user behavior through click journeys (AR)
• requires initial fine-tuning but little maintenance work
Next steps: enhanced retrieval for related pictures and videos
• if image/video has no description or interaction: ¯\_(ツ)_/¯
• use image or video analysis tools (work in progress...)
Thanks!
Dr. Anja Pilz June 25, 2018 17 / 17

More Related Content

What's hot

Non academic clinical researcher
Non academic clinical researcherNon academic clinical researcher
Non academic clinical researcher
Shantanu Patil
 
Finding Empirical Evidence, C: Guidelines and Protocols
Finding Empirical Evidence,  C: Guidelines and Protocols Finding Empirical Evidence,  C: Guidelines and Protocols
Finding Empirical Evidence, C: Guidelines and Protocols
Lucia Ravi
 
VET0100 The library and information (2021)
VET0100 The library and information (2021)VET0100 The library and information (2021)
VET0100 The library and information (2021)
Middlesex University
 
NISO Apr 29 Virtual Conference: Definitions for appropriate metrics and calcu...
NISO Apr 29 Virtual Conference: Definitions for appropriate metrics and calcu...NISO Apr 29 Virtual Conference: Definitions for appropriate metrics and calcu...
NISO Apr 29 Virtual Conference: Definitions for appropriate metrics and calcu...
National Information Standards Organization (NISO)
 
Law Research Visibility Workshop 3
Law Research Visibility Workshop 3Law Research Visibility Workshop 3
Law Research Visibility Workshop 3
Elizabeth Moll-Willard
 
Using Library Resources for your Dissertation
Using Library Resources for your DissertationUsing Library Resources for your Dissertation
Using Library Resources for your Dissertation
Gaz Johnson
 
EUA questionnaire on Open Access: 2016/17 Survey Results
EUA questionnaire on Open Access: 2016/17 Survey Results EUA questionnaire on Open Access: 2016/17 Survey Results
EUA questionnaire on Open Access: 2016/17 Survey Results
European University Association
 
Workshop on educating the workshop for openEHR implementation at Medinfo 2015
Workshop on educating the workshop for openEHR implementation at Medinfo 2015Workshop on educating the workshop for openEHR implementation at Medinfo 2015
Workshop on educating the workshop for openEHR implementation at Medinfo 2015
Silje Ljosland Bakke
 
NISO Apr 29 Virtual Conference: Development of specific definitions for alter...
NISO Apr 29 Virtual Conference: Development of specific definitions for alter...NISO Apr 29 Virtual Conference: Development of specific definitions for alter...
NISO Apr 29 Virtual Conference: Development of specific definitions for alter...
National Information Standards Organization (NISO)
 
Finding Empirical Evidence, A: Grey Literature
Finding Empirical Evidence, A: Grey LiteratureFinding Empirical Evidence, A: Grey Literature
Finding Empirical Evidence, A: Grey Literature
Lucia Ravi
 
National governance of archetypes in Norway
National governance of archetypes in NorwayNational governance of archetypes in Norway
National governance of archetypes in Norway
Silje Ljosland Bakke
 
MSc transneuro & gastro 2013-14
MSc transneuro & gastro 2013-14MSc transneuro & gastro 2013-14
MSc transneuro & gastro 2013-14
PaulaFunnell
 
Dr Julian O'kelly
Dr Julian O'kellyDr Julian O'kelly
Dr Julian O'kelly
Catherine Elliott
 
ANDS Webinar. Data Management Policies and People
ANDS Webinar. Data Management Policies and PeopleANDS Webinar. Data Management Policies and People
ANDS Webinar. Data Management Policies and People
Julia Gross
 
Topic what specific health conditions increase the risk o
Topic         what specific health conditions increase the risk oTopic         what specific health conditions increase the risk o
Topic what specific health conditions increase the risk o
raju957290
 
Through the keyhole: why do we need research on research?
Through the keyhole: why do we need research on research?Through the keyhole: why do we need research on research?
Through the keyhole: why do we need research on research?
Emma Kirkpatrick
 
Professor Pip Logan
Professor Pip LoganProfessor Pip Logan
Professor Pip Logan
Catherine Elliott
 
Research Information, Networking, and Collaboration - strategy and challenges
Research Information, Networking, and Collaboration - strategy and challengesResearch Information, Networking, and Collaboration - strategy and challenges
Research Information, Networking, and Collaboration - strategy and challenges
Library_Connect
 
Data analytics in Healthcare
Data analytics in HealthcareData analytics in Healthcare
Data analytics in Healthcare
Jorge A. Gaspar
 
Practical challenges for researchers in data sharing
Practical challenges for researchers in data sharingPractical challenges for researchers in data sharing
Practical challenges for researchers in data sharing
Varsha Khodiyar
 

What's hot (20)

Non academic clinical researcher
Non academic clinical researcherNon academic clinical researcher
Non academic clinical researcher
 
Finding Empirical Evidence, C: Guidelines and Protocols
Finding Empirical Evidence,  C: Guidelines and Protocols Finding Empirical Evidence,  C: Guidelines and Protocols
Finding Empirical Evidence, C: Guidelines and Protocols
 
VET0100 The library and information (2021)
VET0100 The library and information (2021)VET0100 The library and information (2021)
VET0100 The library and information (2021)
 
NISO Apr 29 Virtual Conference: Definitions for appropriate metrics and calcu...
NISO Apr 29 Virtual Conference: Definitions for appropriate metrics and calcu...NISO Apr 29 Virtual Conference: Definitions for appropriate metrics and calcu...
NISO Apr 29 Virtual Conference: Definitions for appropriate metrics and calcu...
 
Law Research Visibility Workshop 3
Law Research Visibility Workshop 3Law Research Visibility Workshop 3
Law Research Visibility Workshop 3
 
Using Library Resources for your Dissertation
Using Library Resources for your DissertationUsing Library Resources for your Dissertation
Using Library Resources for your Dissertation
 
EUA questionnaire on Open Access: 2016/17 Survey Results
EUA questionnaire on Open Access: 2016/17 Survey Results EUA questionnaire on Open Access: 2016/17 Survey Results
EUA questionnaire on Open Access: 2016/17 Survey Results
 
Workshop on educating the workshop for openEHR implementation at Medinfo 2015
Workshop on educating the workshop for openEHR implementation at Medinfo 2015Workshop on educating the workshop for openEHR implementation at Medinfo 2015
Workshop on educating the workshop for openEHR implementation at Medinfo 2015
 
NISO Apr 29 Virtual Conference: Development of specific definitions for alter...
NISO Apr 29 Virtual Conference: Development of specific definitions for alter...NISO Apr 29 Virtual Conference: Development of specific definitions for alter...
NISO Apr 29 Virtual Conference: Development of specific definitions for alter...
 
Finding Empirical Evidence, A: Grey Literature
Finding Empirical Evidence, A: Grey LiteratureFinding Empirical Evidence, A: Grey Literature
Finding Empirical Evidence, A: Grey Literature
 
National governance of archetypes in Norway
National governance of archetypes in NorwayNational governance of archetypes in Norway
National governance of archetypes in Norway
 
MSc transneuro & gastro 2013-14
MSc transneuro & gastro 2013-14MSc transneuro & gastro 2013-14
MSc transneuro & gastro 2013-14
 
Dr Julian O'kelly
Dr Julian O'kellyDr Julian O'kelly
Dr Julian O'kelly
 
ANDS Webinar. Data Management Policies and People
ANDS Webinar. Data Management Policies and PeopleANDS Webinar. Data Management Policies and People
ANDS Webinar. Data Management Policies and People
 
Topic what specific health conditions increase the risk o
Topic         what specific health conditions increase the risk oTopic         what specific health conditions increase the risk o
Topic what specific health conditions increase the risk o
 
Through the keyhole: why do we need research on research?
Through the keyhole: why do we need research on research?Through the keyhole: why do we need research on research?
Through the keyhole: why do we need research on research?
 
Professor Pip Logan
Professor Pip LoganProfessor Pip Logan
Professor Pip Logan
 
Research Information, Networking, and Collaboration - strategy and challenges
Research Information, Networking, and Collaboration - strategy and challengesResearch Information, Networking, and Collaboration - strategy and challenges
Research Information, Networking, and Collaboration - strategy and challenges
 
Data analytics in Healthcare
Data analytics in HealthcareData analytics in Healthcare
Data analytics in Healthcare
 
Practical challenges for researchers in data sharing
Practical challenges for researchers in data sharingPractical challenges for researchers in data sharing
Practical challenges for researchers in data sharing
 

Similar to Using machine learning to improve the user experience in online health care communities

Paolo Budroni at COAR Annual Meeting
Paolo Budroni at COAR Annual MeetingPaolo Budroni at COAR Annual Meeting
Paolo Budroni at COAR Annual Meeting
LEARN Project
 
Formulating a literature review
Formulating a literature reviewFormulating a literature review
Formulating a literature review
Lynn Hendricks
 
How to Conduct a Literature Review
How to Conduct a Literature ReviewHow to Conduct a Literature Review
How to Conduct a Literature Review
Robin Featherstone
 
Eisenhower Medical Center Evidence Based Practice 7/8/2014
Eisenhower Medical Center Evidence Based Practice 7/8/2014Eisenhower Medical Center Evidence Based Practice 7/8/2014
Eisenhower Medical Center Evidence Based Practice 7/8/2014
re_johns
 
Lecture 1 Introduction to Nx Research (1)(1).pptx
Lecture 1 Introduction to Nx Research (1)(1).pptxLecture 1 Introduction to Nx Research (1)(1).pptx
Lecture 1 Introduction to Nx Research (1)(1).pptx
AbdallahAlasal1
 
AT 501 7 8 2016
AT 501 7 8 2016AT 501 7 8 2016
AT 501 7 8 2016
Linda Galloway
 
Evidence based research
Evidence based research Evidence based research
Evidence based research
Smriti Arora
 
NURS201 - May 2013
NURS201 - May 2013 NURS201 - May 2013
NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider Wor...
NISO/NFAIS Joint Virtual Conference:  Connecting the Library to the Wider Wor...NISO/NFAIS Joint Virtual Conference:  Connecting the Library to the Wider Wor...
NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider Wor...
National Information Standards Organization (NISO)
 
Incentives for modern research
Incentives for modern researchIncentives for modern research
Incentives for modern research
Jisc
 
Journal Club Evolution-HLG Jan 2022
Journal Club Evolution-HLG Jan 2022Journal Club Evolution-HLG Jan 2022
Journal Club Evolution-HLG Jan 2022
FranHarkness1
 
Boosting Research Productivity
Boosting Research ProductivityBoosting Research Productivity
Boosting Research Productivity
Nicole Capdarest-Arest
 
Clinical Research Informatics Year-in-Review
Clinical Research Informatics Year-in-ReviewClinical Research Informatics Year-in-Review
Clinical Research Informatics Year-in-Review
Peter Embi
 
Academic literature review
Academic literature review Academic literature review
Academic literature review
Harris Abd Hamid
 
Nursing Leadership Institute Presentation
Nursing Leadership Institute Presentation Nursing Leadership Institute Presentation
Nursing Leadership Institute Presentation
Virginia Commonwealth University
 
Steffen Frederiksen: DATA, DITA, DOCX
Steffen Frederiksen: DATA, DITA, DOCXSteffen Frederiksen: DATA, DITA, DOCX
Steffen Frederiksen: DATA, DITA, DOCX
Jack Molisani
 
Oral Health Research (lecture 2)
Oral Health Research (lecture 2)Oral Health Research (lecture 2)
Oral Health Research (lecture 2)
Martin Morris
 
VET2703 literature searching 2016
VET2703 literature searching 2016VET2703 literature searching 2016
VET2703 literature searching 2016
JoWilson13
 
Evidence based practice in application
Evidence based practice in applicationEvidence based practice in application
Evidence based practice in application
Ahmad Amirdash
 
Evidence based practice in application
Evidence based practice in applicationEvidence based practice in application
Evidence based practice in application
Ahmad Amirdash
 

Similar to Using machine learning to improve the user experience in online health care communities (20)

Paolo Budroni at COAR Annual Meeting
Paolo Budroni at COAR Annual MeetingPaolo Budroni at COAR Annual Meeting
Paolo Budroni at COAR Annual Meeting
 
Formulating a literature review
Formulating a literature reviewFormulating a literature review
Formulating a literature review
 
How to Conduct a Literature Review
How to Conduct a Literature ReviewHow to Conduct a Literature Review
How to Conduct a Literature Review
 
Eisenhower Medical Center Evidence Based Practice 7/8/2014
Eisenhower Medical Center Evidence Based Practice 7/8/2014Eisenhower Medical Center Evidence Based Practice 7/8/2014
Eisenhower Medical Center Evidence Based Practice 7/8/2014
 
Lecture 1 Introduction to Nx Research (1)(1).pptx
Lecture 1 Introduction to Nx Research (1)(1).pptxLecture 1 Introduction to Nx Research (1)(1).pptx
Lecture 1 Introduction to Nx Research (1)(1).pptx
 
AT 501 7 8 2016
AT 501 7 8 2016AT 501 7 8 2016
AT 501 7 8 2016
 
Evidence based research
Evidence based research Evidence based research
Evidence based research
 
NURS201 - May 2013
NURS201 - May 2013 NURS201 - May 2013
NURS201 - May 2013
 
NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider Wor...
NISO/NFAIS Joint Virtual Conference:  Connecting the Library to the Wider Wor...NISO/NFAIS Joint Virtual Conference:  Connecting the Library to the Wider Wor...
NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider Wor...
 
Incentives for modern research
Incentives for modern researchIncentives for modern research
Incentives for modern research
 
Journal Club Evolution-HLG Jan 2022
Journal Club Evolution-HLG Jan 2022Journal Club Evolution-HLG Jan 2022
Journal Club Evolution-HLG Jan 2022
 
Boosting Research Productivity
Boosting Research ProductivityBoosting Research Productivity
Boosting Research Productivity
 
Clinical Research Informatics Year-in-Review
Clinical Research Informatics Year-in-ReviewClinical Research Informatics Year-in-Review
Clinical Research Informatics Year-in-Review
 
Academic literature review
Academic literature review Academic literature review
Academic literature review
 
Nursing Leadership Institute Presentation
Nursing Leadership Institute Presentation Nursing Leadership Institute Presentation
Nursing Leadership Institute Presentation
 
Steffen Frederiksen: DATA, DITA, DOCX
Steffen Frederiksen: DATA, DITA, DOCXSteffen Frederiksen: DATA, DITA, DOCX
Steffen Frederiksen: DATA, DITA, DOCX
 
Oral Health Research (lecture 2)
Oral Health Research (lecture 2)Oral Health Research (lecture 2)
Oral Health Research (lecture 2)
 
VET2703 literature searching 2016
VET2703 literature searching 2016VET2703 literature searching 2016
VET2703 literature searching 2016
 
Evidence based practice in application
Evidence based practice in applicationEvidence based practice in application
Evidence based practice in application
 
Evidence based practice in application
Evidence based practice in applicationEvidence based practice in application
Evidence based practice in application
 

More from Anja Pilz

Können Large Language Models helfen, meinen Patienten zu verstehen?
Können Large Language Models helfen, meinen Patienten zu verstehen?Können Large Language Models helfen, meinen Patienten zu verstehen?
Können Large Language Models helfen, meinen Patienten zu verstehen?
Anja Pilz
 
Natural Language Processing for Medical Data
Natural Language Processing for Medical DataNatural Language Processing for Medical Data
Natural Language Processing for Medical Data
Anja Pilz
 
Entity Linking to Wikipedia
Entity Linking to WikipediaEntity Linking to Wikipedia
Entity Linking to Wikipedia
Anja Pilz
 
Biomedical Entity Linking - Introduction, approaches, challenges
Biomedical Entity Linking - Introduction, approaches, challengesBiomedical Entity Linking - Introduction, approaches, challenges
Biomedical Entity Linking - Introduction, approaches, challenges
Anja Pilz
 
A career path in Data Science
A career path in Data ScienceA career path in Data Science
A career path in Data Science
Anja Pilz
 
A Case for automated Tests
A Case for automated TestsA Case for automated Tests
A Case for automated Tests
Anja Pilz
 

More from Anja Pilz (6)

Können Large Language Models helfen, meinen Patienten zu verstehen?
Können Large Language Models helfen, meinen Patienten zu verstehen?Können Large Language Models helfen, meinen Patienten zu verstehen?
Können Large Language Models helfen, meinen Patienten zu verstehen?
 
Natural Language Processing for Medical Data
Natural Language Processing for Medical DataNatural Language Processing for Medical Data
Natural Language Processing for Medical Data
 
Entity Linking to Wikipedia
Entity Linking to WikipediaEntity Linking to Wikipedia
Entity Linking to Wikipedia
 
Biomedical Entity Linking - Introduction, approaches, challenges
Biomedical Entity Linking - Introduction, approaches, challengesBiomedical Entity Linking - Introduction, approaches, challenges
Biomedical Entity Linking - Introduction, approaches, challenges
 
A career path in Data Science
A career path in Data ScienceA career path in Data Science
A career path in Data Science
 
A Case for automated Tests
A Case for automated TestsA Case for automated Tests
A Case for automated Tests
 

Recently uploaded

Senior Software Profiles Backend Sample - Sheet1.pdf
Senior Software Profiles  Backend Sample - Sheet1.pdfSenior Software Profiles  Backend Sample - Sheet1.pdf
Senior Software Profiles Backend Sample - Sheet1.pdf
Vineet
 
SAP BW4HANA Implementagtion Content Document
SAP BW4HANA Implementagtion Content DocumentSAP BW4HANA Implementagtion Content Document
SAP BW4HANA Implementagtion Content Document
newdirectionconsulta
 
06-18-2024-Princeton Meetup-Introduction to Milvus
06-18-2024-Princeton Meetup-Introduction to Milvus06-18-2024-Princeton Meetup-Introduction to Milvus
06-18-2024-Princeton Meetup-Introduction to Milvus
Timothy Spann
 
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
hyfjgavov
 
一比一原版(uob毕业证书)伯明翰大学毕业证如何办理
一比一原版(uob毕业证书)伯明翰大学毕业证如何办理一比一原版(uob毕业证书)伯明翰大学毕业证如何办理
一比一原版(uob毕业证书)伯明翰大学毕业证如何办理
9gr6pty
 
[VCOSA] Monthly Report - Cotton & Yarn Statistics May 2024
[VCOSA] Monthly Report - Cotton & Yarn Statistics May 2024[VCOSA] Monthly Report - Cotton & Yarn Statistics May 2024
[VCOSA] Monthly Report - Cotton & Yarn Statistics May 2024
Vietnam Cotton & Spinning Association
 
A gentle exploration of Retrieval Augmented Generation
A gentle exploration of Retrieval Augmented GenerationA gentle exploration of Retrieval Augmented Generation
A gentle exploration of Retrieval Augmented Generation
dataschool1
 
一比一原版马来西亚博特拉大学毕业证(upm毕业证)如何办理
一比一原版马来西亚博特拉大学毕业证(upm毕业证)如何办理一比一原版马来西亚博特拉大学毕业证(upm毕业证)如何办理
一比一原版马来西亚博特拉大学毕业证(upm毕业证)如何办理
eudsoh
 
一比一原版卡尔加里大学毕业证(uc毕业证)如何办理
一比一原版卡尔加里大学毕业证(uc毕业证)如何办理一比一原版卡尔加里大学毕业证(uc毕业证)如何办理
一比一原版卡尔加里大学毕业证(uc毕业证)如何办理
oaxefes
 
Template xxxxxxxx ssssssssssss Sertifikat.pptx
Template xxxxxxxx ssssssssssss Sertifikat.pptxTemplate xxxxxxxx ssssssssssss Sertifikat.pptx
Template xxxxxxxx ssssssssssss Sertifikat.pptx
TeukuEriSyahputra
 
一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
z6osjkqvd
 
Namma-Kalvi-11th-Physics-Study-Material-Unit-1-EM-221086.pdf
Namma-Kalvi-11th-Physics-Study-Material-Unit-1-EM-221086.pdfNamma-Kalvi-11th-Physics-Study-Material-Unit-1-EM-221086.pdf
Namma-Kalvi-11th-Physics-Study-Material-Unit-1-EM-221086.pdf
22ad0301
 
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
nyvan3
 
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
aguty
 
一比一原版(UofT毕业证)多伦多大学毕业证如何办理
一比一原版(UofT毕业证)多伦多大学毕业证如何办理一比一原版(UofT毕业证)多伦多大学毕业证如何办理
一比一原版(UofT毕业证)多伦多大学毕业证如何办理
exukyp
 
Senior Engineering Sample EM DOE - Sheet1.pdf
Senior Engineering Sample EM DOE  - Sheet1.pdfSenior Engineering Sample EM DOE  - Sheet1.pdf
Senior Engineering Sample EM DOE - Sheet1.pdf
Vineet
 
Data Scientist Machine Learning Profiles .pdf
Data Scientist Machine Learning  Profiles .pdfData Scientist Machine Learning  Profiles .pdf
Data Scientist Machine Learning Profiles .pdf
Vineet
 
reading_sample_sap_press_operational_data_provisioning_with_sap_bw4hana (1).pdf
reading_sample_sap_press_operational_data_provisioning_with_sap_bw4hana (1).pdfreading_sample_sap_press_operational_data_provisioning_with_sap_bw4hana (1).pdf
reading_sample_sap_press_operational_data_provisioning_with_sap_bw4hana (1).pdf
perranet1
 
一比一原版莱斯大学毕业证(rice毕业证)如何办理
一比一原版莱斯大学毕业证(rice毕业证)如何办理一比一原版莱斯大学毕业证(rice毕业证)如何办理
一比一原版莱斯大学毕业证(rice毕业证)如何办理
zsafxbf
 
一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理
一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理
一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理
ywqeos
 

Recently uploaded (20)

Senior Software Profiles Backend Sample - Sheet1.pdf
Senior Software Profiles  Backend Sample - Sheet1.pdfSenior Software Profiles  Backend Sample - Sheet1.pdf
Senior Software Profiles Backend Sample - Sheet1.pdf
 
SAP BW4HANA Implementagtion Content Document
SAP BW4HANA Implementagtion Content DocumentSAP BW4HANA Implementagtion Content Document
SAP BW4HANA Implementagtion Content Document
 
06-18-2024-Princeton Meetup-Introduction to Milvus
06-18-2024-Princeton Meetup-Introduction to Milvus06-18-2024-Princeton Meetup-Introduction to Milvus
06-18-2024-Princeton Meetup-Introduction to Milvus
 
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
 
一比一原版(uob毕业证书)伯明翰大学毕业证如何办理
一比一原版(uob毕业证书)伯明翰大学毕业证如何办理一比一原版(uob毕业证书)伯明翰大学毕业证如何办理
一比一原版(uob毕业证书)伯明翰大学毕业证如何办理
 
[VCOSA] Monthly Report - Cotton & Yarn Statistics May 2024
[VCOSA] Monthly Report - Cotton & Yarn Statistics May 2024[VCOSA] Monthly Report - Cotton & Yarn Statistics May 2024
[VCOSA] Monthly Report - Cotton & Yarn Statistics May 2024
 
A gentle exploration of Retrieval Augmented Generation
A gentle exploration of Retrieval Augmented GenerationA gentle exploration of Retrieval Augmented Generation
A gentle exploration of Retrieval Augmented Generation
 
一比一原版马来西亚博特拉大学毕业证(upm毕业证)如何办理
一比一原版马来西亚博特拉大学毕业证(upm毕业证)如何办理一比一原版马来西亚博特拉大学毕业证(upm毕业证)如何办理
一比一原版马来西亚博特拉大学毕业证(upm毕业证)如何办理
 
一比一原版卡尔加里大学毕业证(uc毕业证)如何办理
一比一原版卡尔加里大学毕业证(uc毕业证)如何办理一比一原版卡尔加里大学毕业证(uc毕业证)如何办理
一比一原版卡尔加里大学毕业证(uc毕业证)如何办理
 
Template xxxxxxxx ssssssssssss Sertifikat.pptx
Template xxxxxxxx ssssssssssss Sertifikat.pptxTemplate xxxxxxxx ssssssssssss Sertifikat.pptx
Template xxxxxxxx ssssssssssss Sertifikat.pptx
 
一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
一比一原版英属哥伦比亚大学毕业证(UBC毕业证书)学历如何办理
 
Namma-Kalvi-11th-Physics-Study-Material-Unit-1-EM-221086.pdf
Namma-Kalvi-11th-Physics-Study-Material-Unit-1-EM-221086.pdfNamma-Kalvi-11th-Physics-Study-Material-Unit-1-EM-221086.pdf
Namma-Kalvi-11th-Physics-Study-Material-Unit-1-EM-221086.pdf
 
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
 
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
一比一原版澳洲西澳大学毕业证(uwa毕业证书)如何办理
 
一比一原版(UofT毕业证)多伦多大学毕业证如何办理
一比一原版(UofT毕业证)多伦多大学毕业证如何办理一比一原版(UofT毕业证)多伦多大学毕业证如何办理
一比一原版(UofT毕业证)多伦多大学毕业证如何办理
 
Senior Engineering Sample EM DOE - Sheet1.pdf
Senior Engineering Sample EM DOE  - Sheet1.pdfSenior Engineering Sample EM DOE  - Sheet1.pdf
Senior Engineering Sample EM DOE - Sheet1.pdf
 
Data Scientist Machine Learning Profiles .pdf
Data Scientist Machine Learning  Profiles .pdfData Scientist Machine Learning  Profiles .pdf
Data Scientist Machine Learning Profiles .pdf
 
reading_sample_sap_press_operational_data_provisioning_with_sap_bw4hana (1).pdf
reading_sample_sap_press_operational_data_provisioning_with_sap_bw4hana (1).pdfreading_sample_sap_press_operational_data_provisioning_with_sap_bw4hana (1).pdf
reading_sample_sap_press_operational_data_provisioning_with_sap_bw4hana (1).pdf
 
一比一原版莱斯大学毕业证(rice毕业证)如何办理
一比一原版莱斯大学毕业证(rice毕业证)如何办理一比一原版莱斯大学毕业证(rice毕业证)如何办理
一比一原版莱斯大学毕业证(rice毕业证)如何办理
 
一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理
一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理
一比一原版(lbs毕业证书)伦敦商学院毕业证如何办理
 

Using machine learning to improve the user experience in online health care communities

  • 1. Using machine learning to improve the user experience in online health care communities Dr. Anja Pilz June 25, 2018
  • 2. Overview 1. Introduction 2. Content Based Recommendations Latent Dirichlet Allocation 3. User Based Recommendations Association Rule Learning 4. Ensemble Model 5. Conclusion & Outlook Dr. Anja Pilz June 25, 2018 1 / 17
  • 3. About DocCheck: Online medical community for health care professionals • seek information in the medicine wiki Flexikon • read the bi-weekly newsletter • share and discuss medical images and videos • buy medical products and supplies in the online shop • exchange with peers: seek help or discuss cases
  • 4. Motivation: Diverse user groups with different intentions and interests • a student might want to learn anatomical topics in some order • a nephrologist has a different focus of interest than a cardiologist • a pharmacist might prefer reading pharma-related news Long-term goal: • find the most relevant and interesting assets for each group to enable targeted mailing & feed personalization
  • 5. DocCheck Recommendation Engine Provide related content for every asset: • Flexikon articles, pictures, videos, shop products, and news Diverse data types: • how is a text/video/picture/shop product relevant? Hybrid Model, content & user driven: • thematic relevance from text • user preference from click journeys Ensemble of two ML techniques: • Latent Dirichlet Allocation • Association Rule Learning
  • 6. Content Based Recommendations Why? • Cold start problem: want to propose related content also for new assets without observed interactions How? • Represent textual content of asset in a Bag-of-Words (BoW) model • Find relevant assets using some similarity function (clustering) But! • Curse of Dimensionality: high dimensional BoW-vectors "all look the same" at some point • BoW model can't handle synonymy or polysemy • vectors for Mammakarzinom and Brustkrebs have no similarity
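The synonymy problem named on this slide can be illustrated with a minimal sketch. The tokenizer, the cosine similarity function, and the three toy documents below are assumptions made for illustration; they are not taken from the talk.

```python
import math
from collections import Counter


def bow(text):
    """Count lowercased whitespace tokens into a bag-of-words vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy documents: a and b are about the same disease but use
# synonyms, a and c share one surface form.
doc_a = bow("Mammakarzinom Therapie")
doc_b = bow("Brustkrebs Behandlung")
doc_c = bow("Mammakarzinom Diagnose")

sim_synonyms = cosine(doc_a, doc_b)  # no shared token, so no overlap
sim_shared = cosine(doc_a, doc_c)    # one shared token out of two

print(sim_synonyms, sim_shared)
```

Because the two synonymous documents share no token, their BoW similarity is zero even though they describe the same condition, which is the gap a topic model is meant to close.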
  • 7. Latent Dirichlet Allocation (LDA) • LDA is a Bayesian probabilistic approach to topic modeling • allows for low-dimensional, continuous representation of documents Generative model: • assumes a fixed number K of underlying (latent) topics in a document collection • each document is a mixture of topics and generated by picking a distribution over the latent topics • given this mixture, the topic of each word is chosen and, given their topics, the words are generated
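The generative story on this slide can be sketched in a few lines. This is only an illustration of the sampling process, not of how a model is trained; the two topics, their word probabilities, and the Dirichlet parameter `alpha` are made-up assumptions.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one sample from a Dirichlet distribution via normalized gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topics, alpha, length, rng):
    """Generate `length` words following the LDA generative process."""
    names = list(topics)
    # 1. Pick the document's mixture over the K latent topics.
    theta = sample_dirichlet(alpha, rng)
    words = []
    for _ in range(length):
        # 2. Choose a topic for this word position from the mixture.
        topic = rng.choices(names, weights=theta)[0]
        # 3. Choose the word from that topic's word distribution.
        vocab = topics[topic]
        word = rng.choices(list(vocab), weights=list(vocab.values()))[0]
        words.append(word)
    return theta, words


# Hypothetical topics with hand-picked word probabilities (K = 2).
topics = {
    "sports": {"ball": 0.5, "goal": 0.3, "team": 0.2},
    "politics": {"vote": 0.5, "party": 0.3, "law": 0.2},
}
rng = random.Random(0)
theta, words = generate_document(topics, alpha=[0.5, 0.5], length=10, rng=rng)
print(theta, words)
```

Training inverts this process: given only the words, it estimates the topics and the per-document mixtures.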
  • 8. Basic Idea of LDA • you know stuff about 20 topics and want to write some text • you decide on some of the topics you want to write about (bit of sports, bit of politics) • you need words to express yourself that are related to these topics, e.g. a round object associated with sports • you pick one, for instance "ball", and write it down
  • 9. Example: Topics from Flexikon & News Topics generated by LDA are clusters of words that often co-occur • drugs: arzneimittel, medikament, präparat, apotheker, tablette, arzt, einnahme, verordnung, compliance • enzymes: enzym, biochemie, aktivität, hemmung, substrat, reaktion, inhibitor, spaltung, protease
  • 10. Example: Topics from Flexikon & News • eyes: auge, augenheilkunde, netzhaut, cornea, hornhaut, linse, glaukom, retina, iris • heart: herz, kardiologie, herzinsuffizienz, ekg, kardiomyopathie, herzmuskelzelle, herzfrequenz
  • 11. Example: Topics from Flexikon & News • vaccine: impfung, impfstoff, masern, immunisierung, vakzine, schutz, röteln, antikörper, polio • diet: ernährung, fleisch, nahrungsmittel, gemüse, nahrung, diät, obst, zucker, lebensmittel
  • 12. Example: Renal Failure • prominent topics: urea excretion, kidneys (and more...) • urea excretion: urin, kreatinin, clearance, nierenfunktion, gfr, niere, harnstoff, niereninsuffizienz • kidneys: niere, dialyse, glomerulonephritis, nephrologie, proteinurie, nierenversagen • inferred topics: topic probability distribution with peaks at the most prominent topics, e.g. p(urea excretion) = 0.4, p(kidneys) = 0.3, ...
  • 13. LDA Workflow Training • fetch corpus: content of all Flexikon and News articles • preprocess the documents: remove stopwords; keep only "medical terms" (MeSH), Named Entities, nouns, ... • feed the documents into MALLET & train the model • run inference on all documents & store individual topic distributions per asset
  • 14. Finding Thematically Related Content New assets • first apply the trained model to infer and store the topic distribution Determine relevant links • fetch stored distribution • find similar topic distributions using some similarity measure • e.g. Kullback-Leibler Divergence of topic distributions
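The lookup described on this slide can be sketched as a ranking by Kullback-Leibler divergence, where a smaller divergence means a closer match. The asset names, the four-topic distributions, and the smoothing constant `eps` are assumptions for illustration. Note that KL divergence is asymmetric, so a symmetric variant may be preferable in practice.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length.

    eps is a small smoothing constant (an assumption of this sketch) that
    avoids division by zero when q has zero entries.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def most_similar(query, stored, top_n=3):
    """Rank stored assets by ascending divergence from the query distribution."""
    ranked = sorted(stored.items(), key=lambda item: kl_divergence(query, item[1]))
    return [name for name, _ in ranked[:top_n]]


# Hypothetical stored topic distributions over four topics.
stored = {
    "dialysis": [0.35, 0.35, 0.20, 0.10],
    "glaucoma": [0.05, 0.05, 0.10, 0.80],
    "vaccination": [0.10, 0.10, 0.70, 0.10],
}
query = [0.40, 0.30, 0.20, 0.10]  # e.g. an article on renal failure

related = most_similar(query, stored)
print(related)
```

For the query above, the asset whose distribution peaks at the same topics ranks first.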
  • 15. Association Rule Learning Basket Analysis • given the items in a basket, what other items is someone likely to buy? DocCheck • given the clicks in a session, which other links is a user likely to click? Motivation • clicks give direct feedback: a click on a link can be read as "this is relevant to me" • no need to find a relatedness measure for pictures and texts
  • 16. Association Rule Learning Identify rules in a database using some measure of confidence • database is the collection of all user journeys • each rule X ⇒ Y is composed of two itemsets X and Y • instead of items in a basket, we use the set of clicks in a session, i.e. {url1, url2, ...}, to learn rules • confidence: the proportion of sessions containing X that also contain Y
  • 17. Association Rule Learning Workflow Training • split user journeys into sessions and form frequent itemsets • learn association rules from these itemsets • store learned rules together with their weight, e.g. X = {url1, url2}, Y = {url3}, conf(X ⇒ Y) = 0.9 Application • new assets: ¯\_(ツ)_/¯ (no sessions observed yet, so no rules) • known asset • fetch all rules containing current asset (URL) • based on the associated confidence, combine their URLs into set of recommended links
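The workflow on the association rule slides can be sketched on toy data. This is a brute-force simplification that only learns single-URL rules {a} ⇒ {b}; it does not implement frequent-itemset mining such as Apriori. The sessions, the `min_conf` threshold, and the function names are assumptions for illustration.

```python
from itertools import permutations


def confidence(sessions, x, y):
    """conf(X => Y): share of sessions containing X that also contain Y."""
    x, y = frozenset(x), frozenset(y)
    with_x = [s for s in sessions if x <= s]
    if not with_x:
        return 0.0
    return sum(1 for s in with_x if y <= s) / len(with_x)


def learn_rules(sessions, min_conf=0.6):
    """Learn pairwise rules {a} => {b} whose confidence reaches min_conf."""
    urls = sorted(set().union(*sessions))
    rules = {}
    for a, b in permutations(urls, 2):
        conf = confidence(sessions, {a}, {b})
        if conf >= min_conf:
            rules[(a, b)] = conf
    return rules


def recommend(rules, url):
    """Return consequents of all rules starting at url, best confidence first."""
    hits = [(b, conf) for (a, b), conf in rules.items() if a == url]
    return [b for b, _ in sorted(hits, key=lambda hit: (-hit[1], hit[0]))]


# Hypothetical click sessions, each a set of visited URLs.
sessions = [
    {"url1", "url2", "url3"},
    {"url1", "url3"},
    {"url1", "url2"},
    {"url2", "url4"},
]
rules = learn_rules(sessions)
print(recommend(rules, "url3"))  # known asset
print(recommend(rules, "url5"))  # new asset without any session
```

An asset that never appears in a session has no rules, which is why the content based model is needed alongside this one.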
  • 18. Ensemble Model Why? • avoid cold-start problem: provide high quality recommendations both for new and known assets • prioritize "labeled" data from user sessions How? • ask both models for a prediction • combine the result in a weighted way, give user driven model (AR) some boost Reinforcement learning: which model returns better predictions? • track which predictions are being clicked • evaluate prevalence & update weights
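One way to read the weighted combination and the click-based weight update is sketched below. The slide does not specify the formulas, so the linear mix, the starting weights, the update rule, and the learning `rate` are all assumptions; both models are assumed to return scores in [0, 1] where higher is better.

```python
def combine(lda_scores, ar_scores, w_lda=0.4, w_ar=0.6, top_n=3):
    """Mix both models' scores linearly; the AR model gets the larger weight."""
    urls = set(lda_scores) | set(ar_scores)
    combined = {
        url: w_lda * lda_scores.get(url, 0.0) + w_ar * ar_scores.get(url, 0.0)
        for url in urls
    }
    # Sort by descending score, breaking ties by URL for a stable order.
    return sorted(combined, key=lambda url: (-combined[url], url))[:top_n]


def update_weights(w_lda, w_ar, clicks_lda, clicks_ar, rate=0.1):
    """Nudge the weights toward the observed share of clicks per model."""
    total = clicks_lda + clicks_ar
    if total == 0:
        return w_lda, w_ar
    target_lda = clicks_lda / total
    new_lda = (1 - rate) * w_lda + rate * target_lda
    return new_lda, 1.0 - new_lda


# Hypothetical scores for three candidate links.
lda_scores = {"a": 0.9, "b": 0.2}
ar_scores = {"b": 0.9, "c": 0.5}
ranking = combine(lda_scores, ar_scores)

# Suppose 30 clicked recommendations came from LDA and 10 from AR.
new_w_lda, new_w_ar = update_weights(0.4, 0.6, clicks_lda=30, clicks_ar=10)
print(ranking, new_w_lda, new_w_ar)
```

A new asset with no AR scores still receives a ranking from the LDA scores alone, which is how the ensemble sidesteps the cold-start problem.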
  • 19. Conclusion & Outlook Combine content based and user generated data • avoid the cold-start problem through the content based model (LDA) • adjust to user behavior through click journeys (AR) • requires initial fine-tuning but little maintenance work Next steps: enhanced retrieval for related pictures and videos • if image/video has no description or interaction: ¯\_(ツ)_/¯ • use image or video analysis tools (work in progress...)
  • 20. Thanks!