Medical Information Retrieval and
its Evaluation: an Overview of CLEF
eHealth Evaluation Task
Lorraine Goeuriot
LIG – Université Grenoble Alpes (France)
lorraine.goeuriot@imag.fr
Presentation Overview
• Medical IR and its Evaluation
• CLEF eHealth
– Context and tasks
– IR tasks description
– Datasets
– Evaluation
– Participation
• Conclusion
Medical Professionals – Web
Search and Data
• Online information search on a regular basis
• Search failure for 2 patients out of 3
• PubMed search takes too long (30+ minutes, against 5 available)
• Knowledge production
constantly growing
• More and more publications
• Varying web access
Patients and general public
• Change in the patient-physician relationship
• Patients more committed - cyberchondria
• How can information quality be guaranteed?
Patients – Web Search and Data
Medical Information Retrieval
• How different is medical IR from general IR?
– Domain-specific search: narrowing down the
applications to improve results for categories of users
– Consequences of poor performance by a medical search system
• Characteristics of medical IR:
– Data: medical/clinical reports, research papers, medical
websites…
– Information need: decision support,
technology/progress watch, education, daily care…
– Evaluation: relevance, readability, trustworthiness,
time
Evaluating Information Retrieval?
Did the user find the information she needed?
How many relevant documents did she get back?
What is a relevant document?
How many irrelevant documents did she get back?
How long before she found the information?
Is she satisfied with the results?
…
• Creation of (artificial) datasets representing a specific search task, in order to compare the efficiency of various systems
• Involving human rating
• Shared with the community to improve IR
Typical IR Evaluation Dataset
• Document collection
• Topic set
• Relevance assessment
Existing Medical IR evaluation tasks
• Existing medical IR evaluation tasks:
– TREC Medical Records 2011, 2012
– TREC 2000 filtering track (OHSUMED corpus)
– TREC Genomics 2003-2007
– ImageCLEFMed 2005-2013
– TREC Clinical Decision Support 2014, 2015
• No patient-centered evaluation task
CLEF eHealth
AP: 72 yo w/ ESRD on HD,
CAD, HTN, asthma, p/w
significant hyperkalemia &
associated arrythmias.
CLEF eHealth Tasks
2013
• Task 1: Named entity
recognition in clinical
text
• Task 2: acronym
normalization in clinical
text
• Task 3: User-centred
health IR
2014
• Task 1: Visual-Interactive
Search and Exploration
of eHealth Data
• Task 2: Information
extraction from clinical
text
• Task 3: User-centred
health IR
2015
• Task 1a: Clinical speech recognition from nurses handover
• Task 1b: Clinical named entity recognition in French
• Task 2: User-centred health IR
IR Evaluation Task Scenario
[Figure: task scenario for 2013-2014 and task scenario for 2015]
IR Evaluation Task over the years
• Goal
– 2013-2014: Help laypersons better understand medical reports
– 2015: Laypersons checking their symptoms
• Topics
– 2013: 55 EN topics built from discharge summaries
– 2014: 55 EN topics + translation into CZ, DE, FR
– 2015: 67 EN topics built from images + translation into AR, CZ, DE, FA, FR, IT, PT
• Documents
– 2013-2015: Medical document collection provided by the Khresmoi project
• Relevance assessment
– 2013-2014: Manual evaluation of the relevance of documents
– 2015: Manual evaluation of the relevance and readability of documents
Document Collection
• Web crawl of health-related documents (~ 1M)
• Made available through the Khresmoi project
(khresmoi.eu)
• Target: general public and medical professionals
• Broad range of medical topics covered
• Content:
• Health On the Net (HON) Foundation certified
websites (~60%)
• Various well-known medical websites: DrugBank,
Diagnosia, TRIP answers, etc. (~40%)
Topics & context
• 2013: Manual creation from randomly selected disorder annotations in the discharge summary (DS), which provides the context
• 2014: Manual creation from manually identified main disorders in the DS (context)
• 2015: Manual creation from images describing a medical problem (context)
Topics - Examples
<topic> <id>qtest3</id>
<discharge_summary>02115-010823-DISCHARGE_SUMMARY.txt</discharge_summary>
<title>Asystolic arrest</title>
<desc>what is asystolic arrest</desc>
<narr>asystolic arrest and why does it cause death</narr>
<profile>A 87 year old woman with a stroke and asystolic arrest dies and
the daughter wants to know about asystolic arrest and what it
means.</profile>
</topic>
2013-2014
<topic> <id>clef2015.test.15</id>
<query>weird brown patches on skin</query>
</topic>
2015
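As an illustration, topics in either format could be loaded with Python's standard `xml.etree.ElementTree` module. This is only a minimal sketch: the function name `parse_topic` and the dictionary it returns are assumptions made for this example, not part of the task's official tooling, and the `<profile>` field of the first topic is left out to keep the sample short.

```python
import xml.etree.ElementTree as ET

# Topic strings copied from the examples above (the <profile> field is omitted).
TOPIC_2013 = """<topic> <id>qtest3</id>
<discharge_summary>02115-010823-DISCHARGE_SUMMARY.txt</discharge_summary>
<title>Asystolic arrest</title>
<desc>what is asystolic arrest</desc>
<narr>asystolic arrest and why does it cause death</narr>
</topic>"""

TOPIC_2015 = """<topic> <id>clef2015.test.15</id>
<query>weird brown patches on skin</query>
</topic>"""


def parse_topic(xml_text):
    """Return a dict mapping each child tag of <topic> to its stripped text."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


if __name__ == "__main__":
    for raw in (TOPIC_2013, TOPIC_2015):
        topic = parse_topic(raw)
        # 2013-2014 topics carry a <title>, 2015 topics carry a <query>.
        print(topic["id"], "->", topic.get("title") or topic.get("query"))
```

Returning a plain dictionary keeps the same code usable for both formats, since the 2015 topics only carry an id and a query.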
Datasets - Summary
• Provided to the participants:
• Document collection
• Discharge summaries (optional) [2013-2014]
• Training set:
– 5 queries + qrels [2013]
– 5 queries (+ translation) + qrels [2014-2015]
• Test set:
– 50 queries [2013]
– 50 queries (+ translation) [2014]
– 62 queries (+ translation) [2015]
Guidelines for Submissions
2013-2014: submission of up to 7 runs (per language):
• Run 1 (mandatory) - team baseline: only title and description fields, no external resources.
• Runs 2-4 (optional): any experiment WITH the DS.
• Runs 5-7 (optional): any experiment WITHOUT the DS.
2015: submission of up to 10 ranked runs (per language):
• Run 1 (mandatory): baseline run
• Runs 2-10: any experiment with any external resource
Relevance Assessment
• Manual relevance assessment conducted by medical professionals and IR experts
• 4-point scale assessment mapped to a binary scale:
– {0: non relevant, 1: on topic but unreliable} → non relevant
– {2: somewhat relevant, 3: relevant} → relevant
• 4-point scale for NDCG and 2-point scale for precision
• [2015] Manual assessment of the readability of the documents conducted by the same assessors on a 4-point scale
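The mapping from the 4-point scale to the binary scale can be sketched in a few lines of Python. The function names, the nested-dictionary layout of the judgments and the document ids below are assumptions made for illustration. Only the grade-to-label mapping itself comes from the slide.

```python
# Grades 0 and 1 count as non relevant, grades 2 and 3 as relevant (see above).
RELEVANT_THRESHOLD = 2


def to_binary(grade):
    """Map a 4-point relevance grade (0-3) to a binary label (0 or 1)."""
    if grade not in (0, 1, 2, 3):
        raise ValueError(f"unexpected grade: {grade}")
    return 1 if grade >= RELEVANT_THRESHOLD else 0


def binarize_qrels(graded_qrels):
    """Convert {topic_id: {doc_id: grade}} into {topic_id: {doc_id: 0 or 1}}."""
    return {
        topic: {doc: to_binary(grade) for doc, grade in docs.items()}
        for topic, docs in graded_qrels.items()
    }


# Hypothetical judgments, for illustration only (the doc ids are made up).
graded = {"qtest3": {"docA": 3, "docB": 1, "docC": 2, "docD": 0}}
binary = binarize_qrels(graded)
print(binary)
```

Keeping both the graded and the binary judgments around lets the graded ones feed NDCG and the binary ones feed precision.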
Relevance Assessment - Pools
• Training set: merged top 30 ranked documents from Vector Space Model and Okapi BM25
• Test set, 2013-2014: merged top 10 documents from participants' baseline run, the two highest-priority runs with DS and the two highest-priority runs without DS
• Test set, 2015: merged top 10 documents from participants' three highest-priority runs
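Pooling as described above amounts to taking the union of the top-ranked documents of the selected runs, topic by topic. The sketch below assumes each run is a dictionary from topic id to a ranked list of document ids. The function name, the data layout and the sample runs are hypothetical.

```python
def build_pool(runs, depth=10):
    """Merge the top `depth` documents of each run into one set per topic.

    `runs` is a list of runs; each run maps a topic id to a ranked list of doc ids.
    """
    pool = {}
    for run in runs:
        for topic, ranking in run.items():
            pool.setdefault(topic, set()).update(ranking[:depth])
    return pool


# Hypothetical runs (made-up doc ids), pooled at depth 2 to keep the example short.
run_a = {"t1": ["d1", "d2", "d3"]}
run_b = {"t1": ["d2", "d4", "d5"]}
pool = build_pool([run_a, run_b], depth=2)
print(sorted(pool["t1"]))  # ['d1', 'd2', 'd4']
```

Documents outside the pool stay unjudged, which is why evaluation code usually treats unjudged documents as non relevant.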
Evaluation Metrics
• Classical TREC evaluation: P@5, P@10,
NDCG@5, NDCG@10, MAP
• Ranking based on P@10
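For readers unfamiliar with these measures, here is a minimal Python sketch of P@k, average precision (MAP is its mean over all topics) and NDCG@k. It is not the official evaluation tool. The NDCG variant shown (gain equal to the relevance grade, log2 rank discount) is one common formulation and is an assumption here, as are the function names and the sample judgments.

```python
import math


def precision_at_k(ranking, binary_qrels, k):
    """Fraction of the top k documents judged relevant (unjudged = non relevant)."""
    return sum(binary_qrels.get(doc, 0) for doc in ranking[:k]) / k


def average_precision(ranking, binary_qrels):
    """Mean of the precision values at each rank where a relevant document appears."""
    total_relevant = sum(binary_qrels.values())
    if total_relevant == 0:
        return 0.0
    hits, score = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if binary_qrels.get(doc, 0):
            hits += 1
            score += hits / rank
    return score / total_relevant


def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))


def ndcg_at_k(ranking, graded_qrels, k):
    """DCG of the top k documents divided by the DCG of an ideal ranking."""
    gains = [graded_qrels.get(doc, 0) for doc in ranking[:k]]
    ideal = sorted(graded_qrels.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical judgments and ranking for a single topic (doc ids are made up).
graded = {"d1": 3, "d2": 0, "d3": 2, "d5": 3}
binary = {doc: 1 if grade >= 2 else 0 for doc, grade in graded.items()}
ranking = ["d1", "d2", "d3", "d4"]

print(precision_at_k(ranking, binary, 2))
print(average_precision(ranking, binary))
print(ndcg_at_k(ranking, graded, 4))
```

Precision and average precision use the binary judgments, while NDCG uses the graded ones, matching the split between the 2-point and 4-point scales.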
Participants and Runs
         Monolingual IR       Multilingual IR
Year     # teams   # runs     # teams   # runs
2013     9         48         --        --
2014     14        62         2         24
2015     12        92         1         35
Baselines
2013:
• JSoup
• Okapi stop words & Porter stemmer
• Lucene BM25
2014:
• Indri HTML parser
• Okapi stop words & Krovetz stemmer
• Indri BM25, tf.idf, LM
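To give a feel for what a BM25 baseline computes, the following Python sketch scores tokenized documents against a query. It is a toy version under stated assumptions: one common BM25 formulation with k1 = 1.2 and b = 0.75, whitespace tokenization, and no stop word removal or stemming. It is not the Lucene or Indri implementation used for the baselines, and the sample documents are made up.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every document in `docs` (lists of tokens) against `query` tokens."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term frequency saturation with document length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Made-up documents for illustration only.
docs = [
    "asystolic arrest is a form of cardiac arrest".split(),
    "brown patches on the skin can have many causes".split(),
    "cardiac arrest requires immediate care".split(),
]
scores = bm25_scores("asystolic arrest".split(), docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(best, scores)
```

In this toy collection the first document ranks highest because it contains both query terms, and the second scores zero because it contains neither.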
2013 Participants P@10 (best run)
[Figure: bar chart of P@10 (axis 0 to 0.6) for Team-Mayo (2), Team-AEHRC (5), Team-MEDINFO (1), Team-UOG (5), Team-THCIB (5), Team-KC (1), Team-UTHealth (1), Team-QUT (2), Team-OHSU (5). Legend: BM25, BM25 + PRF.]
2014 Task 3a P@10 (best run)
[Figure: bar chart of P@10 (axis 0 to 0.8) for GRIUM_EN_Run.5, SNUMEDINFO_EN_Run.2, KISTI_EN_Run.2, IRLabDAIICT_EN_Run.1, UIOWA_EN_Run.1, baseline.dir, DEMIR_EN_Run.6, RePaLi_EN_Run.5, NIJM_EN_Run.2, YORKU_EN_Run.5, UHU_EN_Run.5, COMPL_EN_Run.5, ERIAS_EN_Run.6, miracl_en_run.1, CUNI_EN_RUN.5.]
Participants P@10 (2013 and 2014)
[Figure: P@10 (axis 0 to 0.8). Legend: 2013, 2014, BM25 2013, LM Dirichlet smoothing 2014.]
2013 Participants' Results: Baseline vs best run
[Figure: bar chart (axis 0 to 0.6) comparing Baseline and Best run for Team-Mayo, Team-AEHRC, Team-MEDINFO, Team-UOG, Team-THCIB, Team-KC, Team-UTHealth, Team-QUT, Team-OHSU.]
What Worked Well?
Team-Mayo:
• Markov Random Field model to model query term dependency
• Query expansion (QE) using external collections
• Combination of indexing techniques + re-ranking
Team-AEHRC:
• Language Models with Dirichlet smoothing
• QE with spelling correction and acronym expansion
Team-MEDINFO: Query Likelihood Model
BM25 Baseline
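Several of the approaches above rely on language models with Dirichlet smoothing. As a rough illustration, the sketch below ranks documents by query likelihood with a Dirichlet prior. It is a simplified version under assumptions: whitespace tokens, a tiny value of mu chosen to suit the toy collection, and query terms unseen in the collection simply skipped. The documents are made up and this is not any team's actual system.

```python
import math
from collections import Counter


def dirichlet_score(query, doc, collection_tf, collection_len, mu):
    """Log query likelihood of `doc` (a token list) with Dirichlet prior smoothing."""
    tf = Counter(doc)
    score = 0.0
    for term in query:
        p_coll = collection_tf.get(term, 0) / collection_len
        if p_coll == 0:
            continue  # term unseen in the whole collection: skipped in this sketch
        # Smoothed estimate: (tf + mu * P(term | collection)) / (|doc| + mu)
        score += math.log((tf[term] + mu * p_coll) / (len(doc) + mu))
    return score


# Made-up documents for illustration only.
docs = [
    "asystolic arrest is a form of cardiac arrest".split(),
    "brown patches on the skin can have many causes".split(),
    "cardiac arrest requires immediate care".split(),
]
collection_tf = Counter(term for d in docs for term in d)
collection_len = sum(len(d) for d in docs)

query = "asystolic arrest".split()
lm_scores = [dirichlet_score(query, d, collection_tf, collection_len, mu=10) for d in docs]
lm_ranking = sorted(range(len(docs)), key=lambda i: lm_scores[i], reverse=True)
print(lm_ranking, lm_scores)
```

Smoothing gives every document a non-zero probability for each query term, so a document that misses a term is penalized rather than excluded.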
2014 Participants' Results: Baseline vs best run
[Figure: bar chart (axis 0 to 0.8) comparing Baseline and Best run for COMPL, CUNI, DEMIR, ERIAS, GRIUM, IRLabDAIICT, KISTI, miracl_en_run.1, NIJM, RePaLi, SNUMEDINFO, UHU, UIOWA, YORKU.]
What Worked Well?
Team-GRIUM:
• Hybrid IR approach (text-based and concept-based)
• Language models
• Query expansion based on mutual information
Team-SNUMEDINFO:
• Language Models with Dirichlet smoothing
• QE with medical concepts
• Google translate
Team-KISTI:
• Language models
• Various QE approaches
Task 3b Results
[Figure: bar chart (axis 0 to 0.8) of results for CS, DE and FR queries. Legend: CUNI, SNUMEDINFO.]
2013 - Use of Discharge Summaries
[Figure: bar chart (axis 0 to 0.6) for Team-Mayo, Team-Medinfo, Team-THCIB, Team-KC, Team-QUT. Legend: With DS, Without DS, Baseline.]
How were DS used?
- Result re-ranking based on concepts extracted from
queries, relevant documents and DS (Team-Mayo)
- Query expansion:
* Filtering of non-relevant expansion terms/concepts
(Team-MEDINFO)
* Expansion with all concepts from query and DS (Team-THCIB)
* Expansion with concepts identified in relevant passages
of the DS (Team-KC)
* Query refinement (Team-TOPSIG)
2014 - Use of Discharge Summaries
[Figure: bar chart (axis 0 to 0.8) for IRLabDAIICT, KISTI, NIJM. Legend: DS, No DS.]
How Were DS Used?
• Query expansion:
– Expansion using MetaMap, with expansion candidates filtered using the DS (Team-SNUMEDINFO)
– Expansion with abbreviations and DS combined with pseudo-relevance feedback (Team-KISTI)
– Expansion with MeSH terminology and DS (Team-IRLabDAIICT)
– Expansion with terms from the DS (Team-Nijmegen)
Presentation Overview
• Medical IR and its Evaluation
• CLEF eHealth
– Context and tasks
– IR tasks description
– Datasets
– Evaluation
– Participation
– Further analysis
• Conclusion
Medical Queries Complexity
• Query complexity = number of medical concepts/entities it contains
– radial neck fracture and healing time
– facial cuts and scar tissue
– nausea and vomiting and hematemesis
• Dataset:
– 50 queries from CLEF eHealth 2013 (patient queries)
– Runs from 9 teams
• Impact of complexity on system performance
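Counting the concepts in a query can be sketched with a greedy longest-match lookup against a lexicon. Everything here beyond the three example queries is hypothetical: the toy lexicon is made up for this sketch, and the resulting counts depend entirely on it. A real study would rely on a medical terminology or a concept annotation tool.

```python
# Toy lexicon, made up for this sketch; the counts below depend entirely on it.
TOY_LEXICON = {
    "radial neck fracture", "facial cuts", "scar tissue",
    "nausea", "vomiting", "hematemesis",
}


def count_concepts(query, lexicon=TOY_LEXICON, max_len=4):
    """Count lexicon entries in `query` using greedy longest-match over tokens."""
    tokens = query.lower().split()
    count, i = 0, 0
    while i < len(tokens):
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if " ".join(tokens[i:i + n]) in lexicon:
                count += 1
                i += n
                break
        else:
            i += 1  # no concept starts at this token
    return count


queries = [
    "radial neck fracture and healing time",
    "facial cuts and scar tissue",
    "nausea and vomiting and hematemesis",
]
complexity = {q: count_concepts(q) for q in queries}
for q, c in complexity.items():
    print(c, q)
```

With per-query counts like these, the scores of each run could be grouped by complexity level to look at how performance varies.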
Conclusion
• 3 successful years running CLEF eHealth
• Datasets are publicly available for research purposes
• Used for research by organizers, participants,
and other groups
• Building a community – evaluation tasks, workshop@SIGIR, special issue of JIR
For More Details
CLEF eHealth Lab overview:
Suominen et al. (2013). Overview of the ShARe/CLEF eHealth
Evaluation Lab 2013. In CLEF 2013 Proceedings.
Kelly et al. (2014). Overview of the ShARe/CLEF eHealth
Evaluation Lab 2014. In CLEF 2014 Proceedings.
CLEF eHealth IR task overview:
Goeuriot et al. (2013). ShARe/CLEF eHealth Evaluation Lab 2013, Task 3: Information Retrieval to Address Patients’ Questions when Reading Clinical Reports. In CLEF 2013 Working Notes.
Goeuriot et al. (2014). ShARe/CLEF eHealth Evaluation Lab 2014, Task 3: User-centred health information retrieval. In CLEF 2014 Working Notes.
Follow us!
http://sites.google.com/site/clefehealth2015
Google Groups: clef-ehealth-evaluation-lab-information
@clefehealth
Join the party in Toulouse: http://clef2015.clef-initiative.eu/CLEF2015/conferenceRegistration.php
Consortium
• Lab chairs: Lorraine Goeuriot, Liadh Kelly
• Task 1: Hanna Suominen, Leif Hanlen, Gareth
Jones, Liyuan Zhou, Aurélie Névéol, Cyril
Grouin, Thierry Hamon, Pierre Zweigenbaum
• Task 2: Joao Palotti, Guido Zuccon, Allan
Hanbury, Mihai Lupu, Pavel Pecina
Thank you! Questions?
Task 3a - Topic Generation Process
Discharge summary excerpt:
Discharge Medications:
1. Aspirin 81 mg Tablet, Delayed Release (E.C.) Sig: One (1) Tablet, Delayed Release (E.C.) PO DAILY (Daily). Disp:*30 Tablet, Delayed Release (E.C.)(s)* Refills:*0*
2. Docusate Sodium 100 mg Capsule Sig: One (1) Capsule PO BID (2 times a day). Disp:*60 Capsule(s)* Refills:*0*
3. Levothyroxine Sodium 200 mcg Tablet Sig: One (1) Tablet PO DAILY (Daily).
Discharge Disposition:
Extended Care
Facility:
[**Hospital 5805**] Manor - [**Location (un) 348**]
Discharge Diagnosis:
Coronary artery disease.
s/p CABG
post op atrial fibrillation
Resulting query: What is coronary heart disease?
Participants Approaches
Histor y of HAM Radio presentation slidevu2urc
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 

Recently uploaded (20)

08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Medical Information Retrieval and its Evaluation: an Overview of CLEF eHealth Evaluation Task

  • 1. Medical Information Retrieval and its Evaluation: an Overview of CLEF eHealth Evaluation Task Lorraine Goeuriot LIG – Université Grenoble Alpes (France) lorraine.goeuriot@imag.fr
  • 2. Presentation Overview • Medical IR and its Evaluation • CLEF eHealth – Context and tasks – IR tasks description – Datasets – Evaluation – Participation • Conclusion
  • 4. Medical Professionals – Web Search and Data • Online information search on a regular basis • Search failure for 2 patients out of 3 • PubMed search: very long (30+ minutes against 5 available) • Knowledge production constantly growing • More and more publications • Varying web access
  • 5. Medical Professionals – Web Search and Data
  • 6. Patients and general public • Change in the patient-physician relationship • Patients more committed - cyberchondria • How can information quality be guaranteed?
  • 7. Patients – Web Search and Data
  • 8. Patients – Web Search and Data
  • 9. Patients – Web Search and Data
  • 10. Medical Information Retrieval • How different is medical IR from general IR? – Domain-specific search: narrowing down the applications to improve results for categories of users – Consequences of poor performance of a medical search system • Characteristics of medical IR: – Data: medical/clinical reports, research papers, medical websites… – Information need: decision support, technology/progress watch, education, daily care… – Evaluation: relevance, readability, trustworthiness, time
  • 11. Evaluating Information Retrieval? Did the user find the information she needed? How many relevant documents did she get back? What is a relevant document? How many irrelevant documents did she get back? How long before she found the information? Is she satisfied with the results? … • Creation of (artificial) datasets representing a specific search task, in order to compare the efficiency of various systems • Involving human rating • Shared with the community to improve IR
  • 12. Typical IR Evaluation Dataset: Document Collection, Topic Set, Relevance Assessment
  • 13. Existing Medical IR evaluation tasks: – TREC Medical Records 2011, 2012 – TREC 2000 filtering track (corpus OHSUMED) – TREC genomics 2003-2007 – ImageCLEFMed 2005-2013 – TREC clinical decision support 2014, 2015 • No patient-centered evaluation task
  • 15. CLEF eHealth AP: 72 yo w/ ESRD on HD, CAD, HTN, asthma, p/w significant hyperkalemia & associated arrythmias.
  • 16. CLEF eHealth Tasks
      – 2013: Task 1: Named entity recognition in clinical text; Task 2: Acronym normalization in clinical text; Task 3: User-centred health IR
      – 2014: Task 1: Visual-Interactive Search and Exploration of eHealth Data; Task 2: Information extraction from clinical text; Task 3: User-centred health IR
      – 2015: Task 1a: Clinical speech recognition from nurses' handover; Task 1b: Clinical named entity recognition in French; Task 2: User-centred health IR
  • 18. IR Evaluation Task Scenario (2013-2014 and 2015)
  • 19. IR Evaluation Task over the years
      – Goal: 2013-2014: Help laypersons better understand medical reports; 2015: Layperson checking their symptoms
      – Topics: 2013: 55 EN topics built from discharge summaries; 2014: 55 EN topics + translation in CZ, DE, FR; 2015: 67 EN topics built from images + translation in AR, CZ, DE, FA, FR, IT, PT
      – Documents: 2013-2015: Medical document collection provided by Khresmoi project
      – Relevance assessment: 2013-2014: Manual evaluation of relevance of documents; 2015: Manual evaluation of relevance and readability of documents
  • 21. Document Collection • Web crawl of health-related documents (~ 1M) • Made available through the Khresmoi project (khresmoi.eu) • Target: general public and medical professionals • Broad range of medical topics covered • Content: – Health On the Net (HON) Foundation certified websites (~60%) – Various well-known medical websites: DrugBank, Diagnosia, TRIP answers, etc. (~40%)
  • 22. Topics & context
      – 2013: Manual creation from randomly selected annotation of disorder in the discharge summary (DS) (context)
      – 2014: Manual creation from manually identified main disorders in the DS (context)
      – 2015: Manual creation from images describing a medical problem (context)
  • 23. Topics - Examples
      2013-2014:
      <topic>
        <id>qtest3</id>
        <discharge_summary>02115-010823-DISCHARGE_SUMMARY.txt</discharge_summary>
        <title>Asystolic arrest</title>
        <desc>what is asystolic arrest</desc>
        <narr>asystolic arrest and why does it cause death</narr>
        <profile>A 87 year old woman with a stroke and asystolic arrest dies and the daughter wants to know about asystolic arrest and what it means.</profile>
      </topic>
      2015:
      <topic>
        <id>clef2015.test.15</id>
        <query>weird brown patches on skin</query>
      </topic>
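A minimal sketch of how topics in the 2015 format shown above could be read with Python's standard `xml.etree.ElementTree`. The enclosing `<topics>` root element and the `load_topics` helper are assumptions made for illustration; the slides only show a single `<topic>` element.

```python
import xml.etree.ElementTree as ET

# Hypothetical file content: the <topics> wrapper is an assumption,
# the inner <topic> element is the 2015 example from the slide.
SAMPLE = """<topics>
  <topic>
    <id>clef2015.test.15</id>
    <query>weird brown patches on skin</query>
  </topic>
</topics>"""


def load_topics(xml_text):
    """Return a dict mapping topic id to query text."""
    root = ET.fromstring(xml_text)
    topics = {}
    for topic in root.iter("topic"):
        topic_id = topic.findtext("id", default="").strip()
        query = topic.findtext("query", default="").strip()
        topics[topic_id] = query
    return topics


topics = load_topics(SAMPLE)
```

For the 2013-2014 format, the same loop could read `title`, `desc`, `narr` and `profile` instead of `query`.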
  • 24. Datasets - Summary • Provided to the participants: – Document collection – Discharge summaries (optional) [2013-2014] – Training set: 5 queries + qrels [2013]; 5 queries (+ translation) + qrels [2014-2015] – Test set: 50 queries [2013]; 50 queries (+ translation) [2014]; 62 queries (+ translation) [2015]
  • 26. Guidelines for Submissions
      – 2013-2014: Submission of up to 7 runs (per language): Run 1 (mandatory) - team baseline: only title and description fields, no external resources. Runs 2-4 (optional): any experiment WITH the DS. Runs 5-7 (optional): any experiment WITHOUT the DS.
      – 2015: Submission of up to 10 ranked runs (per language): Run 1 (mandatory): baseline run. Runs 2-10: any experiment with any external resource.
  • 27. Relevance Assessment • Manual relevance assessment conducted by medical professionals and IR experts • 4-point scale assessment mapped to a binary scale – {0: non relevant, 1: on topic but unreliable} → non relevant – {2: somewhat relevant, 3: relevant} → relevant • 4-point scale for NDCG and 2-point scale for precision • [2015] Manual assessment of the readability of the documents conducted by the same assessors on a 4-point scale
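The mapping from the 4-point scale to the binary scale described on this slide can be sketched as follows. The function name `to_binary` and the dictionary layout of the judgments are illustrative assumptions, not the organizers' tooling.

```python
def to_binary(grade):
    """Map a 4-point relevance grade to the binary scale.

    0 (non relevant) and 1 (on topic but unreliable) -> 0 (non relevant)
    2 (somewhat relevant) and 3 (relevant)           -> 1 (relevant)
    """
    if grade not in (0, 1, 2, 3):
        raise ValueError(f"unexpected relevance grade: {grade!r}")
    return 1 if grade >= 2 else 0


# Hypothetical graded judgments for one topic: document id -> grade.
graded = {"doc_a": 0, "doc_b": 1, "doc_c": 2, "doc_d": 3}

# Binary judgments, as used for precision-based measures.
binary = {doc: to_binary(grade) for doc, grade in graded.items()}
```

The graded dictionary would feed NDCG, while the binary one would feed precision and MAP.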
  • 28. Relevance Assessment - Pools
      – Training set (2013-2015): merged top 30 ranked documents from Vector Space Model and Okapi BM25
      – Test set (2013-2014): merged top 10 documents from participants' baseline run, the highest two priority runs with DS and highest two without DS
      – Test set (2015): merged top 10 documents from participants' three highest priority runs
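Pooling as described on this slide (merging the top-ranked documents of several runs into one set per topic for assessment) can be sketched as below. The run representation (a dict from topic id to a ranked list of document ids) and the name `build_pool` are assumptions made for illustration.

```python
def build_pool(runs, depth=10):
    """Merge the top `depth` documents of every run into one pool per topic.

    `runs` is a list of runs; each run maps a topic id to a ranked list
    of document ids (best first). Duplicates across runs are merged.
    """
    pool = {}
    for run in runs:
        for topic_id, ranking in run.items():
            pool.setdefault(topic_id, set()).update(ranking[:depth])
    return pool


# Two hypothetical runs over the same topic, pooled at depth 2.
run_1 = {"t1": ["d1", "d2", "d3"]}
run_2 = {"t1": ["d2", "d4", "d5"]}
pool = build_pool([run_1, run_2], depth=2)
```

Only documents in the pool are judged; documents outside it are typically treated as non relevant when computing measures.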
  • 29. Evaluation Metrics • Classical TREC evaluation: P@5, P@10, NDCG@5, NDCG@10, MAP • Ranking based on P@10
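A minimal sketch of the measures listed on this slide, for a single topic. It assumes one common NDCG formulation (gain equal to the relevance grade, discount log2(rank + 1)); official evaluation tools may differ in details, so these functions are illustrative only. MAP is the mean of `average_precision` over all topics.

```python
import math


def precision_at_k(ranking, relevant, k):
    """Fraction of the top k documents that are relevant (binary)."""
    return sum(1 for doc in ranking[:k] if doc in relevant) / k


def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))


def ndcg_at_k(ranking, grades, k):
    """NDCG@k using graded judgments (document id -> grade)."""
    gains = [grades.get(doc, 0) for doc in ranking[:k]]
    ideal = sorted(grades.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


def average_precision(ranking, relevant):
    """Average precision for one topic (binary judgments)."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


# Hypothetical judgments and ranking for one topic.
grades = {"d1": 3, "d3": 2, "d9": 1}
relevant = {doc for doc, g in grades.items() if g >= 2}  # binary mapping
ranking = ["d1", "d2", "d3", "d4", "d5"]

p5 = precision_at_k(ranking, relevant, 5)
ap = average_precision(ranking, relevant)
ndcg5 = ndcg_at_k(ranking, grades, 5)
```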
  • 31. Participants and Runs
      – 2013: Monolingual IR: 9 teams, 48 runs; Multilingual IR: --
      – 2014: Monolingual IR: 14 teams, 62 runs; Multilingual IR: 2 teams, 24 runs
      – 2015: Monolingual IR: 12 teams, 92 runs; Multilingual IR: 1 team, 35 runs
  • 32. Baselines • 2013: JSoup; Okapi stop words & Porter stemmer; Lucene BM25 • 2014: Indri HTML parser; Okapi stop words & Krovetz stemmer; Indri BM25, tf.idf, LM
  • 33. [Figure: 2013 Participants, P@10 (best run): Team-Mayo (2), Team-AEHRC (5), Team-MEDINFO (1), Team-UOG (5), Team-THCIB (5), Team-KC (1), Team-UTHealth (1), Team-QUT (2), Team-OHSU (5); legend: BM25, BM25 + PRF]
  • 34. [Figure: 2014 Task 3a, P@10 (best run): GRIUM_EN_Run.5, SNUMEDINFO_EN_Run.2, KISTI_EN_Run.2, IRLabDAIICT_EN_Run.1, UIOWA_EN_Run.1, baseline.dir, DEMIR_EN_Run.6, RePaLi_EN_Run.5, NIJM_EN_Run.2, YORKU_EN_Run.5, UHU_EN_Run.5, COMPL_EN_Run.5, ERIAS_EN_Run.6, miracl_en_run.1, CUNI_EN_RUN.5]
  • 35. [Figure: Participants P@10 (2013 and 2014); legend: BM25 (2013), LM Dirichlet smoothing (2014)]
  • 36. [Figure: 2013 Participants' Results, Baseline vs best run: Team-Mayo, Team-AEHRC, Team-MEDINFO, Team-UOG, Team-THCIB, Team-KC, Team-UTHealth, Team-QUT, Team-OHSU]
  • 37. What Worked Well? Team-Mayo: • Markov Random Field model to capture query term dependency • Query expansion (QE) using external collections • Combination of indexing techniques + re-ranking Team-AEHRC: • Language Models with Dirichlet smoothing • QE with spelling correction and acronym expansion Team-MEDINFO: Query Likelihood Model BM25 Baseline
  • 39. What Worked Well? Team-GRIUM: • Hybrid IR approach (text-based and concept-based) • Language models • Query expansion based on mutual information Team-SNUMEDINFO: • Language Models with Dirichlet smoothing • QE with medical concepts • Google Translate Team-KISTI: • Language models • Various QE approaches
  • 40. [Figure: Task 3b Results for CS, DE, FR; legend: CUNI, SNUMEDINFO]
  • 41. [Figure: 2013 - Use of Discharge Summaries: Team-Mayo, Team-Medinfo, Team-THCIB, Team-KC, Team-QUT; legend: With DS, Without DS, Baseline]
  • 42. How were DS used? • Result re-ranking based on concepts extracted from queries, relevant documents and DS (Team-Mayo) • Query expansion: – Filtering of non-relevant expansion terms/concepts (Team-MEDINFO) – Expansion with all concepts from query and DS (Team-THCIB) – Expansion with concepts identified in relevant passages of the DS (Team-KC) – Query refinement (Team-TOPSIG)
  • 43. [Figure: 2014 - Use of Discharge Summaries: IRLabDAIICT, KISTI, NIJM; legend: DS, No DS]
  • 44. How Were DS Used? • Query expansion: – Expansion using MetaMap, with expansion candidates filtered using the DS (Team-SNUMEDINFO) – Expansion with abbreviations and DS combined with pseudo-relevance feedback (Team-KISTI) – Expansion with MeSH terminology and DS (Team-IRLABDAIICT) – Expansion with terms from the DS (Team-Nijmegen)
  • 45. Presentation Overview • Medical IR and its Evaluation • CLEF eHealth – Context and tasks – IR tasks description – Datasets – Evaluation – Participation – Further analysis • Conclusion
  • 46. Medical Query Complexity • Query complexity = number of medical concepts/entities the query contains – radial neck fracture and healing time – facial cuts and scar tissue – nausea and vomiting and hematemesis • Dataset: – 50 queries from CLEF eHealth 2013 (patient queries) – Runs from 9 teams • Impact of query complexity on system performance
  • 49. Conclusion • 3 successful years running CLEF eHealth • Datasets are publicly available for research purposes • Used for research by organizers, participants, and other groups • Building a community – evaluation tasks, workshop@SIGIR, special edition of JIR
  • 50. For More Details CLEF eHealth Lab overview: Suominen et al. (2013). Overview of the ShARe/CLEF eHealth Evaluation Lab 2013. In CLEF 2013 Proceedings. Kelly et al. (2014). Overview of the ShARe/CLEF eHealth Evaluation Lab 2014. In CLEF 2014 Proceedings. CLEF eHealth IR task overview: Goeuriot et al. (2013). ShARe/CLEF eHealth Evaluation Lab 2013, Task 3: Information Retrieval to Address Patients’ Questions when Reading Clinical Reports. In CLEF 2013 Working notes. Goeuriot et al. (2014). ShARe/CLEF eHealth Evaluation Lab 2014, Task 3: User-centred health information retrieval. In CLEF 2014 Working notes.
  • 51. Follow us! • http://sites.google.com/site/clefehealth2015 • clef-ehealth-evaluation-lab-information on Google Groups • @clefehealth • Join the party in Toulouse: http://clef2015.clef-initiative.eu/CLEF2015/conferenceRegistration.php
  • 52. Consortium • Lab chairs: Lorraine Goeuriot, Liadh Kelly • Task 1: Hanna Suominen, Leif Hanlen, Gareth Jones, Liyuan Zhou, Aurélie Névéol, Cyril Grouin, Thierry Hamon, Pierre Zweigenbaum • Task 2: Joao Palotti, Guido Zuccon, Allan Hanbury, Mihai Lupu, Pavel Pecina
  • 54. Task 3a - Topic Generation Process (1) Discharge Medications: 1. Aspirin 81 mg Tablet, Delayed Release (E.C.) Sig: One (1) Tablet, Delayed Release (E.C.) PO DAILY (Daily). Disp:*30 Tablet, Delayed Release (E.C.)(s)* Refills:*0* 2. Docusate Sodium 100 mg Capsule Sig: One (1) Capsule PO BID (2 times a day). Disp:*60 Capsule(s)* Refills:*0* 3. Levothyroxine Sodium 200 mcg Tablet Sig: One (1) Tablet PO DAILY (Daily). Discharge Disposition: Extended Care Facility: [**Hospital 5805**] Manor - [**Location (un) 348**] Discharge Diagnosis: Coronary artery disease. s/p CABG post op atrial fibrillation
  • 55. Task 3a - Topic Generation Process (2) Discharge Medications: 1. Aspirin 81 mg Tablet, Delayed Release (E.C.) Sig: One (1) Tablet, Delayed Release (E.C.) PO DAILY (Daily). Disp:*30 Tablet, Delayed Release (E.C.)(s)* Refills:*0* 2. Docusate Sodium 100 mg Capsule Sig: One (1) Capsule PO BID (2 times a day). Disp:*60 Capsule(s)* Refills:*0* 3. Levothyroxine Sodium 200 mcg Tablet Sig: One (1) Tablet PO DAILY (Daily). Discharge Disposition: Extended Care Facility: [**Hospital 5805**] Manor - [**Location (un) 348**] Discharge Diagnosis: Coronary artery disease. s/p CABG post op atrial fibrillation
  • 56. Task 3a - Topic Generation Process (3) Discharge Medications: 1. Aspirin 81 mg Tablet, Delayed Release (E.C.) Sig: One (1) Tablet, Delayed Release (E.C.) PO DAILY (Daily). Disp:*30 Tablet, Delayed Release (E.C.)(s)* Refills:*0* 2. Docusate Sodium 100 mg Capsule Sig: One (1) Capsule PO BID (2 times a day). Disp:*60 Capsule(s)* Refills:*0* 3. Levothyroxine Sodium 200 mcg Tablet Sig: One (1) Tablet PO DAILY (Daily). Discharge Disposition: Extended Care Facility: [**Hospital 5805**] Manor - [**Location (un) 348**] Discharge Diagnosis: Coronary artery disease. s/p CABG post op atrial fibrillation What is coronary heart disease?