SlideShare a Scribd company logo
1 of 34
Download to read offline
1/27
Introduction Method Experiments Conclusions
Optimization Method for Weighting Explicit
and Latent Concepts in Clinical Decision
Support Queries
Saeid Balaneshin-kordan Alexander Kotov
Wayne State University
September 16, 2016
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
2/27
Introduction Method Experiments Conclusions
Introduction
Method
Experiments
Conclusions
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
3/27
Introduction Method Experiments Conclusions
Introduction
Method
Experiments
Conclusions
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
4/27
Introduction Method Experiments Conclusions
Queries and Explicit and Latent Concepts (Example)
§ Query: 33-year-old male presents with severe abdominal
pain one week after a bike accident, in which he sustained
abdominal trauma. He is hypotensive and tachycardic,
and imaging reveals a ruptured spleen and
intraperitoneal hemorrhage
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
4/27
Introduction Method Experiments Conclusions
Queries and Explicit and Latent Concepts (Example)
§ Query: 33-year-old male presents with severe abdominal
pain one week after a bike accident, in which he sustained
abdominal trauma. He is hypotensive and tachycardic,
and imaging reveals a ruptured spleen and
intraperitoneal hemorrhage
§ Explicit concepts: ”bike accident”, “abdominal trauma”,
“tachycardia”, “splenic rupture”, “intraperitoneal
hemorrhage”
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
4/27
Introduction Method Experiments Conclusions
Queries and Explicit and Latent Concepts (Example)
§ Query: 33-year-old male presents with severe abdominal
pain one week after a bike accident, in which he sustained
abdominal trauma. He is hypotensive and tachycardic,
and imaging reveals a ruptured spleen and
intraperitoneal hemorrhage
§ Explicit concepts: ”bike accident”, “abdominal trauma”,
“tachycardia”, “splenic rupture”, “intraperitoneal
hemorrhage”
§ Latent concepts: “splenic trauma”, “Injury of spleen”,
“Traffic accidents”
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
5/27
Introduction Method Experiments Conclusions
Sources of Latent Concepts for Query Expansion
§ Top retrieved documents
§ External domain-specific knowledge repositories (e.g.,
UMLS)
§ External general-purpose resources (e.g., Wikipedia)
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
6/27
Introduction Method Experiments Conclusions
Retrieval model
§ Retrieval score of document D with respect to query Q:
sc(Q, D) =
ÿ
TPTQ
ÿ
cPCT
λT(c)fT(c, D)
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
6/27
Introduction Method Experiments Conclusions
Retrieval model
§ Retrieval score of document D with respect to query Q:
sc(Q, D) =
ÿ
TPTQ
ÿ
cPCT
λT(c)fT(c, D)
§ Matching function of concept c in document D (two-stage
smoothing method [Zhai and Lafferty, SIGIR’02]):
fT(c, D) = log((1 ´ λ)
n(c, D) + µn(c,Col)
|Col|
|D| + µ
+ λ
n(c, Col)
|Col|
)
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
6/27
Introduction Method Experiments Conclusions
Retrieval model
§ Retrieval score of document D with respect to query Q:
sc(Q, D) =
ÿ
TPTQ
ÿ
cPCT
λT(c)fT(c, D)
§ Weight of concept c with type T:
λT(c) =
Nÿ
n=1
wn
φφn ,
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
6/27
Introduction Method Experiments Conclusions
Retrieval model
§ Retrieval score of document D with respect to query Q:
sc(Q, D) =
ÿ
TPTQ
ÿ
cPCT
λT(c)fT(c, D)
§ Weight of concept c with type T:
λT(c) =
Nÿ
n=1
wn
φφn ,
§ Objective: find the weights wn
φ that maximize infNDCG
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
7/27
Introduction Method Experiments Conclusions
infNDCG retrieval metric by varying the weight of
one of the features
0.325
0.330
0.335
0.340
0.0 0.1 0.2 0.3 0.4
wGI
infNDCG
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
8/27
Introduction Method Experiments Conclusions
Proposed Method
§ Represent verbose domain-specific queries using
weighted unigram, bigram and multi-term concepts in a
query itself, top retrieved documents and knowledge
bases.
§ Leverage Graduated Non-Convexity Optimization
(GNC) method to jointly determine the importance
weights for the query and expansion concepts depending
on their type and source.
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
9/27
Introduction Method Experiments Conclusions
Introduction
Method
Experiments
Conclusions
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
10/27
Introduction Method Experiments Conclusions
0.325
0.330
0.335
0.340
0.0 0.1 0.2 0.3 0.4
wGI
infNDCG
∆ 0.025
0.33
0.34
0.0 0.1 0.2 0.3 0.4
wGI
infNDCG
§ Sample infNDCG for the following
values of wφ:
ws,φ = [wφ,´M, . . . , wφ,0, . . . , wφ,M]
§ Using the fixed sampling interval
(∆wφ):
wφ,m = wφ,0 +m∆wφ, m P [´M, . . . , M]
§ Polynomial of degree K is used for
smoothing the objective function:
˜E(wφ,m) =
Kÿ
k=0
akmk
, m P [´M, . . . , M]
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
11/27
Introduction Method Experiments Conclusions
0.325
0.330
0.335
0.340
0.0 0.1 0.2 0.3 0.4
wGI
infNDCG
∆ 0.025
0.33
0.34
0.0 0.1 0.2 0.3 0.4
wGI
infNDCG
§ Cost Function (Mean Square Error
(MSE)):
εφ =
1
2M + 1
Mÿ
m=´M
(˜E(wφ,m)´E(wφ,m))2
§ Optimal a:
a = (JT
J)´1
JT
ws,φ
§ J is a Jacobian of the vector:
[J]m,k = (m´M)k
, m P [0, 2M], k P [0, K] .
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
12/27
Introduction Method Experiments Conclusions
The first three iterations...
∆ 0.025
0.33
0.34
0.0 0.1 0.2 0.3 0.4
wGI
infNDCG
(a) 1st Iteration
∆
0.338
0.340
0.342
0.344
0.00 0.01 0.02 0.03 0.04
wGI
infNDCG
(b) 2nd Iteration
∆w
0.3426
0.3429
0.3432
0.3435
0.023 0.024 0.025 0.026 0.027
wGI
infNDCG
(c) 3rd Iteration
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
13/27
Introduction Method Experiments Conclusions
Multi-variate optimization
§ Using Powell’s method, the multi-variate objective
function is decomposing into a series of univariate
objective functions to weight the n-th feature at the j-th
iteration:
En,j
(wn
φ) = E([ˆw1
φ, . . . , ˆwn´1
φ , wn
φ, ˆwn+1
φ , . . . , ˆwN
φ])
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
14/27
Introduction Method Experiments Conclusions
Algorithm
1: Identify explicit and latent concepts
2: Randomly initialize the feature weights vector (wφ)
3: for j = 1 : jmax do
4: Randomly shuffle wφ
5: for n = 1 : N do
6: for each sampling policy do
7: Sample En,j(wn
φ)
8: Obtain ˜En,j(wn
φ,m)
9: Obtain the optimum point ˆwn
φ
10: Update n-th element of wφ by ˆwn
φ
11: end for
12: end for
13: if Convergence then
14: Break
15: end if
16: end for
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
15/27
Introduction Method Experiments Conclusions
Introduction
Method
Experiments
Conclusions
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
16/27
Introduction Method Experiments Conclusions
Experimental Setup
§ All retrieval methods were implemented using Indri 5.9 IR toolkit
§ We used INQUERY stoplist and Porter stemmer
§ Very frequent terms (that occur in more than 15% of documents) and very rare
terms (that occur in less than 5 documents) were ignored for estimation of
translation models, LDA and NMF.
§ We used the implementation of NMF from scikit-learn 17.1 (Double Singular
Value Decomposition for non-random initialization of factors and Coordinate
Descent solver for optimization)
§ Gibbs sampler for posterior inference of LDA parameters was run for 200
iterations, and the convergence threshold for NMF was set to 10´5
§ Word embeddings were obtained by using word2vec (version 0.1c) tool 3 (by
using Skip-gram architecture for its training)
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
17/27
Introduction Method Experiments Conclusions
Explicit and Latent Query Concept Types
Concept Type Concept Source(s) Concept Representation Concept Extraction
QU Query unigrams all query unigrams
QOB Query ordered bigrams all query bigrams
QUB Query unordered bigrams all query bigrams
QUU Query, UMLS unigrams MetaMap
QUOB Query, UMLS ordered bigrams MetaMap
QUUB Query, UMLS unordered bigrams MetaMap
QDU Query, Wikipedia unigrams health-relatedness measure
QDOB Query, Wikipedia ordered bigrams health-relatedness measure
QDUB Query, Wikipedia unordered bigrams health-relatedness measure
TU Top-docs unigrams direct identification
TOB Top-docs ordered bigrams direct identification
TUB Top-docs unordered bigrams direct identification
TUU Top-docs, UMLS unigrams MetaMap
TUOB Top-docs, UMLS ordered bigrams MetaMap
TUUB Top-docs, UMLS unordered bigrams MetaMap
TDU Top-docs, Wikipedia unigrams health-relatedness measure
TDOB Top-docs, Wikipedia ordered bigrams health-relatedness measure
TDUB Top-docs, Wikipedia unordered bigrams health-relatedness measure
UU UMLS unigrams UMLS relationships
UOB UMLS ordered bigrams UMLS relationships
UUB UMLS unordered bigrams UMLS relationships
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
18/27
Introduction Method Experiments Conclusions
Features (when concept sources are Query and UMLS)
§ TF-IDF of concept c in the collection,
§ Number of top retrieved documents containing concept c,
§ Sum of retrieval scores of top-ranked documents
containing concept c,
§ Average/Maximum collection co-occurrence of concept c
with other concepts in the query,
§ Average/Maximum co-occurrence of concept c with other
query concepts in top retrieved documents,
§ Does concept c have a UMLS semantic type that is effective
for medical query expansion?
§ Node degree of concept c in the UMLS concept graph,
§ Average distance between concept c in the UMLS concept
graph and other query, top document and related UMLS
concepts identified for a query.
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
19/27
Introduction Method Experiments Conclusions
Considered Semantic Types of UMLS concepts
Obtained using backward feature elimination process from
[Limsopatham et al., OAIR’13]:
Semantic Group Semantic Type
Chemicals & Drugs Clinical Drug
Disorders Disease or Syndrome
Disorders Injury or Poisoning
Disorders Sign or Symptom
Procedures Therapeutic or Preventive Procedure
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
20/27
Introduction Method Experiments Conclusions
Number of Top-docs and number of their Concepts
●
● ● ●
●
● ● ● ● ● ● ● ●
● ●
●
● ●
0.26
0.28
0.30
0.32
5 10 15
Number of Top−docs
infNDCG
● ● ●
● ● ●
● ●
● ● ● ● ● ● ● ● ● ●
0.28
0.29
0.30
0.31
0.32
0.33
5 10 15
Number of Concepts
infNDCG
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
21/27
Introduction Method Experiments Conclusions
Performance Comparison
Query set TREC 2014 CDS track TREC 2015 CDS track
Method infNDCG infAP P@5 infNDCG infAP P@5
Two-Stage 0.1945 0.0493 0.3533 0.2110 0.0449 0.4200
Wiki-Orig 0.2069 0.0550 0.3533 0.2193 0.0457 0.4133
UMLS-Orig 0.2074 0.0569 0.3867 0.2206 0.0478 0.4400
UMLS-TD˚ 0.2724 0.0810 0.4133 0.2503 0.0614 0.4667
PQE 0.2796 0.0873 0.4733 0.2792 0.0762 0.4400
Wiki-TD˚ 0.2883 0.0944 0.4600 0.2519 0.0633 0.4600
TREC best 0.2631 0.0757 0.4067 0.2928 0.0777 0.4467
INTGR-LS 0.3114 0.0993 0.4867 0.2987 0.0792 0.4800
INTGR 0.3401 0.1229 0.5267 0.3135 0.0873 0.5200
§ INTGR significantly out- performs all of the best performing baselines in terms
of all retrieval metrics.
§ Using graduated non-convexity as a uni-variate optimization method results in:
§ 5-9% improvement of retrieval accuracy in terms of infNDCG,
§ 10-23% improve- ment in terms of infAP and
§ 8% improvement in terms of P@5 on different query sets.
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
22/27
Introduction Method Experiments Conclusions
Effectiveness of different type of knowledge bases
Table : TREC 2014 CDS track topics
Method infNDCG infAP P@5
INTGR using no knowledge bases 0.2673 0.0875 0.4601
INTGR using only Wikipedia 0.2975 0.0936 0.4533
(11.30%) (6.97%) (-1.47%)
INTGR using only UMLS 0.3309 0.1170 0.5200
(23.79%) (33.71%) (13.02%)
INTGR using UMLS and Wikipedia 0.3401 0.1229 0.5267
(27.23%) (40.46%) (14.47%)
§ the methods based on semantic concepts, UMLS is a better choice than
Wikipedia with respect to all metrics, if only one knowledge repository is used
§ combining both knowledge bases results in better retrieval accuracy than using
any one of them individually
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
22/27
Introduction Method Experiments Conclusions
Effectiveness of different type of knowledge bases
Table : TREC 2015 CDS track topics
Method infNDCG infAP P@5
INTGR using no knowledge bases 0.2771 0.0758 0.4633
INTGR using only Wikipedia 0.2954 0.0779 0.4667
(6.60%) (2.77%) (0.09%)
INTGR using only UMLS 0.3012 0.0786 0.5033
(8.67%) (3.93%) (7.93%)
INTGR using UMLS and Wikipedia 0.3135 0.0873 0.5200
(13.14%) (15.17%) (11.52%)
§ the methods based on semantic concepts, UMLS is a better choice than
Wikipedia with respect to all metrics, if only one knowledge repository is used
§ combining both knowledge bases results in better retrieval accuracy than using
any one of them individually
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
23/27
Introduction Method Experiments Conclusions
Comparison of INTGR with the baselines in terms of
P@k for k ď 10:
0.3
0.4
0.5
0.6
1 2 3 4 5 6 7 8 9 10
k
P@k
(d) TREC 2014 CDS track topics
0.40
0.45
0.50
0.55
1 2 3 4 5 6 7 8 9 10
k
P@k
(e) TREC 2015 CDS track topics
§ INTGR has slightly lower accuracy improvement over its best-performing
baseline and Two-Stage on the testing query set than on the training query set
§ Improvement that INTGR achieves over Two-Stage is much higher than the
improvement of the best performing base- line over Two-Stage.
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
24/27
Introduction Method Experiments Conclusions
Topic-level differences of the infNDCG values
16
27
15
10
19
24
29
9
26
4
2
25
12
18
23
21
20
8
6
11
22
3
13
17
28
30
1
5
7
14
mean = 0.0518
σ = 0.12
−0.2
0.0
0.2
0.4
Query ID
infNDCG−difference
(f) TREC 2014 CDS track topics
§ infNDCG of INTGR is greater than that of Wiki-TD‹ on 67% of the queries
§ Highest Improvement: 28-year-old female with neck pain and left arm numbness 3 weeks after
working with stray animals. Physical exam initially remarkable for slight left arm tremor and spasticity.
Three days later she presented with significant arm spasticity, diaphoresis, agitation,
difficulty swallowing, and hydrophobia.
§ Lowest Degradation: 85-year-old man who was in a car accident 3 weeks ago, now with 3 days of
progressively decreasing level of consciousness and impaired ability to perform activities of daily living.
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
24/27
Introduction Method Experiments Conclusions
Topic-level differences of the infNDCG values
6
15
22
28
30
4
16
23
7
12
3
9
27
18
26
17
10
29
24
19
25
2
5
1
20
11
13
21
14
8
mean = 0.0345
σ = 0.0734
−0.2
0.0
0.2
0.4
Query ID
infNDCG−difference
(g) TREC 2015 CDS track topics
§ infNDCG of INTGR is greater than that of PQE on 73% of the queries
§ Highest Improvement:A 46-year-old woman with sweaty hands, exophthalmia, and weight loss
despite increased eating.
§ Lowest Degradation:An 89-year-old man with progressive change in personality, poor memory, and
myoclonic jerks.
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
25/27
Introduction Method Experiments Conclusions
Introduction
Method
Experiments
Conclusions
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
26/27
Introduction Method Experiments Conclusions
Conclusions
§ We proposed a method to represent CDS queries using
explicit concepts from the original query and the latent
concepts from the top retrieved documents and knowledge
bases
§ We proposed the features to individually weigh each
query concept depending on its type and source
§ We proposed to use graduated optimization method to
directly optimize the parameters of the concept based
retrieval model with respect to the target retrieval metric
§ Proposed method significantly outperforms
state-of-the-art baselines as well as best systems in CDS
track of TREC 2014 and 2015
Balaneshin, Kotov Wayne State University
Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
Many Thanks to the ACM SIGIR Student Travel Grant!
27/27

More Related Content

Viewers also liked

Pathway for skin penetration.ppt(1)
Pathway for skin penetration.ppt(1)Pathway for skin penetration.ppt(1)
Pathway for skin penetration.ppt(1)
Ghazwa Shawash
 
Optimization techniques in pharmaceutical formulation and processing
Optimization techniques in pharmaceutical formulation and processingOptimization techniques in pharmaceutical formulation and processing
Optimization techniques in pharmaceutical formulation and processing
Reshma Fathima .K
 
Penetration enhancers Used in transdermal drug delivry
Penetration enhancers Used in transdermal drug delivryPenetration enhancers Used in transdermal drug delivry
Penetration enhancers Used in transdermal drug delivry
MalLiKaRjunA yadav
 
Optimization techniques in pharmaceutical processing
Optimization techniques in pharmaceutical processingOptimization techniques in pharmaceutical processing
Optimization techniques in pharmaceutical processing
Naval Garg
 
Transdermal drug delivery systems
Transdermal drug delivery systemsTransdermal drug delivery systems
Transdermal drug delivery systems
Sonam Gandhi
 
Transdermal drug delivery system
Transdermal drug delivery systemTransdermal drug delivery system
Transdermal drug delivery system
Danish Kurien
 

Viewers also liked (14)

Optimization technique
Optimization techniqueOptimization technique
Optimization technique
 
Transferosomes
TransferosomesTransferosomes
Transferosomes
 
Pathway for skin penetration.ppt(1)
Pathway for skin penetration.ppt(1)Pathway for skin penetration.ppt(1)
Pathway for skin penetration.ppt(1)
 
Optimization techniques in pharmaceutical formulation and processing
Optimization techniques in pharmaceutical formulation and processingOptimization techniques in pharmaceutical formulation and processing
Optimization techniques in pharmaceutical formulation and processing
 
topical therapy in dermatology
topical therapy in dermatologytopical therapy in dermatology
topical therapy in dermatology
 
Optimization Methods
Optimization MethodsOptimization Methods
Optimization Methods
 
TDDS
TDDSTDDS
TDDS
 
Penetration enhancers Used in transdermal drug delivry
Penetration enhancers Used in transdermal drug delivryPenetration enhancers Used in transdermal drug delivry
Penetration enhancers Used in transdermal drug delivry
 
Optimization techniques in pharmaceutical processing
Optimization techniques in pharmaceutical processingOptimization techniques in pharmaceutical processing
Optimization techniques in pharmaceutical processing
 
Extraction in pharmaceutics
Extraction in pharmaceuticsExtraction in pharmaceutics
Extraction in pharmaceutics
 
OPTIMIZATION IN PHARMACEUTICS,FORMULATION & PROCESSING
OPTIMIZATION IN PHARMACEUTICS,FORMULATION & PROCESSINGOPTIMIZATION IN PHARMACEUTICS,FORMULATION & PROCESSING
OPTIMIZATION IN PHARMACEUTICS,FORMULATION & PROCESSING
 
Transdermal drug delivery systems
Transdermal drug delivery systemsTransdermal drug delivery systems
Transdermal drug delivery systems
 
Transdermal drug delivery system
Transdermal drug delivery systemTransdermal drug delivery system
Transdermal drug delivery system
 
Penetration enhancer with their examples
Penetration enhancer with their examplesPenetration enhancer with their examples
Penetration enhancer with their examples
 

Similar to Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries

East ugm-2012-presentation-east-future-mehta
East ugm-2012-presentation-east-future-mehtaEast ugm-2012-presentation-east-future-mehta
East ugm-2012-presentation-east-future-mehta
Cytel
 
Eugm 2012 mehta - future plans for east - 2012 eugm
Eugm 2012   mehta - future plans for east - 2012 eugmEugm 2012   mehta - future plans for east - 2012 eugm
Eugm 2012 mehta - future plans for east - 2012 eugm
Cytel USA
 
289161214 vietnam-2-64-tổ-chức-lớp-viết-bao-khoa-học-y-khoa-đăng-tren-tạp-chi...
289161214 vietnam-2-64-tổ-chức-lớp-viết-bao-khoa-học-y-khoa-đăng-tren-tạp-chi...289161214 vietnam-2-64-tổ-chức-lớp-viết-bao-khoa-học-y-khoa-đăng-tren-tạp-chi...
289161214 vietnam-2-64-tổ-chức-lớp-viết-bao-khoa-học-y-khoa-đăng-tren-tạp-chi...
trangphan159
 
DISTANT-CTO: A Zero Cost, Distantly Supervised Approach to Improve Low-Resour...
DISTANT-CTO: A Zero Cost, Distantly Supervised Approach to Improve Low-Resour...DISTANT-CTO: A Zero Cost, Distantly Supervised Approach to Improve Low-Resour...
DISTANT-CTO: A Zero Cost, Distantly Supervised Approach to Improve Low-Resour...
Anjani Dhrangadhariya
 
Innovative Sample Size Methods For Clinical Trials
Innovative Sample Size Methods For Clinical Trials Innovative Sample Size Methods For Clinical Trials
Innovative Sample Size Methods For Clinical Trials
nQuery
 
1) The path length from A to B in the following graph is .docx
1) The path length from A to B in the following graph is .docx1) The path length from A to B in the following graph is .docx
1) The path length from A to B in the following graph is .docx
monicafrancis71118
 
ALERT Presentation - Ching - IPSSW 2014
ALERT Presentation - Ching - IPSSW 2014ALERT Presentation - Ching - IPSSW 2014
ALERT Presentation - Ching - IPSSW 2014
INSPIRE_Network
 

Similar to Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries (20)

East ugm-2012-presentation-east-future-mehta
East ugm-2012-presentation-east-future-mehtaEast ugm-2012-presentation-east-future-mehta
East ugm-2012-presentation-east-future-mehta
 
Eugm 2012 mehta - future plans for east - 2012 eugm
Eugm 2012   mehta - future plans for east - 2012 eugmEugm 2012   mehta - future plans for east - 2012 eugm
Eugm 2012 mehta - future plans for east - 2012 eugm
 
PSA pattern to predict CRPC
PSA pattern to predict CRPCPSA pattern to predict CRPC
PSA pattern to predict CRPC
 
289161214 vietnam-2-64-tổ-chức-lớp-viết-bao-khoa-học-y-khoa-đăng-tren-tạp-chi...
289161214 vietnam-2-64-tổ-chức-lớp-viết-bao-khoa-học-y-khoa-đăng-tren-tạp-chi...289161214 vietnam-2-64-tổ-chức-lớp-viết-bao-khoa-học-y-khoa-đăng-tren-tạp-chi...
289161214 vietnam-2-64-tổ-chức-lớp-viết-bao-khoa-học-y-khoa-đăng-tren-tạp-chi...
 
DISTANT-CTO: A Zero Cost, Distantly Supervised Approach to Improve Low-Resour...
DISTANT-CTO: A Zero Cost, Distantly Supervised Approach to Improve Low-Resour...DISTANT-CTO: A Zero Cost, Distantly Supervised Approach to Improve Low-Resour...
DISTANT-CTO: A Zero Cost, Distantly Supervised Approach to Improve Low-Resour...
 
Training load and injuries in football- lessons from research and practise
Training load and injuries in football- lessons from research and practiseTraining load and injuries in football- lessons from research and practise
Training load and injuries in football- lessons from research and practise
 
Innovative Sample Size Methods For Clinical Trials
Innovative Sample Size Methods For Clinical Trials Innovative Sample Size Methods For Clinical Trials
Innovative Sample Size Methods For Clinical Trials
 
2013 machine learning_choih
2013 machine learning_choih2013 machine learning_choih
2013 machine learning_choih
 
Master's Project Defense
Master's Project DefenseMaster's Project Defense
Master's Project Defense
 
Feedbackdriven radiologyreportretrieval ichi2015-v2
Feedbackdriven radiologyreportretrieval ichi2015-v2Feedbackdriven radiologyreportretrieval ichi2015-v2
Feedbackdriven radiologyreportretrieval ichi2015-v2
 
1) The path length from A to B in the following graph is .docx
1) The path length from A to B in the following graph is .docx1) The path length from A to B in the following graph is .docx
1) The path length from A to B in the following graph is .docx
 
Re-analysis of the Cochrane Library data and heterogeneity challenges
Re-analysis of the Cochrane Library data and heterogeneity challengesRe-analysis of the Cochrane Library data and heterogeneity challenges
Re-analysis of the Cochrane Library data and heterogeneity challenges
 
ALERT Presentation - Ching - IPSSW 2014
ALERT Presentation - Ching - IPSSW 2014ALERT Presentation - Ching - IPSSW 2014
ALERT Presentation - Ching - IPSSW 2014
 
Tests of Knowledge: How Can They Contribute to Maintenance of Certification ...
Tests of Knowledge: How Can They Contribute to Maintenance of Certification ...Tests of Knowledge: How Can They Contribute to Maintenance of Certification ...
Tests of Knowledge: How Can They Contribute to Maintenance of Certification ...
 
Health Data Science Seminar Series
Health Data Science Seminar SeriesHealth Data Science Seminar Series
Health Data Science Seminar Series
 
From Experimental to Applied Predictive Analytics on Big Data - Milan Vukicevic
 From Experimental to Applied Predictive Analytics on Big Data - Milan Vukicevic From Experimental to Applied Predictive Analytics on Big Data - Milan Vukicevic
From Experimental to Applied Predictive Analytics on Big Data - Milan Vukicevic
 
Basics of Data Analysis in Bioinformatics
Basics of Data Analysis in BioinformaticsBasics of Data Analysis in Bioinformatics
Basics of Data Analysis in Bioinformatics
 
Critical Appraisal - Quantitative SS.pptx
Critical Appraisal - Quantitative SS.pptxCritical Appraisal - Quantitative SS.pptx
Critical Appraisal - Quantitative SS.pptx
 
How Testing Standardization Reduced Charges for Solid Organ Transplant Patien...
How Testing Standardization Reduced Charges for Solid Organ Transplant Patien...How Testing Standardization Reduced Charges for Solid Organ Transplant Patien...
How Testing Standardization Reduced Charges for Solid Organ Transplant Patien...
 
HANDOUT_MTP015_Art_Physical_Diagnosis.pdf
HANDOUT_MTP015_Art_Physical_Diagnosis.pdfHANDOUT_MTP015_Art_Physical_Diagnosis.pdf
HANDOUT_MTP015_Art_Physical_Diagnosis.pdf
 

Recently uploaded

Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.
Silpa
 
Reboulia: features, anatomy, morphology etc.
Reboulia: features, anatomy, morphology etc.Reboulia: features, anatomy, morphology etc.
Reboulia: features, anatomy, morphology etc.
Silpa
 
biology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGYbiology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGY
1301aanya
 
Phenolics: types, biosynthesis and functions.
Phenolics: types, biosynthesis and functions.Phenolics: types, biosynthesis and functions.
Phenolics: types, biosynthesis and functions.
Silpa
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learning
levieagacer
 
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
Scintica Instrumentation
 
POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.
Silpa
 
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptxTHE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
ANSARKHAN96
 

Recently uploaded (20)

GBSN - Microbiology (Unit 3)Defense Mechanism of the body
GBSN - Microbiology (Unit 3)Defense Mechanism of the body GBSN - Microbiology (Unit 3)Defense Mechanism of the body
GBSN - Microbiology (Unit 3)Defense Mechanism of the body
 
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIACURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
 
Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.
 
Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.
 
Grade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its FunctionsGrade 7 - Lesson 1 - Microscope and Its Functions
Grade 7 - Lesson 1 - Microscope and Its Functions
 
Reboulia: features, anatomy, morphology etc.
Reboulia: features, anatomy, morphology etc.Reboulia: features, anatomy, morphology etc.
Reboulia: features, anatomy, morphology etc.
 
biology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGYbiology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGY
 
Phenolics: types, biosynthesis and functions.
Phenolics: types, biosynthesis and functions.Phenolics: types, biosynthesis and functions.
Phenolics: types, biosynthesis and functions.
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learning
 
module for grade 9 for distance learning
module for grade 9 for distance learningmodule for grade 9 for distance learning
module for grade 9 for distance learning
 
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryFAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdf
 
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
 
Gwalior ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Gwalior ESCORT SERVICE❤CALL GIRL
Gwalior ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Gwalior ESCORT SERVICE❤CALL GIRLGwalior ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Gwalior ESCORT SERVICE❤CALL GIRL
Gwalior ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Gwalior ESCORT SERVICE❤CALL GIRL
 
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
 
POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical Science
 
GBSN - Biochemistry (Unit 2) Basic concept of organic chemistry
GBSN - Biochemistry (Unit 2) Basic concept of organic chemistry GBSN - Biochemistry (Unit 2) Basic concept of organic chemistry
GBSN - Biochemistry (Unit 2) Basic concept of organic chemistry
 
Site Acceptance Test .
Site Acceptance Test                    .Site Acceptance Test                    .
Site Acceptance Test .
 
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptxTHE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
THE ROLE OF BIOTECHNOLOGY IN THE ECONOMIC UPLIFT.pptx
 

Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries

  • 1. 1/27 Introduction Method Experiments Conclusions Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries Saeid Balaneshin-kordan Alexander Kotov Wayne State University September 16, 2016 Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 2. 2/27 Introduction Method Experiments Conclusions Introduction Method Experiments Conclusions Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 3. 3/27 Introduction Method Experiments Conclusions Introduction Method Experiments Conclusions Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 4. 4/27 Introduction Method Experiments Conclusions Queries and Explicit and Latent Concepts (Example) § Query: 33-year-old male presents with severe abdominal pain one week after a bike accident, in which he sustained abdominal trauma. He is hypotensive and tachycardic, and imaging reveals a ruptured spleen and intraperitoneal hemorrhage Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 5. 4/27 Introduction Method Experiments Conclusions Queries and Explicit and Latent Concepts (Example) § Query: 33-year-old male presents with severe abdominal pain one week after a bike accident, in which he sustained abdominal trauma. He is hypotensive and tachycardic, and imaging reveals a ruptured spleen and intraperitoneal hemorrhage § Explicit concepts: ”bike accident”, “abdominal trauma”, “tachycardia”, “splenic rupture”, “intraperitoneal hemorrhage” Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 6. 4/27 Introduction Method Experiments Conclusions Queries and Explicit and Latent Concepts (Example) § Query: 33-year-old male presents with severe abdominal pain one week after a bike accident, in which he sustained abdominal trauma. He is hypotensive and tachycardic, and imaging reveals a ruptured spleen and intraperitoneal hemorrhage § Explicit concepts: ”bike accident”, “abdominal trauma”, “tachycardia”, “splenic rupture”, “intraperitoneal hemorrhage” § Latent concepts: “splenic trauma”, “Injury of spleen”, “Traffic accidents” Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 7. 5/27 Introduction Method Experiments Conclusions Sources of Latent Concepts for Query Expansion § Top retrieved documents § External domain-specific knowledge repositories (e.g., UMLS) § External general-purpose resources (e.g., Wikipedia) Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 8. 6/27 Introduction Method Experiments Conclusions Retrieval model § Retrieval score of document D with respect to query Q: sc(Q, D) = ÿ TPTQ ÿ cPCT λT(c)fT(c, D) Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 9. 6/27 Introduction Method Experiments Conclusions Retrieval model § Retrieval score of document D with respect to query Q: sc(Q, D) = ÿ TPTQ ÿ cPCT λT(c)fT(c, D) § Matching function of concept c in document D (two-stage smoothing method [Zhai and Lafferty, SIGIR’02]): fT(c, D) = log((1 ´ λ) n(c, D) + µn(c,Col) |Col| |D| + µ + λ n(c, Col) |Col| ) Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 10. 6/27 Introduction Method Experiments Conclusions Retrieval model § Retrieval score of document D with respect to query Q: sc(Q, D) = ÿ TPTQ ÿ cPCT λT(c)fT(c, D) § Weight of concept c with type T: λT(c) = Nÿ n=1 wn φφn , Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 11. 6/27 Introduction Method Experiments Conclusions Retrieval model § Retrieval score of document D with respect to query Q: sc(Q, D) = ÿ TPTQ ÿ cPCT λT(c)fT(c, D) § Weight of concept c with type T: λT(c) = Nÿ n=1 wn φφn , § Objective: find the weights wn φ that maximize infNDCG Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 12. 7/27 Introduction Method Experiments Conclusions infNDCG retrieval metric by varying the weight of one of the features 0.325 0.330 0.335 0.340 0.0 0.1 0.2 0.3 0.4 wGI infNDCG Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 13. 8/27 Introduction Method Experiments Conclusions Proposed Method § Represent verbose domain-specific queries using weighted unigram, bigram and multi-term concepts in a query itself, top retrieved documents and knowledge bases. § Leverage Graduated Non-Convexity Optimization (GNC) method to jointly determine the importance weights for the query and expansion concepts depending on their type and source. Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 14. 9/27 Introduction Method Experiments Conclusions Introduction Method Experiments Conclusions Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 15. 10/27 Introduction Method Experiments Conclusions 0.325 0.330 0.335 0.340 0.0 0.1 0.2 0.3 0.4 wGI infNDCG ∆ 0.025 0.33 0.34 0.0 0.1 0.2 0.3 0.4 wGI infNDCG § Sample infNDCG for the following values of wφ: ws,φ = [wφ,´M, . . . , wφ,0, . . . , wφ,M] § Using the fixed sampling interval (∆wφ): wφ,m = wφ,0 +m∆wφ, m P [´M, . . . , M] § Polynomial of degree K is used for smoothing the objective function: ˜E(wφ,m) = Kÿ k=0 akmk , m P [´M, . . . , M] Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 16. 11/27 Introduction Method Experiments Conclusions 0.325 0.330 0.335 0.340 0.0 0.1 0.2 0.3 0.4 wGI infNDCG ∆ 0.025 0.33 0.34 0.0 0.1 0.2 0.3 0.4 wGI infNDCG § Cost Function (Mean Square Error (MSE)): εφ = 1 2M + 1 Mÿ m=´M (˜E(wφ,m)´E(wφ,m))2 § Optimal a: a = (JT J)´1 JT ws,φ § J is a Jacobian of the vector: [J]m,k = (m´M)k , m P [0, 2M], k P [0, K] . Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 17. 12/27 Introduction Method Experiments Conclusions The first three iterations... ∆ 0.025 0.33 0.34 0.0 0.1 0.2 0.3 0.4 wGI infNDCG (a) 1st Iteration ∆ 0.338 0.340 0.342 0.344 0.00 0.01 0.02 0.03 0.04 wGI infNDCG (b) 2nd Iteration ∆w 0.3426 0.3429 0.3432 0.3435 0.023 0.024 0.025 0.026 0.027 wGI infNDCG (c) 3rd Iteration Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 18. 13/27 Introduction Method Experiments Conclusions Multi-variate optimization § Using Powell’s method, the multi-variate objective function is decomposing into a series of univariate objective functions to weight the n-th feature at the j-th iteration: En,j (wn φ) = E([ˆw1 φ, . . . , ˆwn´1 φ , wn φ, ˆwn+1 φ , . . . , ˆwN φ]) Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 19. 14/27 Introduction Method Experiments Conclusions Algorithm 1: Identify explicit and latent concepts 2: Randomly initialize the feature weights vector (wφ) 3: for j = 1 : jmax do 4: Randomly shuffle wφ 5: for n = 1 : N do 6: for each sampling policy do 7: Sample En,j(wn φ) 8: Obtain ˜En,j(wn φ,m) 9: Obtain the optimum point ˆwn φ 10: Update n-th element of wφ by ˆwn φ 11: end for 12: end for 13: if Convergence then 14: Break 15: end if 16: end for Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 20. 15/27 Introduction Method Experiments Conclusions Introduction Method Experiments Conclusions Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 21. 16/27 Introduction Method Experiments Conclusions Experimental Setup § All retrieval methods were implemented using Indri 5.9 IR toolkit § We used INQUERY stoplist and Porter stemmer § Very frequent terms (that occur in more than 15% of documents) and very rare terms (that occur in less than 5 documents) were ignored for estimation of translation models, LDA and NMF. § We used the implementation of NMF from scikit-learn 17.1 (Double Singular Value Decomposition for non-random initialization of factors and Coordinate Descent solver for optimization) § Gibbs sampler for posterior inference of LDA parameters was run for 200 iterations, and the convergence threshold for NMF was set to 10´5 § Word embeddings were obtained by using word2vec (version 0.1c) tool 3 (by using Skip-gram architecture for its training) Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 22. 17/27 Introduction Method Experiments Conclusions Explicit and Latent Query Concept Types Concept Type Concept Source(s) Concept Representation Concept Extraction QU Query unigrams all query unigrams QOB Query ordered bigrams all query bigrams QUB Query unordered bigrams all query bigrams QUU Query, UMLS unigrams MetaMap QUOB Query, UMLS ordered bigrams MetaMap QUUB Query, UMLS unordered bigrams MetaMap QDU Query, Wikipedia unigrams health-relatedness measure QDOB Query, Wikipedia ordered bigrams health-relatedness measure QDUB Query, Wikipedia unordered bigrams health-relatedness measure TU Top-docs unigrams direct identification TOB Top-docs ordered bigrams direct identification TUB Top-docs unordered bigrams direct identification TUU Top-docs, UMLS unigrams MetaMap TUOB Top-docs, UMLS ordered bigrams MetaMap TUUB Top-docs, UMLS unordered bigrams MetaMap TDU Top-docs, Wikipedia unigrams health-relatedness measure TDOB Top-docs, Wikipedia ordered bigrams health-relatedness measure TDUB Top-docs, Wikipedia unordered bigrams health-relatedness measure UU UMLS unigrams UMLS relationships UOB UMLS ordered bigrams UMLS relationships UUB UMLS unordered bigrams UMLS relationships Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 23. 18/27 Introduction Method Experiments Conclusions Features (when concept sources are Query and UMLS) § TF-IDF of concept c in the collection, § Number of top retrieved documents containing concept c, § Sum of retrieval scores of top-ranked documents containing concept c, § Average/Maximum collection co-occurrence of concept c with other concepts in the query, § Average/Maximum co-occurrence of concept c with other query concepts in top retrieved documents, § Does concept c have a UMLS semantic type that is effective for medical query expansion? § Node degree of concept c in the UMLS concept graph, § Average distance between concept c in the UMLS concept graph and other query, top document and related UMLS concepts identified for a query. Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 24. 19/27 Introduction Method Experiments Conclusions Considered Semantic Types of UMLS concepts Obtained using backward feature elimination process from [Limsopatham et al., OAIR’13]: Semantic Group Semantic Type Chemicals & Drugs Clinical Drug Disorders Disease or Syndrome Disorders Injury or Poisoning Disorders Sign or Symptom Procedures Therapeutic or Preventive Procedure Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 25. 20/27 Introduction Method Experiments Conclusions Number of Top-docs and number of their Concepts ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0.26 0.28 0.30 0.32 5 10 15 Number of Top−docs infNDCG ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0.28 0.29 0.30 0.31 0.32 0.33 5 10 15 Number of Concepts infNDCG Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 26. 21/27 Introduction Method Experiments Conclusions Performance Comparison Query set TREC 2014 CDS track TREC 2015 CDS track Method infNDCG infAP P@5 infNDCG infAP P@5 Two-Stage 0.1945 0.0493 0.3533 0.2110 0.0449 0.4200 Wiki-Orig 0.2069 0.0550 0.3533 0.2193 0.0457 0.4133 UMLS-Orig 0.2074 0.0569 0.3867 0.2206 0.0478 0.4400 UMLS-TD˚ 0.2724 0.0810 0.4133 0.2503 0.0614 0.4667 PQE 0.2796 0.0873 0.4733 0.2792 0.0762 0.4400 Wiki-TD˚ 0.2883 0.0944 0.4600 0.2519 0.0633 0.4600 TREC best 0.2631 0.0757 0.4067 0.2928 0.0777 0.4467 INTGR-LS 0.3114 0.0993 0.4867 0.2987 0.0792 0.4800 INTGR 0.3401 0.1229 0.5267 0.3135 0.0873 0.5200 § INTGR significantly out- performs all of the best performing baselines in terms of all retrieval metrics. § Using graduated non-convexity as a uni-variate optimization method results in: § 5-9% improvement of retrieval accuracy in terms of infNDCG, § 10-23% improve- ment in terms of infAP and § 8% improvement in terms of P@5 on different query sets. Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 27. 22/27 Introduction Method Experiments Conclusions Effectiveness of different type of knowledge bases Table : TREC 2014 CDS track topics Method infNDCG infAP P@5 INTGR using no knowledge bases 0.2673 0.0875 0.4601 INTGR using only Wikipedia 0.2975 0.0936 0.4533 (11.30%) (6.97%) (-1.47%) INTGR using only UMLS 0.3309 0.1170 0.5200 (23.79%) (33.71%) (13.02%) INTGR using UMLS and Wikipedia 0.3401 0.1229 0.5267 (27.23%) (40.46%) (14.47%) § the methods based on semantic concepts, UMLS is a better choice than Wikipedia with respect to all metrics, if only one knowledge repository is used § combining both knowledge bases results in better retrieval accuracy than using any one of them individually Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 28. 22/27 Introduction Method Experiments Conclusions Effectiveness of different type of knowledge bases Table : TREC 2015 CDS track topics Method infNDCG infAP P@5 INTGR using no knowledge bases 0.2771 0.0758 0.4633 INTGR using only Wikipedia 0.2954 0.0779 0.4667 (6.60%) (2.77%) (0.09%) INTGR using only UMLS 0.3012 0.0786 0.5033 (8.67%) (3.93%) (7.93%) INTGR using UMLS and Wikipedia 0.3135 0.0873 0.5200 (13.14%) (15.17%) (11.52%) § the methods based on semantic concepts, UMLS is a better choice than Wikipedia with respect to all metrics, if only one knowledge repository is used § combining both knowledge bases results in better retrieval accuracy than using any one of them individually Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 29. 23/27 Introduction Method Experiments Conclusions Comparison of INTGR with the baselines in terms of P@k for k ď 10: 0.3 0.4 0.5 0.6 1 2 3 4 5 6 7 8 9 10 k P@k (d) TREC 2014 CDS track topics 0.40 0.45 0.50 0.55 1 2 3 4 5 6 7 8 9 10 k P@k (e) TREC 2015 CDS track topics § INTGR has slightly lower accuracy improvement over its best-performing baseline and Two-Stage on the testing query set than on the training query set § Improvement that INTGR achieves over Two-Stage is much higher than the improvement of the best performing base- line over Two-Stage. Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 30. 24/27 Introduction Method Experiments Conclusions Topic-level differences of the infNDCG values 16 27 15 10 19 24 29 9 26 4 2 25 12 18 23 21 20 8 6 11 22 3 13 17 28 30 1 5 7 14 mean = 0.0518 σ = 0.12 −0.2 0.0 0.2 0.4 Query ID infNDCG−difference (f) TREC 2014 CDS track topics § infNDCG of INTGR is greater than that of Wiki-TD‹ on 67% of the queries § Highest Improvement: 28-year-old female with neck pain and left arm numbness 3 weeks after working with stray animals. Physical exam initially remarkable for slight left arm tremor and spasticity. Three days later she presented with significant arm spasticity, diaphoresis, agitation, difficulty swallowing, and hydrophobia. § Lowest Degradation: 85-year-old man who was in a car accident 3 weeks ago, now with 3 days of progressively decreasing level of consciousness and impaired ability to perform activities of daily living. Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 31. 24/27 Introduction Method Experiments Conclusions Topic-level differences of the infNDCG values 6 15 22 28 30 4 16 23 7 12 3 9 27 18 26 17 10 29 24 19 25 2 5 1 20 11 13 21 14 8 mean = 0.0345 σ = 0.0734 −0.2 0.0 0.2 0.4 Query ID infNDCG−difference (g) TREC 2015 CDS track topics § infNDCG of INTGR is greater than that of PQE on 73% of the queries § Highest Improvement:A 46-year-old woman with sweaty hands, exophthalmia, and weight loss despite increased eating. § Lowest Degradation:An 89-year-old man with progressive change in personality, poor memory, and myoclonic jerks. Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 32. 25/27 Introduction Method Experiments Conclusions Introduction Method Experiments Conclusions Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 33. 26/27 Introduction Method Experiments Conclusions Conclusions § We proposed a method to represent CDS queries using explicit concepts from the original query and the latent concepts from the top retrieved documents and knowledge bases § We proposed the features to individually weigh each query concept depending on its type and source § We proposed to use graduated optimization method to directly optimize the parameters of the concept based retrieval model with respect to the target retrieval metric § Proposed method significantly outperforms state-of-the-art baselines as well as best systems in CDS track of TREC 2014 and 2015 Balaneshin, Kotov Wayne State University Optimization Method for Weighting Explicit and Latent Concepts in Clinical Decision Support Queries
  • 34. Many Thanks to the ACM SIGIR Student Travel Grant! 27/27