Learning to Rank User Queries to Detect Search Tasks
Claudio Lucchese1, Franco Maria Nardini1,
Salvatore Orlando2, Gabriele Tolomei3
1 ISTI-CNR, Pisa, Italy
2 Università Ca' Foscari Venezia, Italy
3 Yahoo Labs, London, UK
Introduction
The Evolution of Web Search
An increasing number of user searches are part of complex patterns.
Complex search patterns are often composed of several, multi-term, interleaved queries spread across many sessions.
User information needs are getting harder to understand and satisfy.
Search Task: a cluster of queries with the same latent information need, mined from a real-world search engine log.
Complex Search Patterns: AOL 2006
Queries within short-time sessions are part of different complex tasks.
Each complex task spans across several sessions.
Related Work
Related Work - I
Jones and Klinkner [3]
First high-level analysis of user search tasks.
Hierarchical search: flat query streams can be structured as complex search missions linked to each other; each mission in turn contains simpler search goals.
They design a binary classifier that predicts whether two queries belong to the same goal or not.
Related Work - II
Lucchese et al. [4, 5]
Formally introduce the search task discovery problem.
Graph-based representation of each user session: nodes are queries; edges between query pairs are weighted according to a query similarity measure.
Search tasks are identified by the connected components of each user session graph.
Outperforms other approaches for session boundary detection, such as the Query-Flow Graph [1].
Related Work - III
Wang et al. [6]
Cross-session discovery of search tasks.
Graph-based representation of all queries.
Search tasks as connected components of the graph with the following characteristic: each query of a task can be linked only to one past query of the same task, so tasks are modeled as trees.
SVM model to identify the best tree structure hidden in the query similarity graph.
Search Task Discovery (STD) Framework
Starting from a ground truth of search tasks, the framework works in three steps: (1) learning the query similarity function (QSL), (2) learning the pruning threshold, and (3) finding the connected components (GQC).
Query Similarity Learning (QSL)
Query Similarity Learning (QSL): estimates a target query similarity function σ̂ from a ground truth of manually-labeled search tasks.
Binary classes: same-task (positive) and not-same-task (negative).
Learning to Rank: instead of predicting same-task, we learn a ranking function where same-task queries are ranked highest.
How to build the training sets?
Graph-based Query Clustering (GQC)
Graph-based Query Clustering (GQC): transforms a user search log into a weighted query graph G^u_σ̂:
Queries are nodes.
Edges are labeled using the σ̂ previously learnt by QSL.
Connected components of G^u_σ̂ correspond to user search tasks.
Weak edges introduced by σ̂: an ε-neighborhood technique is used as selective pruning, removing edges whose weight falls below a given ε.
The optimal ε̂ is learnt from the ground truth.
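As a sketch, the GQC step above (prune edges below ε, then take connected components) might look as follows; here `sim` stands in for the learnt σ̂ and `eps` for ε̂, both of which would be supplied by the learning steps:

```python
from itertools import combinations

def gqc(queries, sim, eps):
    """Cluster one user's queries into tasks: connect query pairs whose
    similarity is >= eps, then return the connected components
    (one component = one search task)."""
    parent = {q: q for q in queries}

    def find(q):
        # Union-find with path halving.
        while parent[q] != q:
            parent[q] = parent[parent[q]]
            q = parent[q]
        return q

    for qi, qj in combinations(queries, 2):
        if sim(qi, qj) >= eps:          # epsilon-neighborhood pruning:
            parent[find(qi)] = find(qj)  # only strong edges survive

    tasks = {}
    for q in queries:
        tasks.setdefault(find(q), []).append(q)
    return list(tasks.values())
```

With a toy token-Jaccard similarity, "cheap flights rome" and "flights to rome" end up in one task while an unrelated query forms its own singleton component.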
Search Task Discovery Problem
Given a user query log Q^u, a clustering algorithm C that extracts the connected components of the graph, and a quality function γ measuring the quality of a clustering, the Search Task Discovery Problem requires finding the best similarity function and pruning threshold that maximize the average quality of the clusters C(G^u_{σ̂,ε̂}) for all u ∈ U, i.e.,

$$(\hat{\sigma}^*, \hat{\varepsilon}^*) = \operatorname*{argmax}_{\hat{\sigma}, \hat{\varepsilon}} \; \frac{1}{|U|} \sum_{u \in U} \gamma\big(C(G^u_{\hat{\sigma}, \hat{\varepsilon}})\big).$$
Reducing QSL to a Learning to Rank Problem
Query-centric approach: we aim at learning a query similarity function that scores higher those queries that appear in the same task.
For any given q^u_i ∈ T^u_k, we say that q^u_j is relevant to q^u_i if q^u_j ∈ T^u_k, and irrelevant otherwise.
Labels {1, 0} are assigned accordingly when building the training set.
Number of relevance labels: Σ_u |Q^u|².
User-centric approach: we aim at learning a query similarity function that scores higher every pair of objects in the same task.
Given any pair of queries (q^u_i, q^u_j) in the user search log Q^u, we require their similarity to be high iff they belong to the same task.
Here, each (q^u_i, q^u_j) is a single ordered pair with i ≤ j, associated with the tuple for user u.
The binary relation "≤" between queries is given by the order of their issuing times.
Number of relevance labels: Σ_u (|Q^u| choose 2).
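The two reductions differ only in which query pairs receive a relevance label. A minimal sketch of both (the `task_of` mapping from queries to ground-truth task ids is a hypothetical helper; queries are assumed to be listed in issuing order):

```python
from itertools import combinations, product

def query_centric_labels(queries, task_of):
    """One label per ordered pair of queries: |Q_u|^2 labels in total."""
    return [(qi, qj, int(task_of[qi] == task_of[qj]))
            for qi, qj in product(queries, repeat=2)]

def user_centric_labels(queries, task_of):
    """One label per unordered pair (i < j, issuing order preserved):
    binom(|Q_u|, 2) labels in total."""
    return [(qi, qj, int(task_of[qi] == task_of[qj]))
            for qi, qj in combinations(queries, 2)]
```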
Types of Features Used¹
Symmetric global features based on Q^u. Examples: Session Num Queries, Session Time Span, Avg Session Query Len, etc.
Symmetric features extracted from the query pair (q^u_i, q^u_j). Examples: Levenshtein, Jaccard (3-grams), ∆ Time, ∆ Pos, Global Joint Prob (queries), Wikipedia Cosine [5].
Asymmetric features extracted from the query pair (q^u_i, q^u_j). Examples: Is Proper Subset (term-set), Global Conditional Prob (queries).
37 features in total.
¹ See our paper for the complete list of features used.
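A few of the pairwise features listed above can be sketched as follows (illustrative implementations, not necessarily the exact ones used in the paper):

```python
def levenshtein(a, b):
    """Edit distance between two query strings, via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[-1] + 1,          # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def jaccard_3grams(a, b):
    """Symmetric feature: Jaccard similarity over character 3-grams."""
    g = lambda s: {s[i:i + 3] for i in range(len(s) - 2)}
    A, B = g(a), g(b)
    return len(A & B) / len(A | B) if A | B else 0.0

def is_proper_subset(a, b):
    """Asymmetric feature: is a's term set a proper subset of b's?"""
    A, B = set(a.split()), set(b.split())
    return A < B
```

Note that `is_proper_subset` is asymmetric by construction: swapping the two queries can flip its value, which is exactly why such features are listed separately from the symmetric ones.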
Implementing STD
Learning to Rank:
Gradient-Boosted Regression Trees (GBRT), optimizing RMSE.
LambdaMART (λMART), optimizing nDCG.
Binary Classification:
Logistic Regression (LogReg) [3], optimizing logistic loss.
Decision Trees (DT) [5], splitting on information gain.
k-fold cross validation (k = 5): parameters tuned on validation data.
Clustering measure to get ε̂: Jaccard.
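The threshold ε̂ can be tuned by a simple grid search that maximizes the average clustering quality (e.g. Jaccard) over the validation users; a hypothetical sketch, where `cluster` and `quality` are placeholders for the GQC step and the chosen clustering measure:

```python
def learn_eps(users, sim, cluster, quality, grid):
    """Pick the pruning threshold maximizing the average clustering
    quality over validation users.

    users:   list of (queries, ground_truth) pairs
    cluster: callable (queries, sim, eps) -> clustering
    quality: callable (clustering, ground_truth) -> score
    grid:    candidate thresholds to try
    """
    def avg_quality(eps):
        return sum(quality(cluster(q, sim, eps), truth)
                   for q, truth in users) / len(users)
    return max(grid, key=avg_quality)
```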
Experiments
Dataset
Proposed by Hagen et al. [2]: a three-month sample of the AOL query log.
8,840 queries issued by 127 users.
Labeled by two human assessors into 1,378 user search tasks (called missions in the original paper).
We remove stopwords and noisy characters, and transform query strings to lowercase.
We remove the longest and shortest user sessions.
Resulting dataset: 6,381 queries from 125 users.
[Figure: distribution of (a) tasks by number of queries and (b) users by number of tasks.]
Singleton tasks are 41% of the dataset.
Singleton baseline: always answering not-same-task.
Results
Query-centric approach

Table: comparison of L2R techniques and baselines in terms of Rand, F1avg, Jaccard, and F1, averaged across the 5 validation folds. Best results in boldface; (*) marks statistically significant results (α = .05).

(a) Query-centric dataset L′

Method     Rand    F1avg     Jaccard   F1
Singleton  0.738   0.458     0         0
DT         0.898   0.853     0.620     0.714
LogReg     0.919   0.868     0.639     0.737
GBRT       0.915   0.889(*)  0.670     0.763
λMART      0.919   0.879     0.687(*)  0.778(*)
User-centric approach

(b) User-centric dataset L″

Method     Rand      F1avg     Jaccard   F1
Singleton  0.738     0.458     0         0
DT         0.880     0.843     0.604     0.706
LogReg     0.921(*)  0.868     0.639     0.738
GBRT       0.913     0.875(*)  0.682     0.771
λMART      0.914     0.873     0.684(*)  0.778(*)

The most discriminative features are those regarding the relative times and positions of a given pair of queries.
Conclusion
We proposed the Search Task Discovery framework, made up of two modules: QSL and GQC.
QSL learns a query similarity function from a ground truth of manually-labeled search tasks.
GQC models the user queries as a graph.
We propose to employ Learning to Rank techniques (GBRT, λMART) in QSL.
Experiments prove the effectiveness of Learning to Rank techniques in detecting search tasks.
Future Work: we plan to employ STD in a streaming setting to detect search tasks in (pseudo) real time.
Thank you for your attention!
References I
[1] P. Boldi, F. Bonchi, C. Castillo, D. Donato, A. Gionis, and S. Vigna. The query-flow graph: model and applications. In CIKM'08, pages 609–618. ACM, 2008.
[2] M. Hagen, J. Gomoll, A. Beyer, and B. Stein. From search session detection to search mission detection. In OAIR'13, pages 85–92, 2013.
[3] R. Jones and K. L. Klinkner. Beyond the session timeout: automatic hierarchical segmentation of search topics in query logs. In CIKM'08. ACM, 2008.
[4] C. Lucchese, S. Orlando, R. Perego, F. Silvestri, and G. Tolomei. Identifying task-based sessions in search engine query logs. In WSDM'11, pages 277–286. ACM, 2011.
[5] C. Lucchese, S. Orlando, R. Perego, F. Silvestri, and G. Tolomei. Discovering user tasks in long-term web search engine logs. ACM TOIS, 31(3):1–43, July 2013.
References II
[6] H. Wang, Y. Song, M.-W. Chang, X. He, R. W. White, and W. Chu. Learning to extract cross-session search tasks. In WWW'13, pages 1353–1364. ACM, 2013.
Clustering Metrics
$$\text{Rand} = \frac{tp + tn}{tp + tn + fp + fn}, \qquad \text{Jaccard} = \frac{tp}{tp + fp + fn}, \qquad F_1 = \frac{2 \cdot p \cdot r}{p + r}.$$

$$F_1^{avg} = \sum_j \frac{m_j}{m} \, F_1^{max}(j), \quad \text{where } m_j = |T^u_j| \text{ and } m = |Q^u|.$$
Feature Importance