These slides refer to the talk I gave at the last International Conference on Open Research Areas in Information Retrieval (OAIR 2013), where I presented a research paper entitled "Modeling and Predicting the Task-by-Task Behavior of Search Engine Users".
Web search engines answer user needs in a query-by-query fashion: they retrieve the most relevant results for each issued query, independently. However, users often submit queries to perform multiple, related tasks.
In this work, we first discuss a methodology to discover from query logs the latent tasks performed by users. Furthermore, we introduce the Task Relation Graph (TRG) as a representation of users’ search behaviors from a task-by-task perspective. The task-by-task behavior is captured by weighting the edges of the TRG with a relatedness score computed between pairs of tasks, as mined from the query log. We validate our approach on a concrete application, namely a task recommender system, which suggests related tasks to users on the basis of the task predictions derived from the TRG. Finally, we show that the task recommendations generated by our solution are beyond the reach of existing query suggestion schemes, and that our method recommends tasks that users will likely perform in the near future.
4. A New Way of Search
May, 23 2013 - Lisbon, Portugal
Alice
Bob
Same Task!
“Reserving a hotel room in New York”
5. … and Search Engines?
• Roughly, they are still Web document retrieval tools
– answering on a per-query basis
– “ten blue links” to relevant Web pages
6. Information Need Hierarchy
• Web Task: any (atomic) activity that a user performs through Web search
– “find a recipe”, “book a flight”, “read news”, etc.
– distinct users may use different queries to accomplish the same Web task
• Web Mission: composition of Web tasks to achieve complex goals
– distinct users may use different Web tasks to accomplish the same Web mission
[Jones and Klinkner, CIKM '08]
7. Goals
• Mine search engine logs to detect Web tasks
• Provide a user model for task-oriented search
– from query-by-query to task-by-task
• Show how such a model can be used to design a real-world application
– from query to task recommendation
9. The Big Picture
• Bottom-up, 2-stage clustering solution:
– User Task Discovery from “raw” queries issued by the same user and stored in query logs
– Collective Task Discovery from distinct User Tasks
• Graph-based representation of Collective Tasks
10. User Task Discovery
• User Task
– set of possibly non-contiguous queries (multi-tasking), issued by a single user, whose aim is to carry out a specific Web task
• QC-HTC
– Graph-based query clustering solution proposed in our previous work [Lucchese et al., WSDM '11]
– outperforms other techniques for session boundary detection in query logs (e.g., QFG [Boldi et al., CIKM '08])
11. User Task Discovery: QC-HTC
• Splits each long-term user session into shorter time-based sessions
• Builds a weighted undirected graph for each time-based session
– nodes in each graph are the queries of a time-based session
• Weights the link between each pair of consecutive queries with their content-based similarity:
– lexical (query character n-grams)
– semantic (query “wikification”)
• Merges any two sequential clusters if their first (head) and last (tail) queries are similar enough
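The lexical similarity and the head/tail merging step can be sketched as follows. This is a minimal illustration, not the QC-HTC implementation: the Jaccard coefficient over character trigrams, the `0.3` threshold, the rule that takes the minimum similarity over the head and tail queries of two clusters, and all function names are assumptions made for this example. The semantic (“wikification”) component is left out.

```python
def char_ngrams(query, n=3):
    """Return the set of character n-grams of a query (assumed lexical features)."""
    text = query.lower().strip()
    if len(text) < n:
        return {text} if text else set()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def lexical_similarity(q1, q2, n=3):
    """Jaccard coefficient between the character n-gram sets of two queries."""
    a, b = char_ngrams(q1, n), char_ngrams(q2, n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_similarity(c1, c2):
    """Assumed rule: weakest similarity among the head/tail queries of two clusters."""
    ends1 = {c1[0], c1[-1]}
    ends2 = {c2[0], c2[-1]}
    return min(lexical_similarity(a, b) for a in ends1 for b in ends2)


def merge_head_tail(clusters, threshold=0.3):
    """Greedily merge clusters whose head and tail queries are similar enough.

    `clusters` is a list of non-empty query lists in chronological order.
    Each cluster joins the first earlier cluster it is similar to, so
    non-contiguous queries of the same task can end up together.
    """
    merged = []
    for cluster in clusters:
        for target in merged:
            if cluster_similarity(target, cluster) >= threshold:
                target.extend(cluster)
                break
        else:
            merged.append(list(cluster))
    return merged


# Hypothetical time-based session with an interleaved, unrelated query.
session_clusters = [
    ["hotel new york", "hotel new york cheap"],
    ["facebook login"],
    ["new york hotels"],
]
user_tasks = merge_head_tail(session_clusters)
```

In this toy session the two hotel-related clusters are merged even though an unrelated query sits between them, which is the multi-tasking case that User Task Discovery is meant to handle.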
13. Collective Task Discovery
• Collective Task
– group of distinct user tasks (i.e., distinct sets of queries performed by several users) that represent the same Web task
• Identify similar user tasks by clustering their “bag of words” representations
– Each user query is a sentence
– Each user task is a concatenation of possibly many sentences (i.e., a text document)
• T = {T1, …, TK} is the final set of Collective Tasks
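A small sketch of the “bag of words” view of a user task may help here. It assumes raw term counts and whitespace tokenization, which are simplifications chosen for this example (the slides do not specify term weighting), and the helper names are hypothetical.

```python
import math
from collections import Counter


def task_bag_of_words(queries):
    """Treat a user task as one document: concatenate its queries and count terms."""
    return Counter(term for query in queries for term in query.lower().split())


def cosine_similarity(bag_a, bag_b):
    """Cosine similarity between two term-count vectors stored as Counters."""
    dot = sum(count * bag_b[term] for term, count in bag_a.items())
    norm_a = math.sqrt(sum(c * c for c in bag_a.values()))
    norm_b = math.sqrt(sum(c * c for c in bag_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical user tasks from three different users.
alice = task_bag_of_words(["hotel new york", "cheap hotel manhattan"])
bob = task_bag_of_words(["new york hotel booking"])
carol = task_bag_of_words(["pasta recipe"])

# Alice's and Bob's tasks share terms, so they are candidates for the same
# collective task; Carol's task shares none with Alice's.
sim_alice_bob = cosine_similarity(alice, bob)
sim_alice_carol = cosine_similarity(alice, carol)
```

A clustering algorithm would then group user tasks whose vectors are close under such a similarity measure.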
14. Mapping User to Collective Tasks
15. Task Relation Graph (TRG)
• Task-oriented model of user search behavior
• TRG(T, E, w, η) is a weighted directed graph
– nodes are the set of collective tasks T={T1, …, TK}
– edges E represent task relatedness
– w: T × T → [0,1] is the edge-weighting function
– η is a weight threshold
• Ti and Tj are linked together iff w(Ti, Tj) > η
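The definition above maps naturally onto a nested dictionary. The sketch below is illustrative only: `build_trg`, the toy weight table, and the choice to skip self-loops are assumptions for this example, not details taken from the slides.

```python
def build_trg(tasks, weight, eta):
    """Return the TRG edges as {Ti: {Tj: w(Ti, Tj)}}, keeping only w(Ti, Tj) > eta.

    `weight` is any function returning a relatedness score in [0, 1].
    Self-loops are skipped (an assumption made for this sketch).
    """
    edges = {task: {} for task in tasks}
    for ti in tasks:
        for tj in tasks:
            if ti == tj:
                continue
            w = weight(ti, tj)
            if w > eta:
                edges[ti][tj] = w
    return edges


# Hypothetical relatedness scores; the graph is directed, so (a, b) != (b, a).
toy_weights = {
    ("book flight", "book hotel"): 0.6,
    ("book hotel", "book flight"): 0.2,
    ("book flight", "read news"): 0.05,
}
tasks = ["book flight", "book hotel", "read news"]
trg = build_trg(tasks, lambda ti, tj: toy_weights.get((ti, tj), 0.0), eta=0.1)
```

With η = 0.1, the weak edge towards "read news" is dropped while both directions between "book flight" and "book hotel" survive with different weights.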
23. Clustering User Tasks
• Algorithm: Repeated Bisections vs. Agglomerative
• Similarity Measure: Cosine similarity vs. Pearson's correlation
• Objective Function: maximize intra-cluster similarity
• Stop Criterion: heuristically choose the final number K of clusters through the “elbow method”
• We select K = 1,024
24. Results and Example
Results were evaluated on a manually-built ground-truth of collective tasks
[Lucchese et al., TOIS 2013]
26. Building TRG: Task Relatedness
• Use the training set to compute w(Ti, Tj)
• Frequent Sequential Patterns
– η = support (i.e., probability) of Ti and Tj co-occurring in a specified sequence: P(<Ti, Tj>)
– task order matters!
• Association Rules Ti → Tj
– η = support: P({Ti, Tj})
– η = confidence: P(Tj|Ti)
– task order doesn't matter!
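The three measures can be estimated by counting over training sessions. The sketch below assumes each session is a list of collective task identifiers in the order they were performed, and that Ti only needs to precede Tj somewhere in the session (not necessarily immediately). Both points, as well as the function names, are assumptions for this illustration.

```python
def sequential_support(sessions, ti, tj):
    """Estimate P(<Ti, Tj>): fraction of sessions where Ti occurs before Tj."""
    if not sessions:
        return 0.0
    hits = 0
    for session in sessions:
        if ti in session and tj in session[session.index(ti) + 1:]:
            hits += 1
    return hits / len(sessions)


def association_support(sessions, ti, tj):
    """Estimate P({Ti, Tj}): fraction of sessions containing both tasks, in any order."""
    if not sessions:
        return 0.0
    return sum(1 for s in sessions if ti in s and tj in s) / len(sessions)


def association_confidence(sessions, ti, tj):
    """Estimate P(Tj | Ti): sessions with both tasks over sessions containing Ti."""
    with_ti = [s for s in sessions if ti in s]
    if not with_ti:
        return 0.0
    return sum(1 for s in with_ti if tj in s) / len(with_ti)


# Hypothetical training sessions, each a sequence of collective task ids.
sessions = [
    ["hotel", "flight"],
    ["flight", "hotel"],
    ["hotel", "news"],
    ["news"],
]
seq = sequential_support(sessions, "hotel", "flight")       # order-sensitive
supp = association_support(sessions, "hotel", "flight")     # order-insensitive
conf = association_confidence(sessions, "hotel", "flight")
```

On this toy data the sequential support is half the association support, because only one of the two sessions containing both tasks has "hotel" before "flight".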
27. Task Recommendation
• One out of many possible applications of TRG
• A user is performing (or has just performed) a task Ti
– more precisely, a user task that is similar to a known Ti
• Retrieve from TRG the set Rm(Ti) including the top-m tasks (nodes) most related to Ti
– tasks in Rm(Ti) are those having the m highest edge weights among all the nodes adjacent to Ti
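Retrieving Rm(Ti) amounts to a top-m selection over the outgoing edges of Ti. A minimal sketch, assuming the TRG is stored as a nested dictionary `{Ti: {Tj: weight}}` (a representation chosen for this example) and using a hypothetical `recommend` helper:

```python
import heapq


def recommend(trg, ti, m):
    """Return Rm(Ti): the m neighbours of Ti with the highest edge weights."""
    neighbours = trg.get(ti, {})
    return heapq.nlargest(m, neighbours, key=neighbours.get)


# Hypothetical TRG fragment.
trg = {
    "book flight": {"book hotel": 0.6, "rent car": 0.4, "read news": 0.1},
}
top2 = recommend(trg, "book flight", 2)
```

Here `top2` contains "book hotel" and "rent car", in decreasing order of weight; a task with no outgoing edges yields an empty list.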
28. Task Recommendation: Experiments
• Use TRGs built from the training set to generate task recommendations for the test set
• Original user sessions in the test set are split into a 1/3 prefix and a 2/3 suffix of user tasks
• Each user task is mapped to a candidate collective task Tc (cosine similarity)
• From all the Tc in the prefix, retrieve the union set of recommendations ∪ Rm(Tc) from …
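The prefix/suffix protocol can be sketched as below. The code assumes user tasks have already been mapped to collective task identifiers and that the TRG is a nested dictionary `{Ti: {Tj: weight}}`. The rounding rule for the 1/3 split and the final overlap check against the suffix are assumptions for this illustration, since the slide does not spell them out.

```python
def split_session(tasks):
    """Split a session into a prefix (about 1/3 of the tasks) and a suffix (the rest)."""
    cut = max(1, len(tasks) // 3)  # assumed rounding rule: at least one prefix task
    return tasks[:cut], tasks[cut:]


def union_of_recommendations(trg, prefix, m):
    """Union of Rm(Tc) over all collective tasks Tc found in the prefix."""
    recommended = set()
    for tc in prefix:
        neighbours = trg.get(tc, {})
        top = sorted(neighbours, key=neighbours.get, reverse=True)[:m]
        recommended.update(top)
    return recommended


# Hypothetical test session and TRG built from a training set.
session = ["A", "B", "C", "D", "E", "F"]
trg = {
    "A": {"C": 0.5, "X": 0.4, "D": 0.1},
    "B": {"E": 0.3},
}
prefix, suffix = split_session(session)
recommended = union_of_recommendations(trg, prefix, m=2)
# Recommended tasks that the user actually performed later in the session.
hits = recommended & set(suffix)
```

In this toy run, two of the three recommended tasks ("C" and "E") appear in the suffix, i.e. they are tasks the user went on to perform.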
34. Task vs. Query Recommendation
• To show that task recommendation is different from well-known query recommendation
• TRG vs. QFG
– 83.8% of top-3 query suggestions generated by QFG live in the same (collective) task
– Only 15.1% of top-3 query suggestions generated by QFG lead to 2 separate (collective) tasks
• QFG is great if the user wants to stay in the same task
• TRG allows the user to switch and jump to other tasks
36. The “Take-Away” Message
• Web search engines should move from handling user requests “query-by-query” to handling them “task-by-task”
• New models for user search behavior are needed: from Query Flow Graph to Task Relation Graph
• Task Relation Graph may be exploited for several applications (e.g., Task Recommendation)
37. Future Work
• Advanced Task Representation
– E.g., linked data, as opposed to simple bag-of-queries
• Automatic Task Labeling (taxonomy of Web tasks):
– Linking queries of collective tasks with referent entities in a knowledge base
– Exploit entity categories to label the whole task
• Use TRG for other applications
– Task-based advertising, Mission discovery, etc.
• New SERP to render task-oriented results