Salvatore Orlando "Mining query logs to improve web search engines' operations"


  1. Beyond Query Suggestions: Recommending Tasks to SE Users
     Claudio Lucchese*, Gabriele Tolomei*+, Salvatore Orlando+, Raffaele Perego*, Fabrizio Silvestri*
     *ISTI - CNR, Pisa, Italy; +Università Ca' Foscari Venezia, Italy
     TechTalk at Moscow - August 22, 2011
     C. Lucchese, S. Orlando, R. Perego, F. Silvestri, G. Tolomei. Identifying Task-based Sessions in Search Engine Query Logs. ACM WSDM, Hong Kong, February 9-12, 2011.
     G. Tolomei, S. Orlando, F. Silvestri. Towards a Task-based Search and Recommender Systems. 26th IEEE International Conference on Data Engineering (ICDE 2010) Workshops, pages 333-336, 2010.
     C. Lucchese, S. Orlando, R. Perego, F. Silvestri, G. Tolomei. Beyond Query Suggestions: Recommending Tasks to Search Engine Users. Submitted to WSDM 2012.
  2. Contents
     • Introduction and Motivations
     • Web Task Discovery
     • Web Task Recommendation
     • Mining Query Logs for Web Task Discovery
     • Mining Query Logs for Web Task Recommendation
  3. Web Task Discovery (slides 3-4)
     • What is a Web task? Any (atomic) user activity that can be achieved by exploiting the information/services available on the Web, e.g., "find a recipe", "book a flight", "read news", etc.
     • Why do we use WSE query logs to identify Web tasks? "Addiction to Web search": users rely on WSEs to satisfy their information needs by issuing possibly interleaved streams of related queries.
     • How do we discover Web tasks in query logs? We argue that tasks are interleaved, so we perform task-based session discovery: sessions are sets of possibly non-contiguous queries, issued by a user whose aim is to carry out a specific "Web task".
  5. The Big Picture of Task Discovery (build across slides 5-13)
     • A user issues a stream of queries, e.g., "hong kong flights", "fly to hong kong", "nba sport news", "pisa to hong kong", ...
     • The whole stream forms a long-term session.
     • Whenever the time gap between consecutive queries exceeds the threshold (Δt > tφ), the long-term session is split into time-gap sessions 1, 2, ..., n.
     • Each time-gap session is then refined into task-based sessions, e.g., "fly to Hong Kong", "nba news", "shopping in Hong Kong".
  14. Task/Query Recommendation: How to Exploit Task-based Session Knowledge (slides 14-15)
     • We are interested in supplying novel recommendations to WSE users, based on the knowledge of task-based query sessions:
       - from intra-task query recommendation
       - to task recommendation
       - to finally arrive at inter-task query recommendation
     • We aim to recommend another related task, where the current task and the recommended one are part of the same mission.
  16. Task/Query Recommendation: From Tasks to Missions
     • From Web task to Web mission: we argue that single search tasks may be subsumed by composite tasks, namely missions, which users aim to accomplish through a WSE.
     • Example: Alice starts interacting with her favorite WSE by submitting queries we recognize as part of the same task, related to "booking a hotel room in New York". Alice's current task could be included in a mission, composed of more tasks, concerned with "planning a trip to New York".
     R. Jones and K.L. Klinkner. Beyond the session timeout: automatic hierarchical segmentation of search topics in query logs. In CIKM '08, ACM, 699-708, 2008.
  17. Task/Query Recommendation: From Intra- to Inter-task Query Suggestions
     • Assume we recognize that Alice is performing a query task (several queries) related to "booking a hotel room in New York".
     • Query suggestion mechanisms may provide alternative related queries (intra-task query suggestions), such as "cheap new york hotels", "times square hotel", "waldorf astoria", etc. ⇐ this is similar to current query suggestion mechanisms.
     • Assume we discover that many users performed a task similar to Alice's, along with other tasks ➠ they could be part of a composite mission, e.g., a mission concerned with "planning a trip to New York".
     • We may then also recommend to Alice other tasks, whose underpinning queries look like "mta subway", "broadway shows", or "JFK airport shuttle" (inter-task query suggestions).
  18. Task Discovery
  19. Data Set: AOL Query Log
     Original Data Set: 3-month collection, ~20M queries, ~657K users.
     Sample Data Set: 1-week collection, ~100K queries, 1,000 users; removed empty queries, removed "non-sense" queries, removed stop-words, applied the Porter stemming algorithm.
  20. Data Analysis: Query Time Gap
     • 84.1% of adjacent query pairs are issued within 26 minutes, hence tφ = 26 min.
  21. Task-based Session Discovery: A Combined Approach
     1) TimeSplitting-t (TS-t): the idea is that if two consecutive queries are far apart in time, they are also likely unrelated. In TS-t, the time gap between consecutive queries must not be greater than t. This technique alone produces time-based sessions, but is unable to deal with multi-tasking. Methods: TS-5, TS-15, TS-26, etc.
     2) QueryClustering-m (QC-m): queries in a time-based session are further grouped using clustering algorithms, which exploit several query features. Two queries (qi, qj) are in the same task-based session iff they belong to the same cluster. Here m is one of the clustering methods: QC-MEANS, QC-SCAN, QC-WCC, and QC-HTC.
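A minimal sketch of the TS-t splitting step in Python, assuming a user's query stream is available as chronologically ordered (timestamp, query) pairs; the 26-minute threshold comes from the time-gap analysis above, while the function and variable names are purely illustrative:

    from datetime import timedelta

    def time_split(queries, t_phi=timedelta(minutes=26)):
        """Split a chronologically ordered (timestamp, query) stream into
        time-gap sessions: a new session starts whenever the gap between
        consecutive queries exceeds t_phi (TS-26)."""
        sessions, current = [], []
        prev_ts = None
        for ts, query in queries:
            if prev_ts is not None and ts - prev_ts > t_phi:
                sessions.append(current)
                current = []
            current.append((ts, query))
            prev_ts = ts
        if current:
            sessions.append(current)
        return sessions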
  22. QC-m: Query Features and Similarities
     Content-based (µcontent):
     • Two queries (qi, qj) sharing common terms are likely related.
     • µjaccard: Jaccard index on query character 3-grams.
     • µlevenshtein: normalized Levenshtein (edit) distance.
     Semantic-based (µsemantic):
     • Wikipedia and Wiktionary are used for "expanding" a query q.
     • "Wikification" of q using the vector-space model.
     • Relatedness between (qi, qj) is computed using cosine similarity.
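A minimal sketch of the two content-based features, assuming plain lower-cased query strings (the Wikipedia/Wiktionary semantic expansion is not sketched here); function names are illustrative:

    def char_ngrams(s, n=3):
        """Set of character n-grams of a string (3-grams by default)."""
        return {s[i:i + n] for i in range(len(s) - n + 1)}

    def jaccard_3grams(q1, q2):
        """Jaccard index on character 3-grams (mu_jaccard)."""
        a, b = char_ngrams(q1), char_ngrams(q2)
        return len(a & b) / len(a | b) if a | b else 0.0

    def norm_levenshtein(q1, q2):
        """Levenshtein (edit) distance normalized by the longer string."""
        if not q1 and not q2:
            return 0.0
        prev = list(range(len(q2) + 1))
        for i, c1 in enumerate(q1, 1):
            curr = [i]
            for j, c2 in enumerate(q2, 1):
                curr.append(min(prev[j] + 1,                 # deletion
                                curr[j - 1] + 1,             # insertion
                                prev[j - 1] + (c1 != c2)))   # substitution
            prev = curr
        return prev[-1] / max(len(q1), len(q2))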
  23. Distance Functions: µ1 vs. µ2
     • µ1: convex combination of the content-based and semantic-based similarities.
     • µ2: conditional formula. Idea: if two queries are close in terms of lexical content, the semantic expansion could be unhelpful; vice versa, nothing can be said when queries do not share any content feature.
     • Both µ1 and µ2 rely on the estimation of some parameters, i.e., α, t, and b.
     • The ground-truth is used for tuning these parameters.
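A hedged sketch of the two combinations (the exact formulas and the fitted values of α, t, and b are those of the WSDM '11 paper; the conditional form below is only one plausible reading of the slide):

    \mu_1(q_i, q_j) = \alpha \, \mu_{\mathrm{content}}(q_i, q_j) + (1 - \alpha) \, \mu_{\mathrm{semantic}}(q_i, q_j), \qquad \alpha \in [0, 1]

    \mu_2(q_i, q_j) =
    \begin{cases}
      \mu_{\mathrm{content}}(q_i, q_j) & \text{if } \mu_{\mathrm{content}}(q_i, q_j) \ge t \\
      b \cdot \mu_{\mathrm{semantic}}(q_i, q_j) & \text{otherwise}
    \end{cases}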
  24. Ground-truth: Construction and Use
     • ~500 long-term sessions are first split using the threshold tφ devised before (i.e., 26 minutes), obtaining several time-gap sessions.
     • Human annotators put together task-related queries that they claim to be part of the same task.
     • This ground-truth is used to tune some parameters of our method and, more importantly, to evaluate the automatic clustering for task-based session discovery.
  25. Ground-truth: Statistics
     • 4.49 avg. queries per time-gap session; more than 70% of time-gap sessions contain at most 5 queries.
     • 1.80 tasks per time-gap session (avg.); ~47% of time-gap sessions contain more than one task (multi-tasking).
     • 1,046 of 1,424 queries (i.e., ~74%) are included in multi-tasking sessions.
     • 2.57 avg. queries per task-based session; ~75% of tasks contain at most 3 queries.
  26. QC-WCC: Weighted Connected Components (build across slides 26-29)
     • Given a time-gap session φ with queries 1, 2, ..., 8:
     • Build the similarity graph Gφ over all query pairs - O(N²).
     • Drop the "weak edges".
     • The remaining connected components are the task-based sessions.
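A minimal sketch of the QC-WCC clustering step, assuming a pairwise similarity function mu (e.g., one of the combinations above) and an edge-dropping threshold eta; both names and the threshold value are illustrative:

    def qc_wcc(queries, mu, eta=0.5):
        """QC-WCC sketch: build the full similarity graph over the queries of a
        time-gap session (O(N^2) comparisons), drop edges weaker than eta, and
        return the connected components as task-based sessions."""
        n = len(queries)
        adj = {i: set() for i in range(n)}
        for i in range(n):
            for j in range(i + 1, n):
                if mu(queries[i], queries[j]) >= eta:   # keep only "strong" edges
                    adj[i].add(j)
                    adj[j].add(i)
        tasks, seen = [], set()
        for i in range(n):                              # connected components via DFS
            if i in seen:
                continue
            stack, component = [i], []
            seen.add(i)
            while stack:
                u = stack.pop()
                component.append(queries[u])
                for v in adj[u] - seen:
                    seen.add(v)
                    stack.append(v)
            tasks.append(component)
        return tasks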
  30. QC-HTC: Head-Tail Components
     • A variation of QC-WCC that does not need to compute the full similarity graph. It performs three steps:
     1. Compute the similarities between chronologically consecutive queries in φ and insert the corresponding labeled edges in the graph Gφ - O(N).
     2. Drop the "weak edges", thus obtaining clusters corresponding to sub-sequences; each cluster is represented by its chronologically first and last queries: head and tail.
     3. Pairwise merging of the sub-sequences so obtained, by inserting edges between heads/tails only if the corresponding queries are similar enough.
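A rough sketch of the QC-HTC idea under the same illustrative assumptions (mu, eta); the exact merging strategy in the paper may differ, this only shows the O(N) consecutive-pair pass followed by head/tail merging:

    def qc_htc(queries, mu, eta=0.5):
        """QC-HTC sketch: one pass over consecutive queries to form sub-sequences,
        then greedy merging of sub-sequences whose head/tail queries are similar."""
        # Steps 1-2: split on weak consecutive-pair similarities.
        clusters = [[queries[0]]] if queries else []
        for prev, curr in zip(queries, queries[1:]):
            if mu(prev, curr) >= eta:
                clusters[-1].append(curr)
            else:
                clusters.append([curr])
        # Step 3: merge clusters whose heads/tails are similar enough.
        merged = []
        for cluster in clusters:
            target = None
            for m in merged:
                if max(mu(m[0], cluster[0]), mu(m[0], cluster[-1]),
                       mu(m[-1], cluster[0]), mu(m[-1], cluster[-1])) >= eta:
                    target = m
                    break
            if target is not None:
                target.extend(cluster)
            else:
                merged.append(cluster)
        return merged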
  31. Experiments Setup
     • Run and compare all the proposed approaches with:
       - TS-26: time-splitting technique (baseline)
       - QFG: session extraction method based on the query-flow graph model (state of the art)
     P. Boldi, F. Bonchi, C. Castillo, D. Donato, A. Gionis, S. Vigna. The query-flow graph: model and applications. CIKM 2008: 609-618.
  32. Evaluation
     • Measure the degree of correspondence between the true tasks (manually-extracted ground-truth) and the predicted tasks (algorithms' outputs).
     • We can use all the external measures commonly used to evaluate a clustering algorithm that extracts disjoint partitions from a dataset annotated with class labels:
     a) F-MEASURE: evaluates the extent to which a predicted task contains only and all the queries of a true task; combines p(i, j) and r(i, j), the precision and recall of task i w.r.t. class j.
     b) RAND: counts pairs of queries instead of singletons; uses f00, f01, f10, f11.
     c) JACCARD: counts pairs of queries instead of singletons; uses f01, f10, f11.
  33. Evaluation: Pair Counts
     • f00 = #pairs of objects with different class and different task
     • f01 = #pairs of objects with different class and same task
     • f10 = #pairs of objects with same class and different task
     • f11 = #pairs of objects with same class and same task
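A minimal sketch of the pair-counting evaluation, assuming two dicts that map each query id to its ground-truth task (class) and to its predicted task; the pair counts follow the definitions above, and the Rand/Jaccard formulas are the standard pair-based indices:

    from itertools import combinations

    def pair_counts(true_task, pred_task):
        """Count query pairs by agreement of ground-truth class and predicted task."""
        f00 = f01 = f10 = f11 = 0
        for qa, qb in combinations(true_task.keys(), 2):
            same_class = true_task[qa] == true_task[qb]
            same_task = pred_task[qa] == pred_task[qb]
            if same_class and same_task:
                f11 += 1
            elif same_class:
                f10 += 1
            elif same_task:
                f01 += 1
            else:
                f00 += 1
        return f00, f01, f10, f11

    def rand_index(f00, f01, f10, f11):
        return (f11 + f00) / (f00 + f01 + f10 + f11)

    def jaccard_index(f00, f01, f10, f11):
        return f11 / (f01 + f10 + f11)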
  34. Results (figures across slides 34-35)
     • Best supervised method: Query Flow Graph (QFG).
     • Compared clustering methods: K-means (centroid-based) and DB-Scan (density-based).
     P. Boldi, F. Bonchi, C. Castillo, D. Donato, A. Gionis, S. Vigna. The query-flow graph: model and applications. CIKM 2008: 609-618.
  36. Task Recommendation
  37. Crowd-based Task Synthesis
     • The tasks of User 1, User 2, User 3, ... all appear different.
     • We need to find recurrent behaviors in the crowd.
     • We need to identify similar tasks by clustering the associated "virtual documents", where each virtual document includes the queries performed in the task by a user.
  38. Crowd-based Task Synthesis (cont.)
     • We can rewrite each task-oriented session of User 1, User 2, User 3, ... in terms of the new synthesized task ids Th ∈ {T1, ..., TK}.
     • Th can be considered as a representative of an aggregation composed of similar tasks, performed by several distinct users.
     • The various long-term sessions thus become sets/sequences of synthesized tasks in which we can find recurrences.
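A minimal sketch of the task-synthesis step, assuming scikit-learn is available, that each task is turned into a "virtual document" by concatenating its queries, and that K-means is used for the clustering; the actual clustering method and the number K of synthesized tasks are illustrative choices, not necessarily those of the paper:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.cluster import KMeans

    def synthesize_tasks(user_tasks, k=100):
        """Cluster per-user tasks into K crowd-level synthesized tasks T1..TK.
        user_tasks: list of tasks, each a list of query strings."""
        virtual_docs = [" ".join(task) for task in user_tasks]   # one doc per task
        tfidf = TfidfVectorizer().fit_transform(virtual_docs)
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(tfidf)
        # labels[i] is the synthesized task id T_h assigned to user_tasks[i]
        return labels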
  39. Task-based Model Generation
     • Produce a task recommendation model: a weighted directed graph GT = (T, E, w), where the weighting function w(·) (edge weights wh,i, wk,i, wi,j, ...) measures the "inter-task relatedness".
     • If two tasks are related, they are probably part of the same mission.
  40. Task-based Recommendation (slides 40-41)
     • Generate task-oriented recommendations: given a user who is interested in (has just performed) a task Ti, retrieve from GT the set Rm(Ti), which includes the top-m tasks related to Ti.
     • The graph nodes in Rm(Ti) are directly connected to node Ti and are the m ones labeled with the highest weights, e.g., R2(Ti) contains the two neighbors of Ti with the highest edge weights.
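A minimal sketch of how Rm(Ti) could be read off the model, assuming GT is stored as a dict mapping each task id to its weighted neighbors; the representation and names are illustrative:

    def recommend_tasks(gt, task_id, m=2):
        """Return R_m(T_i): the m neighbors of task_id in the weighted
        directed graph gt with the highest edge weights."""
        neighbors = gt.get(task_id, {})                    # {neighbor_task: weight}
        ranked = sorted(neighbors.items(), key=lambda kv: kv[1], reverse=True)
        return [task for task, _ in ranked[:m]]

    # Example: R_2(Ti) on a toy graph
    gt = {"Ti": {"Tj": 0.7, "Tk": 0.4, "Th": 0.9}}
    print(recommend_tasks(gt, "Ti", m=2))   # ['Th', 'Tj']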
  42. How to Generate the Model
     • Various methods to generate the edges in GT = (T, E, w) and the associated weights:
     • Random-based (baseline): an edge for each pair of tasks, with uniform weights.
     • Sequence-based: the frequency of task pairs w.r.t. a given support threshold, considering the relative order in the original sequences.
     • Association-rule based (support): the frequency of the rule w.r.t. a given support threshold; the relative order in the original sequences is not considered when extracting the rules.
     • Association-rule based (confidence): the confidence of the rule w.r.t. a given confidence threshold; the relative order in the original sequences is not considered when extracting the rules.
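A minimal sketch of the association-rule style weighting, assuming each long-term session has already been rewritten as a set of synthesized task ids; order is ignored, as the slide describes, and the threshold value and names are illustrative:

    from collections import Counter
    from itertools import combinations

    def pair_weights(sessions, min_support=0.01):
        """Weight each task pair (Ti, Tj) by support and confidence over the
        long-term sessions (each a collection of synthesized task ids)."""
        n = len(sessions)
        task_count, pair_count = Counter(), Counter()
        for session in sessions:
            tasks = set(session)
            task_count.update(tasks)
            pair_count.update(tuple(sorted(p)) for p in combinations(tasks, 2))
        weights = {}
        for (ti, tj), cnt in pair_count.items():
            support = cnt / n                        # fraction of sessions with both tasks
            if support < min_support:
                continue
            weights[(ti, tj)] = {
                "support": support,
                "conf_ti_tj": cnt / task_count[ti],  # confidence of rule Ti -> Tj
                "conf_tj_ti": cnt / task_count[tj],  # confidence of rule Tj -> Ti
            }
        return weights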
  43. Data Set: AOL 2006 Query Log (slides 43-45)
     Original Data Set: 3-month collection, ~20M queries, ~657K users.
     Sample Data Set: top-600 longest user sessions, ~58K queries, avg. 14 queries per user/day; set A: 500 user sessions (training), set B: 100 user sessions (test).
     From both A and B we extracted the tasks with our QC-HTC algorithm. Set A was used to generate the graph model using the various mining techniques.
  46. Experimental Results
     • We used the log subset B (test query log) for evaluation: we divided each long-term session in B (with synthesized tasks) into a 1/3 prefix and a 2/3 suffix; the prefix is used to retrieve from GT the tasks Rm(prefix).
     • We measured:
       - precision: the proportion of suggestions that actually occur in the 2/3 suffix
       - coverage: the proportion of tasks in the 1/3 prefix that are able to provide at least one suggestion
     • Changing the weighting in each model, by tuning the corresponding parameters, modifies the coverage; we thus plot precision vs. coverage to allow the different models to be fairly compared.
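A minimal sketch of the prefix/suffix evaluation, assuming each test session is an ordered list of synthesized task ids and that `recommend` is any top-m retrieval function over GT (such as the one sketched earlier); the split and the two measures follow the slide:

    def precision_and_coverage(test_sessions, recommend, m=5):
        """Evaluate a task recommender on test sessions (ordered task-id lists):
        the first third of each session is the prefix, the rest is the suffix.
        precision = fraction of suggestions that occur in the suffix,
        coverage  = fraction of prefix tasks yielding at least one suggestion."""
        hits = suggested = covered = prefix_tasks = 0
        for session in test_sessions:
            cut = max(1, len(session) // 3)
            prefix, suffix = session[:cut], set(session[cut:])
            for task in prefix:
                prefix_tasks += 1
                suggestions = recommend(task, m)
                if suggestions:
                    covered += 1
                suggested += len(suggestions)
                hits += sum(1 for s in suggestions if s in suffix)
        precision = hits / suggested if suggested else 0.0
        coverage = covered / prefix_tasks if prefix_tasks else 0.0
        return precision, coverage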
  47. Experimental Results: Recommendation Models (precision vs. coverage plots, slides 47-48)
  49. Anecdotal Evidence (slides 49-50): examples of recommended tasks shown alongside the queries actually performed by the users.
  51. Questions?
