Query Recommendation
Search & Recommendation
Puya - Hossein Vahabi
Barcelona 2017
Introduction
Search & Recommendation
Search Engine Architecture
[Diagram: a query passes through spelling correction and query expansion + recommendation, then document retrieval and ranking of the results; query logs feed the pipeline]
Query Recommendation
Problem definition
Problem Definition
◻ Given a query q, we want to find a set of related
queries q_1, q_2, …, q_k and be able to rank
them.
Query Suggestion SOTA
◻ Query-Flow Graph and Term-Query Graph [Boldi et al. 2008, Vahabi et al. 2012]
⬜ Robust to long-tail queries but computationally complex
◻ Context-awareness by VMM models [He et al. 2009, Cao et al. 2008]
⬜ Sparsity issues and not robust to long-tail queries
Query Suggestion SOTA
◻ Learning to rank by featurizing query context [Shokouhi et al. 2013, Ozertem et al. 2012]
⬜ Order of the queries / of the words in the queries is often lost
◻ Synthetic queries by template-based approaches [Szpektor et al. 2011, Jain et al. 2012]
Query Flow Graph
◻ Rare and never-seen queries account for more than 50% of the traffic!
[Plot: query popularity vs. queries ordered by popularity, showing the long tail of rare and never-seen queries]
Efficient Query Recommendations in the Long Tail via Center-Piece Subgraphs
SIGIR 2012
TQ-Graph: term centric
99% of Coverage
1st Step: Random Walk with Restart (RWR)
Query: lower heart rate
Query suggestion via center-piece
◻ The importance of a query j w.r.t. a set U of terms is given by the product of the scores of j for each node in U
RWR: I => 0.90, J => 0.10, K => 0.00; combined I,J,K => 0.33
Center-piece: I => 0.90, J => 0.10, K => 0.00; combined I,J,K => 0.00
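The two scorings above can be sketched in a few lines of Python. This is a toy illustration, not the paper's implementation: the graph, damping factor, and iteration count below are made up, and the real framework computes these scores at web scale.

```python
import math

def rwr_scores(adj, restart, alpha=0.15, iters=100):
    """Random Walk with Restart: landing probabilities w.r.t. one restart node."""
    p = {n: 0.0 for n in adj}
    p[restart] = 1.0
    for _ in range(iters):
        nxt = {n: 0.0 for n in adj}
        for n, mass in p.items():
            for m in adj[n]:  # spread (1 - alpha) of the mass along out-edges
                nxt[m] += (1 - alpha) * mass / len(adj[n])
        nxt[restart] += alpha  # restart with probability alpha
        p = nxt
    return p

def center_piece(adj, terms, candidates):
    """Center-piece score of query j w.r.t. term set U: product of per-term RWR scores."""
    per_term = {t: rwr_scores(adj, t) for t in terms}
    return {q: math.prod(per_term[t][q] for t in terms) for q in candidates}

# Hypothetical term-query graph: each term links to the queries containing it.
adj = {
    "lower": ["q:lower heart rate"],
    "heart": ["q:lower heart rate", "q:broken heart"],
    "rate":  ["q:lower heart rate", "q:prime rate"],
    "q:lower heart rate": ["lower", "heart", "rate"],
    "q:broken heart": ["heart"],
    "q:prime rate": ["rate"],
}
scores = center_piece(adj, ["lower", "heart", "rate"],
                      ["q:lower heart rate", "q:broken heart", "q:prime rate"])
# the query touching all three terms wins; one-term queries score near zero
```

The product is what makes the difference: a query reachable from only one term gets a near-zero score from the other restarts, so it is demoted, which is exactly the I,J,K example above.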
TQ-Graph vs RWR suggestions
Unseen query: lower heart rate

TQ-Graph suggestions                  RWR suggestions
Things to lower heart rate            Broken heart
Lower heart rate through exercise     Prime rate
Accelerated heart rate and pregnant   Exchange rate
Web md                                Bank rate
Heart problems                        Currency exchange rates
Problem with this solution: computational cost
◻ Ω( m · (|E| + |Q|) )
⬜m: number of query terms
⬜E: number of graph edges
⬜Q: number of queries
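To get a feel for why the exact computation is too expensive online, here is a back-of-the-envelope estimate. The sizes are illustrative, not from the paper:

```python
m = 3             # terms in the incoming query
E = 100_000_000   # edges in the term-query graph (illustrative)
Q = 25_000_000    # distinct queries (illustrative)

# lower bound Omega(m * (|E| + |Q|)) on the work for one incoming query
ops = m * (E + Q)  # -> 375,000,000 node/edge touches
```

Hundreds of millions of touches per incoming query rules out exact online computation, which is what motivates the pruning and bucketing that follow.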
Speeding-up subgraph extraction
Effectiveness after pruning & bucketing
The probabilities of the replacement suggestions are similar to those of the originally suggested queries.
Scalability: cache miss (%)
With 8 GB of main memory the cache miss rate is < 10%.
Orthogonal Query Recommendation
RecSys 2013
What is the problem?
Query Recommendation
If a user is not satisfied with the
answers, a query recommendation
can be useful.
Problem
Query Recommendation
Looking for FAA
(Federal Aviation
Administration)
Problem: which keyword to use?
◻ Sometimes users know what they are looking for, but they don’t know which keywords to use:
⬜ “Daisy Duke” but looking for “Catherine Bach”
⬜ “diet supplement” but looking for “body building supplements”
◻ Traditional query recommendation algorithms fail. Why?
⬜ Because they look for highly related queries
Definition: orthogonal queries
◻ Orthogonal queries are related queries that have
(almost) no common terms with the user's query.
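A minimal sketch of the orthogonality test itself, assuming whitespace tokenization. The full OQ algorithm also has to find and rank the related candidates; this snippet takes them as given (the candidate list is hypothetical):

```python
def is_orthogonal(query, candidate, max_shared_terms=0):
    """Related queries qualify as orthogonal if they share (almost) no terms."""
    q = set(query.lower().split())
    c = set(candidate.lower().split())
    return len(q & c) <= max_shared_terms

# hypothetical related candidates for the query "daisy duke"
candidates = ["catherine bach", "daisy duke costume", "dukes of hazzard cast"]
orthogonal = [c for c in candidates if is_orthogonal("daisy duke", c)]
# "daisy duke costume" is filtered out: it shares both terms with the query
```

Raising `max_shared_terms` relaxes the "(almost)" in the definition.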
Comparison
CG: Cover Graph
UF-IQF: User Frequency-Inverse Query Frequency
OQ: Orthogonal Query Recommendation
SC: Short Cut
QFG: Query Flow Graph
TQG: Term-Query Graph
SQ: Similar Query
Example
Results: User Study on top-5 Recommendations
In 45% of cases OQ is judged to be useful or somewhat useful. OQ is the best for queries in the long tail.
Successful results overlap: S@10
OQ SUCCEEDS WHERE OTHERS FAIL!
A Hierarchical Recurrent Encoder-Decoder for Context-Aware Generative Query Suggestion
CIKM 2015
Generative Model
◻ Generative, i.e. able to produce synthetic suggestions that may not exist in the training data.
◻ Other key properties:
1) robust in the long tail (word-based approach)
2) context-aware (can use an unlimited number of previous queries)
Word and query embedding
◻ Learn vector representations for words and queries encoding their syntactic and semantic characteristics.
⬜ “Similar” queries are associated with “near” vectors.
“game” → [ 0.1, 0.05, -0.3, … , 1.1 ]
“cartoon network game” → [ 0.35, 0.15, -0.12, … , 1.3 ]
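"Near" is usually measured with cosine similarity. A small sketch with hypothetical 4-dimensional vectors (real embeddings have hundreds of dimensions, and all the values below are invented for illustration):

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors: 1.0 = same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

game                 = [0.10, 0.05, -0.30, 1.10]
cartoon_network_game = [0.35, 0.15, -0.12, 1.30]
exchange_rate        = [-1.20, 0.90, 0.40, -0.50]  # an unrelated query

# related queries end up near each other in the embedding space
assert cosine(game, cartoon_network_game) > cosine(game, exchange_rate)
```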
Word and query embedding
[Diagram: words embedded in a word space, queries embedded in a query space]
Hierarchical Recurrent Encoder-Decoder (HRED)
[Diagram: HRED unrolled over the session cleveland gallery → lake erie art → cleveland indian art; each query is encoded, and the next query is decoded word by word: P(lake), P(erie), P(art), P(</q>), …]
Session-level recurrent states summarize past session context.
Recurrent Neural Networks (RNNs)
◻ The weight matrices W and U are fixed throughout the timesteps.
[Diagram: word embeddings of the input query “cleveland gallery </q>” feed a chain of recurrent states through the shared matrices W and U; the first recurrent state is initialized to the zero vector]
RNN encoder
◻ Aggregates word embeddings.
◻ The last recurrent state is used as the query embedding.
◻ The query embedding is sensitive to the order of words in the query!
[Diagram: the input query “cleveland gallery </q>” is read word by word; the final recurrent state is the query embedding]
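The encoder recurrence can be sketched in pure Python with random, untrained weights just to show the mechanics; the slide's point about word order falls out directly. Dimensions and weights here are toy assumptions, not the trained model:

```python
import math
import random

random.seed(0)
DIM = 4  # toy dimensionality; the real model uses hundreds of units

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def rnn_encode(words, emb, W, U):
    """h_t = tanh(W x_t + U h_{t-1}); the last state is the query embedding."""
    h = [0.0] * DIM  # recurrent state initialized to the zero vector
    for w in words:
        wx, uh = matvec(W, emb[w]), matvec(U, h)
        h = [math.tanh(a + b) for a, b in zip(wx, uh)]
    return h

rand_vec = lambda: [random.uniform(-1, 1) for _ in range(DIM)]
rand_mat = lambda: [rand_vec() for _ in range(DIM)]
emb = {w: rand_vec() for w in ["cleveland", "gallery", "</q>"]}
W, U = rand_mat(), rand_mat()  # shared across all timesteps

q1 = rnn_encode(["cleveland", "gallery", "</q>"], emb, W, U)
q2 = rnn_encode(["gallery", "cleveland", "</q>"], emb, W, U)
# swapping the word order changes the embedding: q1 != q2
```

Because each state is a nonlinear function of the previous one, reordering the words changes the final state, unlike a bag-of-words average.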
RNN Decoder
◻ Probabilistic mapping from query embeddings to textual queries, P(Q|x).
[Diagram: the input query embedding initializes the decoder, which emits “lake erie art” one word at a time]
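The decoder side can be sketched as a chain of softmax word predictions whose product gives P(Q|x). The per-step scores below are hypothetical stand-ins for what the decoder's recurrent state (seeded by the query embedding) would produce:

```python
import math

VOCAB = ["lake", "erie", "art", "</q>"]

def softmax(scores):
    """Turn raw scores into a probability distribution over the vocabulary."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def sequence_prob(step_scores, words):
    """P(Q|x) = product over positions t of P(word_t | embedding, earlier words)."""
    p = 1.0
    for scores, w in zip(step_scores, words):
        p *= softmax(scores)[VOCAB.index(w)]
    return p

# one row of (hypothetical) decoder scores per generated token
step_scores = [
    [2.0, 0.1, 0.3, 0.1],  # strongly favors "lake"
    [0.2, 1.8, 0.4, 0.1],  # favors "erie"
    [0.1, 0.2, 2.1, 0.3],  # favors "art"
    [0.1, 0.1, 0.2, 1.9],  # favors "</q>"
]
p = sequence_prob(step_scores, ["lake", "erie", "art", "</q>"])
# 0 < p < 1; generation stops when "</q>" is emitted
```

Generation samples (or beam-searches) from each softmax instead of scoring a fixed target, but the probability model is the same.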
RNN Encoder and RNN Decoder
[Diagram: the encoder maps the input query “cleveland gallery </q>” to an input query embedding; from an output query embedding the decoder emits “lake erie art”, producing P(lake), P(erie), P(art), P(</q>)]
RNN Encoder and RNN Decoder
◻ An RNN encoder-decoder (RED) learns a probability distribution over the next query in the session given the previous one.
◻ Backprop training on S = cleveland gallery → lake erie art:
[Diagram: the encoder reads “cleveland gallery </q>”; the decoder is trained to emit “lake erie art </q>”]
Hierarchical Recurrent Encoder-Decoder (HRED)
[Diagram: HRED unrolled over the session cleveland gallery → lake erie art → cleveland indian art; each query is encoded, and the next query is decoded word by word: P(lake), P(erie), P(art), P(</q>), …]
Session-level recurrent states summarize past session context.
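The extra session-level recurrence can be sketched on top of the same machinery. Everything here is illustrative: the weights are random, and the mean of word vectors stands in for the query-level RNN encoder. The point is only how the session state is carried across queries:

```python
import math
import random

random.seed(1)
DIM = 4  # toy dimensionality

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

rand_vec = lambda: [random.uniform(-1, 1) for _ in range(DIM)]
rand_mat = lambda: [rand_vec() for _ in range(DIM)]

emb = {w: rand_vec() for w in
       "cleveland gallery lake erie art indian".split()}
Ws, Us = rand_mat(), rand_mat()  # session-level weights (hypothetical)

def encode_query(query):
    # stand-in for the query-level RNN encoder: mean of word vectors
    vecs = [emb[w] for w in query.split()]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def session_step(s_prev, q_emb):
    # session state s_n = tanh(Ws q_n + Us s_{n-1})
    return [math.tanh(a + b)
            for a, b in zip(matvec(Ws, q_emb), matvec(Us, s_prev))]

s = [0.0] * DIM
for query in ["cleveland gallery", "lake erie art"]:
    s = session_step(s, encode_query(query))
# s now summarizes the whole session; the decoder proposing the next
# query ("cleveland indian art") is conditioned on s, not only on the
# last query, so there is no hard limit on usable context length.
```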
Example of synthetic generation
Results
Short (2 queries)
Medium (3 - 5 queries)
Long sessions (> 5 queries)
Biggest improvements of HRED
on medium and long sessions.
Conclusions
Conclusions
◻ Query recommendation is a useful tool for the frontend and the backend.
◻ 50% of queries are long-tail or unseen queries → you need to deal with them.
◻ The term-query graph is a useful model for long-tail queries → and its efficient subgraph-extraction framework can be reused for other problems.
◻ Do not always rely on the keywords used by the users.
Conclusions
◻ Using deep learning we can build compact models that generate queries the way humans do.
References
◻ Hossein Vahabi, Margareta Ackerman, David Loker, Ricardo Baeza-Yates, and
Alejandro Lopez-Ortiz. 2013. Orthogonal query recommendation. In Proceedings
of the 7th ACM conference on Recommender systems (RecSys '13). ACM, New
York, NY, USA, 33-40. DOI: https://doi.org/10.1145/2507157.2507159
◻ Alessandro Sordoni, Yoshua Bengio, Hossein Vahabi, Christina Lioma, Jakob Grue
Simonsen, and Jian-Yun Nie. 2015. A Hierarchical Recurrent Encoder-Decoder for
Generative Context-Aware Query Suggestion. In Proceedings of the 24th ACM
International on Conference on Information and Knowledge Management (CIKM
'15). ACM, New York, NY, USA, 553-562. DOI: https://doi.org/10.1145/2806416.2806493
◻ Francesco Bonchi, Raffaele Perego, Fabrizio Silvestri, Hossein Vahabi, and Rossano
Venturini. 2012. Efficient query recommendations in the long tail via center-piece
subgraphs. In Proceedings of the 35th international ACM SIGIR conference on
Research and development in information retrieval (SIGIR '12). ACM, New York,
NY, USA, 345-354. DOI: https://doi.org/10.1145/2348283.2348332
References
◻ Paolo Boldi, Francesco Bonchi, Carlos Castillo, Debora Donato, Aristides
Gionis, and Sebastiano Vigna. 2008. The query-flow graph: model and
applications. In Proceedings of the 17th ACM conference on Information
and knowledge management (CIKM '08). ACM, New York, NY, USA, 609-618.
DOI: https://doi.org/10.1145/1458082.1458163
◻ Ricardo Baeza-Yates and Alessandro Tiberi. 2007. Extracting semantic
relations from query logs. In Proceedings of the 13th ACM SIGKDD
international conference on Knowledge discovery and data mining (KDD
'07). ACM, New York, NY, USA, 76-85. DOI:
https://doi.org/10.1145/1281192.1281204
