Query Recommendation
Search & Recommendation
Puya - Hossein Vahabi
Barcelona 2017
Introduction
Search & Recommendation
Search Engine Architecture
[Diagram: a query passes through spelling correction and query expansion + recommendation, then document retrieval and ranking of the results; query logs feed the pipeline]
Query Recommendation
Problem definition
Problem Definition
◻ Given a query q, we want to find a set of related
queries q_1, q_2, …, q_k and be able to rank
them.
Query Suggestion SOTA
◻ Query-Flow Graph and Term-Query Graph [Boldi et al. 2008, Vahabi et al. 2012]
⬜ Robust to long-tail queries but computationally complex
◻ Context-awareness by VMM models [He et al. 2009, Cao et al. 2008]
⬜ Sparsity issues and not robust to long-tail queries
Query Suggestion SOTA
◻ Learning to rank by featurizing query context [Shokouhi et al. 2013, Ozertem et al. 2012]
⬜ Order of the queries / of the words in the queries is often lost
◻ Synthetic queries by template-based approaches [Szpektor et al. 2011, Jain et al. 2012]
Query Flow Graph
◻ Rare and never-seen queries account for more than 50% of the traffic!
[Plot: query popularity vs. queries ordered by popularity, showing the long tail of rare and never-seen queries]
Efficient Query Recommendations in the Long Tail via Center-Piece Subgraphs
SIGIR 2012
TQ-Graph: term centric
99% of Coverage
1st Step: Random Walk with Restart (RWR)
Query: lower heart rate
Query suggestion via center-piece
◻ The importance of a query j w.r.t. a set U of terms is given by the product of the scores of j for each node in U
RWR: I => 0.90, J => 0.10, K => 0.00; combined I,J,K => 0.33
Center-piece: I => 0.90, J => 0.10, K => 0.00; combined I,J,K => 0.00
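The two scorings above can be sketched in a few lines of Python. This is a toy illustration, not the paper's implementation: the graph, damping factor, and iteration count below are made up, and the real framework computes these scores at web scale.

```python
import math

def rwr_scores(adj, restart, alpha=0.15, iters=100):
    """Random Walk with Restart: landing probabilities w.r.t. one restart node."""
    p = {n: 0.0 for n in adj}
    p[restart] = 1.0
    for _ in range(iters):
        nxt = {n: 0.0 for n in adj}
        for n, mass in p.items():
            for m in adj[n]:  # spread (1 - alpha) of the mass along out-edges
                nxt[m] += (1 - alpha) * mass / len(adj[n])
        nxt[restart] += alpha  # restart with probability alpha
        p = nxt
    return p

def center_piece(adj, terms, candidates):
    """Center-piece score of query j w.r.t. term set U: product of per-term RWR scores."""
    per_term = {t: rwr_scores(adj, t) for t in terms}
    return {q: math.prod(per_term[t][q] for t in terms) for q in candidates}

# Hypothetical term-query graph: each term links to the queries containing it.
adj = {
    "lower": ["q:lower heart rate"],
    "heart": ["q:lower heart rate", "q:broken heart"],
    "rate":  ["q:lower heart rate", "q:prime rate"],
    "q:lower heart rate": ["lower", "heart", "rate"],
    "q:broken heart": ["heart"],
    "q:prime rate": ["rate"],
}
scores = center_piece(adj, ["lower", "heart", "rate"],
                      ["q:lower heart rate", "q:broken heart", "q:prime rate"])
# the query touching all three terms wins; one-term queries score near zero
```

The product is what makes the difference: a query reachable from only one term gets a near-zero score from the other restarts, so it is demoted, which is exactly the I,J,K example above.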
TQ-Graph vs RWR suggestions
Unseen query: lower heart rate

TQ-Graph suggestions                  RWR suggestions
Things to lower heart rate            Broken heart
Lower heart rate through exercise     Prime rate
Accelerated heart rate and pregnant   Exchange rate
Web md                                Bank rate
Heart problems                        Currency exchange rates
Problem with this solution: computational cost
◻ Ω( m · (|E| + |Q|) )
⬜m: number of query terms
⬜E: number of graph edges
⬜Q: number of queries
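To get a feel for why the exact computation is too expensive online, here is a back-of-the-envelope estimate. The sizes are illustrative, not from the paper:

```python
m = 3             # terms in the incoming query
E = 100_000_000   # edges in the term-query graph (illustrative)
Q = 25_000_000    # distinct queries (illustrative)

# lower bound Omega(m * (|E| + |Q|)) on the work for one incoming query
ops = m * (E + Q)  # -> 375,000,000 node/edge touches
```

Hundreds of millions of touches per incoming query rules out exact online computation, which is what motivates the pruning and bucketing that follow.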
Speeding-up subgraph extraction
Effectiveness after pruning & bucketing
The probabilities of the replacement suggestions are similar to those of the originally suggested queries.
Scalability: cache miss (%)
With 8 GB of main memory the cache miss rate is < 10%.
Orthogonal Query Recommendation
RecSys 2013
What is the problem?
Query Recommendation
If a user is not satisfied with the
answers, a query recommendation
can be useful.
Problem
Query Recommendation
Looking for FAA
(Federal Aviation
Administration)
Problem: which keyword to use?
◻ Sometimes users know what they are looking for, but they don’t know which keywords to use:
⬜ “Daisy Duke” but looking for “Catherine Bach”
⬜ “diet supplement” but looking for “body building supplements”
◻ Traditional query recommendation algorithms fail. Why?
⬜ Because they look for highly related queries
Definition: orthogonal queries
◻ Orthogonal queries are related queries that have
(almost) no common terms with the user's query.
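A minimal sketch of the orthogonality test itself, assuming whitespace tokenization. The full OQ algorithm also has to find and rank the related candidates; this snippet takes them as given (the candidate list is hypothetical):

```python
def is_orthogonal(query, candidate, max_shared_terms=0):
    """Related queries qualify as orthogonal if they share (almost) no terms."""
    q = set(query.lower().split())
    c = set(candidate.lower().split())
    return len(q & c) <= max_shared_terms

# hypothetical related candidates for the query "daisy duke"
candidates = ["catherine bach", "daisy duke costume", "dukes of hazzard cast"]
orthogonal = [c for c in candidates if is_orthogonal("daisy duke", c)]
# "daisy duke costume" is filtered out: it shares both terms with the query
```

Raising `max_shared_terms` relaxes the "(almost)" in the definition.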
Comparison
CG: Cover Graph
UF-IQF: User Frequency-Inverse Query Frequency
OQ: Orthogonal Query Recommendation
SC: Short Cut
QFG: Query Flow Graph
TQG: Term-Query Graph
SQ: Similar Query
Example
Results: User Study on top-5 Recommendations
In 45% of cases OQ is judged to be useful or somewhat useful. OQ is the best for queries in the long tail.
Successful results overlap: S@10
OQ SUCCEEDS WHERE OTHERS FAIL!
A Hierarchical Recurrent Encoder-Decoder for Context-Aware Generative Query Suggestion
CIKM 2015
Generative Model
◻ Generative, i.e. able to produce synthetic suggestions that may not exist in the training data.
◻ Other key properties:
1) robust in the long tail (word-based approach)
2) context-aware (can use an unlimited number of previous queries)
Word and query embedding
◻ Learn vector representations for words and queries encoding their syntactic and semantic characteristics.
⬜ “Similar” queries are associated with “near” vectors.
“game” → [ 0.1, 0.05, -0.3, … , 1.1 ]
“cartoon network game” → [ 0.35, 0.15, -0.12, … , 1.3 ]
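"Near" is usually measured with cosine similarity. A small sketch with hypothetical 4-dimensional vectors (real embeddings have hundreds of dimensions, and all the values below are invented for illustration):

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors: 1.0 = same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

game                 = [0.10, 0.05, -0.30, 1.10]
cartoon_network_game = [0.35, 0.15, -0.12, 1.30]
exchange_rate        = [-1.20, 0.90, 0.40, -0.50]  # an unrelated query

# related queries end up near each other in the embedding space
assert cosine(game, cartoon_network_game) > cosine(game, exchange_rate)
```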
Word and query embedding
[Diagram: words embedded in a word space, queries embedded in a query space]
Hierarchical Recurrent Encoder-Decoder (HRED)
[Diagram: HRED unrolled over the session cleveland gallery → lake erie art → cleveland indian art; each query is encoded, and the next query is decoded word by word: P(lake), P(erie), P(art), P(</q>), …]
Session-level recurrent states summarize past session context.
Recurrent Neural Networks (RNNs)
◻ The weight matrices W and U are fixed throughout the timesteps.
[Diagram: word embeddings of the input query “cleveland gallery </q>” feed a chain of recurrent states through the shared matrices W and U; the first recurrent state is initialized to the zero vector]
RNN encoder
◻ Aggregates word embeddings.
◻ The last recurrent state is used as the query embedding.
◻ The query embedding is sensitive to the order of words in the query!
[Diagram: the input query “cleveland gallery </q>” is read word by word; the final recurrent state is the query embedding]
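The encoder recurrence can be sketched in pure Python with random, untrained weights just to show the mechanics; the slide's point about word order falls out directly. Dimensions and weights here are toy assumptions, not the trained model:

```python
import math
import random

random.seed(0)
DIM = 4  # toy dimensionality; the real model uses hundreds of units

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def rnn_encode(words, emb, W, U):
    """h_t = tanh(W x_t + U h_{t-1}); the last state is the query embedding."""
    h = [0.0] * DIM  # recurrent state initialized to the zero vector
    for w in words:
        wx, uh = matvec(W, emb[w]), matvec(U, h)
        h = [math.tanh(a + b) for a, b in zip(wx, uh)]
    return h

rand_vec = lambda: [random.uniform(-1, 1) for _ in range(DIM)]
rand_mat = lambda: [rand_vec() for _ in range(DIM)]
emb = {w: rand_vec() for w in ["cleveland", "gallery", "</q>"]}
W, U = rand_mat(), rand_mat()  # shared across all timesteps

q1 = rnn_encode(["cleveland", "gallery", "</q>"], emb, W, U)
q2 = rnn_encode(["gallery", "cleveland", "</q>"], emb, W, U)
# swapping the word order changes the embedding: q1 != q2
```

Because each state is a nonlinear function of the previous one, reordering the words changes the final state, unlike a bag-of-words average.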
RNN Decoder
◻ Probabilistic mapping from query embeddings to textual queries, P(Q|x).
[Diagram: the input query embedding initializes the decoder, which emits “lake erie art” one word at a time]
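The decoder side can be sketched as a chain of softmax word predictions whose product gives P(Q|x). The per-step scores below are hypothetical stand-ins for what the decoder's recurrent state (seeded by the query embedding) would produce:

```python
import math

VOCAB = ["lake", "erie", "art", "</q>"]

def softmax(scores):
    """Turn raw scores into a probability distribution over the vocabulary."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def sequence_prob(step_scores, words):
    """P(Q|x) = product over positions t of P(word_t | embedding, earlier words)."""
    p = 1.0
    for scores, w in zip(step_scores, words):
        p *= softmax(scores)[VOCAB.index(w)]
    return p

# one row of (hypothetical) decoder scores per generated token
step_scores = [
    [2.0, 0.1, 0.3, 0.1],  # strongly favors "lake"
    [0.2, 1.8, 0.4, 0.1],  # favors "erie"
    [0.1, 0.2, 2.1, 0.3],  # favors "art"
    [0.1, 0.1, 0.2, 1.9],  # favors "</q>"
]
p = sequence_prob(step_scores, ["lake", "erie", "art", "</q>"])
# 0 < p < 1; generation stops when "</q>" is emitted
```

Generation samples (or beam-searches) from each softmax instead of scoring a fixed target, but the probability model is the same.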
RNN Encoder and RNN Decoder
[Diagram: the encoder maps the input query “cleveland gallery </q>” to an input query embedding; from an output query embedding the decoder emits “lake erie art”, producing P(lake), P(erie), P(art), P(</q>)]
RNN Encoder and RNN Decoder
◻ An RNN encoder-decoder (RED) learns a probability distribution over the next query in the session given the previous one.
◻ Backprop training on S = cleveland gallery → lake erie art:
[Diagram: the encoder reads “cleveland gallery </q>”; the decoder is trained to emit “lake erie art </q>”]
Hierarchical Recurrent Encoder-Decoder (HRED)
[Diagram: HRED unrolled over the session cleveland gallery → lake erie art → cleveland indian art; each query is encoded, and the next query is decoded word by word: P(lake), P(erie), P(art), P(</q>), …]
Session-level recurrent states summarize past session context.
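The extra session-level recurrence can be sketched on top of the same machinery. Everything here is illustrative: the weights are random, and the mean of word vectors stands in for the query-level RNN encoder. The point is only how the session state is carried across queries:

```python
import math
import random

random.seed(1)
DIM = 4  # toy dimensionality

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

rand_vec = lambda: [random.uniform(-1, 1) for _ in range(DIM)]
rand_mat = lambda: [rand_vec() for _ in range(DIM)]

emb = {w: rand_vec() for w in
       "cleveland gallery lake erie art indian".split()}
Ws, Us = rand_mat(), rand_mat()  # session-level weights (hypothetical)

def encode_query(query):
    # stand-in for the query-level RNN encoder: mean of word vectors
    vecs = [emb[w] for w in query.split()]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def session_step(s_prev, q_emb):
    # session state s_n = tanh(Ws q_n + Us s_{n-1})
    return [math.tanh(a + b)
            for a, b in zip(matvec(Ws, q_emb), matvec(Us, s_prev))]

s = [0.0] * DIM
for query in ["cleveland gallery", "lake erie art"]:
    s = session_step(s, encode_query(query))
# s now summarizes the whole session; the decoder proposing the next
# query ("cleveland indian art") is conditioned on s, not only on the
# last query, so there is no hard limit on usable context length.
```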
Example of synthetic generation
Results
Short (2 queries)
Medium (3 - 5 queries)
Long sessions (> 5 queries)
Biggest improvements of HRED
on medium and long sessions.
Conclusions
Conclusions
◻ Query recommendation is a useful tool for the frontend and the backend.
◻ 50% of queries are long-tail or unseen queries → you need to deal with them.
◻ The term-query graph is a useful model for long-tail queries → and its efficient subgraph-extraction framework can be reused for other problems.
◻ Do not always rely on the keywords used by the users.
Conclusions
◻ Using deep learning we can build compact models that generate queries the way humans do.
References
◻ Hossein Vahabi, Margareta Ackerman, David Loker, Ricardo Baeza-Yates, and
Alejandro Lopez-Ortiz. 2013. Orthogonal query recommendation. In Proceedings
of the 7th ACM conference on Recommender systems (RecSys '13). ACM, New
York, NY, USA, 33-40. DOI: https://doi.org/10.1145/2507157.2507159
◻ Alessandro Sordoni, Yoshua Bengio, Hossein Vahabi, Christina Lioma, Jakob Grue
Simonsen, and Jian-Yun Nie. 2015. A Hierarchical Recurrent Encoder-Decoder for
Generative Context-Aware Query Suggestion. In Proceedings of the 24th ACM
International on Conference on Information and Knowledge Management (CIKM
'15). ACM, New York, NY, USA, 553-562. DOI: https://doi.org/10.1145/2806416.2806493
◻ Francesco Bonchi, Raffaele Perego, Fabrizio Silvestri, Hossein Vahabi, and Rossano
Venturini. 2012. Efficient query recommendations in the long tail via center-piece
subgraphs. In Proceedings of the 35th international ACM SIGIR conference on
Research and development in information retrieval (SIGIR '12). ACM, New York,
NY, USA, 345-354. DOI: https://doi.org/10.1145/2348283.2348332
References
◻ Paolo Boldi, Francesco Bonchi, Carlos Castillo, Debora Donato, Aristides
Gionis, and Sebastiano Vigna. 2008. The query-flow graph: model and
applications. In Proceedings of the 17th ACM conference on Information
and knowledge management (CIKM '08). ACM, New York, NY, USA, 609-618.
DOI: https://doi.org/10.1145/1458082.1458163
◻ Ricardo Baeza-Yates and Alessandro Tiberi. 2007. Extracting semantic
relations from query logs. In Proceedings of the 13th ACM SIGKDD
international conference on Knowledge discovery and data mining (KDD
'07). ACM, New York, NY, USA, 76-85. DOI:
https://doi.org/10.1145/1281192.1281204
