A tutorial on query auto-completion (QAC), drawing on more than ten search conference papers from recent years. It covers the development of QAC, personalized QAC, time-sensitive QAC, QAC on mobile devices, and future directions for QAC.
Linked Data and Knowledge Graphs -- Constructing and Understanding Knowledge Graphs, by Jeff Z. Pan
Tutorial on "Linked Data and Knowledge Graphs -- Constructing and Understanding Knowledge Graphs" presented at the 4th Joint International Conference on Semantic Technologies (JIST2014)
This document provides a high-level overview of business intelligence (BI) and analytics. It begins by defining BI as a set of methods, processes, architectures, applications, and technologies that gather and transform raw data into meaningful information for effective decision-making. Analytics is discussed as an evolution of BI that involves more extensive data analysis and application areas. The document then covers the BI/analytics process, discussing it from both an information process and technology perspective. Finally, common BI applications and the evolution of BI/analytics are summarized.
This is part 1 of the tutorial Xavier and Deepak gave at RecSys 2016. You can find the second part at http://www.slideshare.net/xamat/recsys-2016-tutorial-lessons-learned-from-building-reallife-recommender-systems
Dmitry Kan, Principal AI Scientist at Silo AI and host of the Vector Podcast, will give an overview of the landscape of vector search databases and their role in NLP, along with the latest news and his view on the future of vector search. He will also share how he and his team participated in the Billion-Scale Approximate Nearest Neighbor Challenge and improved recall by 12% over a FAISS baseline.
Presented at https://www.meetup.com/open-nlp-meetup/events/282678520/
YouTube: https://www.youtube.com/watch?v=RM0uuMiqO8s&t=179s
Follow Vector Podcast to stay up to date on this topic: https://www.youtube.com/@VectorPodcast
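The recall improvement mentioned in the talk abstract above is measured against exact brute-force neighbors. As a minimal illustration (pure Python, made-up 2-D vectors; real benchmarks use FAISS and large datasets), recall@k of an approximate result set against brute-force ground truth can be computed like this:

```python
from math import dist  # Euclidean distance (Python 3.8+)

def exact_top_k(query, vectors, k):
    """Brute-force ground truth: indices of the k nearest vectors."""
    order = sorted(range(len(vectors)), key=lambda i: dist(query, vectors[i]))
    return set(order[:k])

def recall_at_k(approx_ids, true_ids):
    """Fraction of the true neighbors that the ANN index returned."""
    return len(set(approx_ids) & true_ids) / len(true_ids)

# Toy database of 2-D vectors (hypothetical data).
db = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 6.0)]
truth = exact_top_k((0.1, 0.1), db, k=3)
print(recall_at_k([0, 1, 3], truth))  # the ANN result found 2 of 3 true neighbors
```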
The document provides an overview of Vertex AI, Google Cloud's managed machine learning platform. It discusses topics such as managing datasets, building and training machine learning models using both automated and custom approaches, implementing explainable AI, and deploying models. The document also includes references to the Vertex AI documentation and contact information for further information.
MLOps and Data Quality: Deploying Reliable ML Models in Production, by Provectus
Looking to build a robust machine learning infrastructure to streamline MLOps? Learn from Provectus experts how to ensure the success of your MLOps initiative by implementing Data QA components in your ML infrastructure.
For most organizations, the development of multiple machine learning models, their deployment and maintenance in production are relatively new tasks. Join Provectus as we explain how to build an end-to-end infrastructure for machine learning, with a focus on data quality and metadata management, to standardize and streamline machine learning life cycle management (MLOps).
Agenda
- Data Quality and why it matters
- Challenges and solutions of Data Testing
- Challenges and solutions of Model Testing
- MLOps pipelines and why they matter
- How to expand validation pipelines for Data Quality
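The agenda items above can be made concrete with a small sketch of the kind of data-quality check that gates a training pipeline. This is an illustration only (pure Python; the column names and bounds are made up), not Provectus's implementation:

```python
def validate_batch(rows, schema):
    """Return a list of human-readable violations for one data batch.

    `schema` maps column name -> (expected type, (min, max) bounds or None).
    """
    errors = []
    for i, row in enumerate(rows):
        for col, (typ, bounds) in schema.items():
            if col not in row:
                errors.append(f"row {i}: missing column {col!r}")
                continue
            val = row[col]
            if not isinstance(val, typ):
                errors.append(f"row {i}: {col!r} has type {type(val).__name__}")
            elif bounds is not None and not (bounds[0] <= val <= bounds[1]):
                errors.append(f"row {i}: {col!r}={val} out of range {bounds}")
    return errors

# Hypothetical schema: age must be an int in [0, 120], name a string.
schema = {"age": (int, (0, 120)), "name": (str, None)}
batch = [{"age": 34, "name": "a"}, {"age": 999, "name": "b"}, {"name": "c"}]
print(validate_batch(batch, schema))  # two violations: range, then missing column
```

A pipeline would fail (or quarantine the batch) when the returned list is non-empty, which is the "validation pipeline" idea in the last agenda item.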
The document discusses Amazon SageMaker, a fully managed machine learning platform. It introduces several new Amazon SageMaker capabilities: Amazon SageMaker Studio, which provides an integrated development environment for machine learning; Amazon SageMaker Notebooks for easier collaboration; Amazon SageMaker Processing for automated data processing and model evaluation; Amazon SageMaker Experiments for organizing and comparing training experiments; Amazon SageMaker Debugger for automated debugging of machine learning models; Amazon SageMaker Model Monitor for continuous monitoring of models in production; and Amazon SageMaker Autopilot for automated machine learning without writing code. It also discusses how Amazon SageMaker addresses challenges in deploying and managing machine learning models at scale.
What really are recommendations engines nowadays?
This presentation introduces the foundations of recommendation algorithms and covers common approaches as well as some of the most advanced techniques. Although more focused on efficiency than on theoretical properties, basics of matrix algebra and optimization-based machine learning are used throughout the presentation.
Table of Contents:
1. Collaborative Filtering
1.1 User-User
1.2 Item-Item
1.3 User-Item
* Matrix Factorization
* Stochastic Gradient Descent (SGD)
* Truncated Singular Value Decomposition (SVD)
* Alternating Least Square (ALS)
* Deep Learning
2. Content Extraction
* Item-Item Similarities
* Deep Content Extraction: NLP, CNN, LSTM
3. Hybrid Models
4. In Production
4.1 Challenges
4.2 Solutions
4.3 Tools
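To make the matrix-factorization and SGD entries of the table of contents above concrete, here is a minimal SGD factorization sketch. The toy ratings and hyperparameters are made up for illustration; production systems use optimized libraries rather than pure Python:

```python
import random

def factorize(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02, epochs=2000):
    """Learn user/item factors so that dot(P[u], Q[i]) approximates rating r."""
    rng = random.Random(0)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            pred = sum(P[u][f] * Q[i][f] for f in range(k))
            err = r - pred
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)  # regularized SGD step
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

# Toy 2-user / 2-item ratings on a 1-5 scale.
data = [(0, 0, 5.0), (0, 1, 1.0), (1, 0, 1.0), (1, 1, 5.0)]
P, Q = factorize(data, n_users=2, n_items=2)
pred = sum(P[0][f] * Q[0][f] for f in range(2))
print(round(pred, 1))  # should land close to the observed rating of 5.0
```

ALS follows the same objective but alternates closed-form solves for P and Q instead of taking stochastic gradient steps.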
Amazon SageMaker Clarify (https://aws.amazon.com/sagemaker/clarify/) provides machine learning developers with greater visibility into their training data and models so they can identify and limit bias and explain predictions. SageMaker Clarify detects potential bias during data preparation, after model training, and in your deployed model by examining attributes you specify. For instance, you can check for bias related to age in your initial dataset or in your trained model and receive a detailed report that quantifies different types of possible bias. SageMaker Clarify also includes feature importance graphs that help you explain model predictions and produces reports which can be used to support internal presentations or to identify issues with your model that you can take steps to correct.
For more information on Amazon SageMaker Clarify, please refer to these links: (1) https://aws.amazon.com/sagemaker/clarify (2) https://aws.amazon.com/blogs/aws/new-amazon-sagemaker-clarify-detects-bias-and-increases-the-transparency-of-machine-learning-models (3) https://github.com/aws/amazon-sagemaker-clarify (4) Discussion and demo: https://youtu.be/cQo2ew0DQw0
Acknowledgments: Amazon SageMaker Clarify core team, Amazon AWS AI team, and partners across Amazon
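The "detailed report that quantifies different types of possible bias" mentioned above is built from metrics of this general family. As an illustration only (not Clarify's implementation; the loan data and group names are hypothetical), a difference in positive-label rates between two groups:

```python
def positive_proportion_difference(labels, groups, favored, disfavored):
    """Difference in positive-label rates between two groups.

    A simple pre-training bias measure of the kind a bias report quantifies
    (illustrative sketch; not SageMaker Clarify's actual implementation).
    """
    def rate(g):
        members = [y for y, grp in zip(labels, groups) if grp == g]
        return sum(members) / len(members)
    return rate(favored) - rate(disfavored)

# Hypothetical loan-approval labels (1 = approved) split by age group.
labels = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["<40", "<40", "<40", "<40", "40+", "40+", "40+", "40+"]
print(positive_proportion_difference(labels, groups, "<40", "40+"))  # 0.5
```

A value far from zero flags an imbalance worth investigating before training, which is exactly the "check for bias related to age in your initial dataset" scenario described above.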
The document discusses introducing machine learning to kids. It suggests that machine learning should be introduced to kids because it can help them become inventors, makers, problem solvers and help shape policy. The document also discusses how machine learning can be introduced to kids, but provides no specifics on methods or approaches.
Graphs in Retail: Know Your Customers and Make Your Recommendations Engine Learn, by Neo4j
This document provides an overview and agenda for a presentation on using graph databases like Neo4j for retail applications. The presentation covers introducing graph databases and Neo4j, discussing retail data types, and demonstrating use cases for customer 360 views, recommendations, supply chain management, and other areas. Case studies are presented on using Neo4j for real-time recommendations at a large retailer and real-time promotions at a top US retailer. The document concludes with an invitation for questions.
This document discusses recommender systems, including:
1. It provides an overview of recommender systems, their history, and common problems like top-N recommendation and rating prediction.
2. It then discusses what makes a good recommender system, including experiment methods like offline, user surveys, and online experiments, as well as evaluation metrics like prediction accuracy, diversity, novelty, and user satisfaction.
3. Key metrics that are important to evaluate recommender systems are discussed, such as user satisfaction, prediction accuracy, coverage, diversity, novelty, serendipity, trust, robustness, and response time. The document emphasizes selecting metrics based on business goals.
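As a concrete companion to the metric list above, here is a sketch of two common offline measures, precision@k and catalog coverage (pure Python over toy data; real evaluations run these across many users):

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations the user actually liked."""
    top_k = recommended[:k]
    return sum(1 for item in top_k if item in relevant) / k

def catalog_coverage(all_recommendations, catalog):
    """Fraction of the catalog that appears in any user's recommendation list.

    A proxy for diversity at the system level: low coverage means the engine
    keeps recommending the same few popular items.
    """
    shown = set().union(*all_recommendations)
    return len(shown & set(catalog)) / len(catalog)

recs = ["a", "b", "c", "d"]  # ranked list for one user (toy data)
liked = {"a", "c", "e"}
print(precision_at_k(recs, liked, k=3))            # 2 of top 3 were relevant
print(catalog_coverage([recs, ["e"]], "abcdefg"))  # 5 of 7 items ever shown
```

Which of these to optimize is a business decision, as the summary above emphasizes: accuracy metrics alone do not capture novelty, serendipity, or user satisfaction.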
Why an AI-Powered Data Catalog Tool is Critical to Business Success, by Informatica
Imagine a fast, more efficient business thriving on trusted data-driven decisions. An intelligent data catalog can help your organization discover, organize, and inventory all data assets across the org and democratize data with the right balance of governance and flexibility. Informatica's data catalog tools are powered by AI and can automate tedious data management tasks and offer immediate recommendations based on derived business intelligence. We offer data catalog workshops globally. Visit Informatica.com to attend one near you.
The Power of Auto ML and How Does it Work, by Ivo Andreev
Automated ML is an approach to minimizing the need for data science effort by enabling domain experts to build ML models without deep knowledge of algorithms, mathematics, or programming. The mechanism works by letting end users simply provide data; the system automatically does the rest by determining an approach to perform the particular ML task. At first this may sound discouraging to those aspiring to the “sexiest job of the 21st century” - data science. However, Auto ML should be considered a democratization of ML rather than automatic data science.
In this session we will talk about how Auto ML works, how it is implemented by Microsoft, and how it can improve the productivity of even professional data scientists.
This document outlines several common use cases for graph databases including real-time recommendations, fraud detection, network operations, master data management, knowledge graphs, and identity and access management. It provides examples of how Neo4j has been used by companies like NASA, Walmart, Accenture, Telenor, Die Bayerische, and Lufthansa to power applications in domains like healthcare, ecommerce, logistics, insurance, and digital asset management.
This document discusses how machine learning and the cloud are transforming one another. It explains that machine learning involves computers learning from data without being explicitly programmed. It then outlines how major companies are using machine learning and lists some examples. It describes how the cloud is allowing more organizations to leverage machine learning by providing services that can analyze large amounts of data. Finally, it provides details on Google Cloud's machine learning services and tools.
This session will introduce you to the features of Amazon SageMaker, including a one-click training environment, highly-optimized machine learning algorithms with built-in model tuning, and deployment without engineering effort. With zero setup required, Amazon SageMaker significantly decreases your training time and the overall cost of building production machine learning systems. You'll also hear how and why Intuit is using Amazon SageMaker on AWS for real-time fraud detection.
Big Data Analytics for Banking, a Point of View, by Pietro Leo
This document discusses how big data and analytics can transform the banking industry. It notes that digital transformation, enabled by big data and analytics, is creating pressures on banks from new digital native customers, large amounts of new data, new channels like mobile, and new competitors. It argues that to succeed in this new environment, banks need to build a 360-degree integrated customer view using big data, and ensure analytics are part of closed-loop business processes to create value. New applications and platforms like IBM Watson Analytics aim to make analytics more accessible and valuable to more users.
LDM Slides: How Data Modeling Fits into an Overall Enterprise Architecture, by DATAVERSITY
Enterprise Architecture (EA) provides a visual blueprint of the organization, and shows key interrelationships between data, process, applications, and more. By abstracting these assets in a graphical view, it’s possible to see key interrelationships, particularly as it relates to data and its business impact across the organization.
Join this webinar for a discussion on how a data model can be combined with an overall enterprise architecture for enhanced business value and success.
PyCaret is an open-source, low-code machine learning library in Python that allows you to go from preparing your data to deploying your model within minutes in your choice of environment. This talk is a practical demo on how to use PyCaret in your existing workflows and supercharge your data science team’s productivity.
This slide deck presents data analytics concepts. Topics include levels of analytics, CRISP-DM, and data science use cases, e.g., customer segmentation, churn prediction, product recommendation, and demand forecasting.
Exploring Opportunities in the Generative AI Value Chain, by Dung Hoang
The article "Exploring Opportunities in the Generative AI Value Chain" by McKinsey & Company's QuantumBlack provides insights into the value created by generative artificial intelligence (AI) and its potential applications.
Recommender systems: Content-based and collaborative filtering, by Viet-Trung TRAN
This document provides an overview of recommender systems, including content-based and collaborative filtering approaches. It discusses how content-based systems make recommendations based on item profiles and calculating similarity between user and item profiles. Collaborative filtering is described as finding similar users and making predictions based on their ratings. The document also covers evaluation metrics, complexity issues, and tips for building recommender systems.
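The profile-similarity idea described above reduces to a vector similarity between a user profile and an item profile. A minimal cosine-similarity sketch over hypothetical keyword-weight profiles (illustrative only; the feature names are made up):

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two sparse profiles (dicts of weights)."""
    def norm(p):
        return sqrt(sum(w * w for w in p.values()))
    if norm(u) == 0 or norm(v) == 0:
        return 0.0
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    return dot / (norm(u) * norm(v))

# Made-up profiles: a user's tastes vs. an item's content features.
user = {"scifi": 1.0, "space": 0.5}
item = {"scifi": 0.8, "romance": 0.3}
print(round(cosine(user, item), 3))
```

A content-based system ranks items by this score against the user profile; collaborative filtering instead applies the same similarity between users' rating vectors.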
I gave a talk at North California Business Marketing Association on Customer Data Platform with examples ranging from Uber Grayballing to Zoom's customer retention email and "dogs or muffins".
This document discusses data product architectures and provides examples of different architectures for data products, including the lambda architecture, analyst architecture, recommender architecture, and partisan discourse architecture. It also discusses common design principles for data product architectures, such as using microservices with stateful backend services and database-backed APIs. Key aspects of data product architectures include handling training data and models, making predictions via APIs, updating models and annotations, and designing flexible systems that can incorporate new models and data.
This document provides an agenda and overview for a one-day Lucene boot camp tutorial. The schedule includes sessions on introducing Lucene, indexing, analysis, searching, and performance. It also covers topics like indexing in Lucene, analyzing text, querying, sorting results, and optimizing search performance. The document seeks to help attendees understand Lucene's core capabilities through real examples, code, and data. It encourages attendees to ask questions.
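Lucene itself is a Java library, but the core idea the boot camp teaches, an inverted index mapping terms to the documents that contain them, can be sketched in a few lines. This is a toy illustration in pure Python, not Lucene's implementation:

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercase term to the set of doc ids containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, query):
    """AND query: return docs containing every query term."""
    postings = [index.get(t, set()) for t in query.lower().split()]
    return set.intersection(*postings) if postings else set()

docs = ["Lucene indexes text", "Solr wraps Lucene", "Nutch crawls the web"]
idx = build_index(docs)
print(sorted(search(idx, "lucene")))       # [0, 1]
print(sorted(search(idx, "lucene text")))  # [0]
```

Lucene's analysis sessions cover what this sketch glosses over: tokenization, stemming, stop words, and per-term scoring rather than plain set intersection.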
Learning by example: training users through high-quality query suggestions, by Claudia Hauff
A presentation given at UvA in September 2015, discussing joint work with Morgan Harvey and David Elsweiler.
Full paper: http://dl.acm.org/citation.cfm?id=2767731
Recurrent networks and beyond by Tomas Mikolov (Bhaskar Mitra)
The document summarizes Tomas Mikolov's talk on recurrent neural networks and directions for future research. The key points are:
1) Recurrent networks have seen renewed success since 2010 due to simple tricks like gradient clipping that allow them to be trained more stably. Structurally constrained recurrent networks (SCRNs) provide longer short-term memory than simple RNNs without complex architectures.
2) While RNNs have achieved strong performance on many tasks, they struggle with algorithmic patterns requiring memorization of sequences or counting. Stack augmented RNNs add structured memory to address such limitations.
3) To build truly intelligent machines, we need to focus on developing skills like communication and learning new tasks quickly from few examples.
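The gradient-clipping trick credited above with stabilizing RNN training is simple to state. A sketch of norm-based clipping in pure Python (frameworks provide equivalents, e.g. clip-by-global-norm utilities):

```python
from math import sqrt

def clip_by_norm(grads, max_norm):
    """Rescale the gradient vector if its L2 norm exceeds max_norm.

    Bounding the update step this way prevents the exploding-gradient
    blowups that made early RNN training unstable.
    """
    norm = sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]

print(clip_by_norm([3.0, 4.0], max_norm=1.0))  # norm 5.0, scaled down to norm 1.0
print(clip_by_norm([0.3, 0.4], max_norm=1.0))  # norm 0.5, returned unchanged
```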
SIGIR 2016 presentation slide for paper: Xin Qian, Jimmy Lin, and Adam Roegiest. Interleaved Evaluation for Retrospective Summarization and Prospective Notification on Document Streams. Proceedings of the 39th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2016), pages 175-184, July 2016, Pisa, Italy.
Learning to Rank Personalized Search Results in Professional Networks, by Viet Ha-Thuc
1) The document discusses personalized search solutions for professional networks like LinkedIn, including augmenting short queries with user profile data, calculating skill reputations to find relevant jobs, and using a personalized federated search model that considers user intent and signals from different content verticals.
2) It describes challenges like skill sparsity and outliers, and approaches used to estimate skill reputation scores and infer missing skills based on collaboration.
3) The conclusions are that text matching is not enough, and personalized learning-to-rank which considers semi-structured user data, behavior, and collaborative filtering is crucial for search.
This document provides an overview of Lucene scoring and sorting algorithms. It describes how Lucene constructs a Hits object to handle scoring and caching of search results. It explains that Lucene scores documents by calling the getScore() method on a Scorer object, which depends on the type of query. For boolean queries, it typically uses a BooleanScorer2. The scoring process advances through documents matching the query terms. Sorting requires additional memory to cache fields used for sorting.
Apache Lucene: Searching the Web and Everything Else (Jazoon07), by dnaber
Apache Lucene is a free and open-source search library that provides indexing and searching capabilities. The project includes Lucene Java, the core Java library; Solr, a search server that adds web-based administration and HTTP access; and Nutch, an open-source web crawler and search engine that crawls websites and indexes their content.
A search engine uses automated software programs called spiders that crawl the web to index pages and create a searchable database. When a user searches for keywords, the search engine software returns relevant results from the index. There are three main types of search engines - directories that are compiled by humans, hybrid engines that combine human and automated results, and meta search engines that search multiple other engines at once. Each search engine indexes pages differently and has a unique algorithm to determine search results.
Internet search engines like Google and Yahoo use programs called robots or spiders to search web pages for keywords and provide ranked search results. Google's search technology is based on PageRank, which analyzes links between websites to determine importance, while Yahoo uses its own Search Technology to analyze features of web pages like text and links. Both Google and Yahoo have large databases of web pages that are updated daily and can be accessed by anyone with an internet connection to search for information on a variety of topics.
This document provides an overview of search engines. It begins with an acknowledgement and then discusses what search engines are, their importance, and different types including crawler-based, directories, hybrid, and meta search engines. Examples are provided of popular search engines like Google and Yahoo. The document concludes with tips on how to effectively use search engines by leveraging operators like plus, minus, quotes, and OR.
The document discusses different types of search engines. It describes search engines as programs that use keywords to search websites and return relevant results. It provides examples of popular search engines like Google, Yahoo, and Ask.com. It also explains different types of search engines such as crawler-based, directory-based, specialty, hybrid, and meta search engines. Finally, it discusses how to effectively use search engines through techniques like being specific, using symbols like + and -, and using Boolean searches.
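The search operators mentioned above (plus, minus, quotes, OR) can be sketched in a few lines of Python. This is a simplified toy matcher under assumed semantics (`+term` required, `-term` excluded, quoted phrases exact, bare terms OR'ed), not any real engine's query parser:

```python
import re

def matches(query: str, doc: str) -> bool:
    """Check a document against a simple web-search query syntax:
    +term  -> term must appear
    -term  -> term must not appear
    "a b"  -> exact phrase must appear
    bare terms -> at least one must appear (OR semantics)"""
    text = doc.lower()
    phrases = re.findall(r'"([^"]+)"', query)
    rest = re.sub(r'"[^"]+"', '', query).split()
    required = [t[1:].lower() for t in rest if t.startswith('+')]
    excluded = [t[1:].lower() for t in rest if t.startswith('-')]
    optional = [t.lower() for t in rest if t[0] not in '+-']
    if any(p.lower() not in text for p in phrases):
        return False
    if any(t not in text for t in required):
        return False
    if any(t in text for t in excluded):
        return False
    return (not optional) or any(t in text for t in optional)
```

So `matches('+python -java "machine learning"', "Python machine learning tutorial")` holds, while any document containing "java" is rejected by the minus operator.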
Recent and Robust Query Auto-Completion - WWW 2014 Conference Presentation - stewhir
These are the presentation slides used for the WWW 2014 (Web Search) full paper: "Recent and Robust Query Auto-Completion".
The PDF full paper is available from: http://www.stewh.com/wp-content/uploads/2014/02/fp539-whiting.pdf
Semantic Need: Guiding Metadata Annotations by Questions People #ask - Hans-Joerg Happel
In its core, the Semantic Web is about the creation, collection and interlinking of metadata on which agents can perform tasks for human users. While many tools and approaches support either the creation or usage of semantic metadata, there is neither a proper notion of metadata need, nor a related theory of guidance about which metadata should be created. In this paper, we propose to analyze structured queries to help identify missing metadata. We conduct a study on Semantic MediaWiki (SMW), one of the most popular Semantic Web applications to date, analyzing structured "ask"-queries in public SMW instances. Based on that, we describe Semantic Need, an extension for SMW which guides contributors to provide semantic annotations, and summarize feedback from an online survey among 30 experienced SMW users.
Multimedia Answer Generation for Community Question Answering - SWAMI06
Community question answering (cQA) services have gained popularity over the past years. They not only allow community members to post and answer questions but also enable general users to seek information from a comprehensive set of well-answered questions. However, existing cQA forums usually provide only textual answers, which are not informative enough for many questions. In this paper, we propose a scheme that is able to enrich textual answers in cQA with appropriate media data. Our scheme consists of three components: answer medium selection, query generation for multimedia search, and multimedia data selection and presentation. This approach automatically determines which type of media information should be added for a textual answer. It then automatically collects data from the web to enrich the answer. By processing a large set of QA pairs and adding them to a pool, our approach can enable a novel multimedia question answering (MMQA) approach, as users can find multimedia answers by matching their questions with those in the pool. Different from many MMQA research efforts that attempt to directly answer questions with image and video data, our approach is built on community-contributed textual answers and thus is able to deal with more complex questions. We have conducted extensive experiments on a multi-source QA dataset. The results demonstrate the effectiveness of our approach.
Exploring Session Context using Distributed Representations of Queries and Re... - Bhaskar Mitra
Search logs contain examples of frequently occurring patterns of user reformulations of queries. Intuitively, the reformulation "san francisco" → "san francisco 49ers" is semantically similar to "detroit" →"detroit lions". Likewise, "london"→"things to do in london" and "new york"→"new york tourist attractions" can also be considered similar transitions in intent. The reformulation "movies" → "new movies" and "york" → "new york", however, are clearly different despite the lexical similarities in the two reformulations. In this paper, we study the distributed representation of queries learnt by deep neural network models, such as the Convolutional Latent Semantic Model, and show that they can be used to represent query reformulations as vectors. These reformulation vectors exhibit favourable properties such as mapping semantically and syntactically similar query changes closer in the embedding space. Our work is motivated by the success of continuous space language models in capturing relationships between words and their meanings using offset vectors. We demonstrate a way to extend the same intuition to represent query reformulations.
Furthermore, we show that the distributed representations of queries and reformulations are both useful for modelling session context for query prediction tasks, such as for query auto-completion (QAC) ranking. Our empirical study demonstrates that short-term (session) history context features based on these two representations improves the mean reciprocal rank (MRR) for the QAC ranking task by more than 10% over a supervised ranker baseline. Our results also show that by using features based on both these representations together we achieve a better performance, than either of them individually.
Paper: http://research.microsoft.com/apps/pubs/default.aspx?id=244728
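The offset-vector intuition in the abstract above can be shown with a toy sketch. The 2-d embeddings below are hand-picked hypothetical values (real CLSM embeddings are high-dimensional and learned); the point is only the geometry: a reformulation is the difference of the two query vectors, and semantically similar reformulations give similar difference vectors.

```python
import math

def offset(emb, a, b):
    """Reformulation vector: embedding(b) - embedding(a)."""
    return [y - x for x, y in zip(emb[a], emb[b])]

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))

# Toy 2-d embeddings (hypothetical values, just to illustrate the geometry):
emb = {
    "san francisco":       [1.0, 0.0],
    "san francisco 49ers": [1.0, 1.0],
    "detroit":             [0.9, 0.1],
    "detroit lions":       [0.9, 1.1],
    "movies":              [0.0, 0.5],
    "new movies":          [0.4, 0.5],
}

r1 = offset(emb, "san francisco", "san francisco 49ers")  # add "team" intent
r2 = offset(emb, "detroit", "detroit lions")              # same intent shift
r3 = offset(emb, "movies", "new movies")                  # a different change
# cosine(r1, r2) is 1.0 here, while cosine(r1, r3) is 0.0: similar intent
# shifts land close together in the embedding space, dissimilar ones do not.
```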
Beyond Collaborative Filtering: Learning to Rank Research Articles - Maya Hristakeva
This document provides an overview of Elsevier's work using machine learning techniques like collaborative filtering and learning to rank to improve article recommendations for researchers. It discusses using collaborative filtering on browsing logs to initially recommend related articles, then using learning to rank to re-rank those recommendations higher based on features like reputation, topics, citations and user engagement data. Evaluation shows the learning to rank approach improved user engagement by 9-10% over collaborative filtering alone. Future work may explore alternative approaches like graph-based methods or deep learning.
Taking the Pain out of Data Science - RecSys Machine Learning Framework Over ... - Sonya Liberman
1. The document discusses Outbrain's machine learning framework for personalized recommendations using Spark. It describes challenges in collecting and processing large datasets, building predictive models, and deploying models to production through A/B testing.
2. It outlines Outbrain's distributed machine learning framework for data collection, feature engineering, model training, evaluation, and deployment. Standardized model interfaces allow easy implementation of various algorithms and ensemble modeling.
3. The framework aims to streamline the research cycle and connect research to production through automated data preparation, simple model evaluation and simulation, and fast A/B testing and model updates in production.
Best Practices for Large Scale Text Mining Processing - Ontotext
Q&A:
NOW facilitates semantic search by having annotations attached to search strings. How complex does that get, e.g. with wildcards between annotated strings?
NOW’s searchbox is quite basic at the moment, but still supports a few scenarios.
1. Pure concept/faceted search - search for all documents containing a concept, or where a set of concepts co-occur. Ranking is based on frequency of occurrence.
2. Concept/faceted + full-text search - search for both concepts and a particular textual term or phrase.
3. Full text search
With search, pretty much anything can be done to customise it. For the NOW showcase we’ve kept it fairly simple, as usually every client has a slightly different case and wants to tune search in a slightly different direction.
The search in NOW is faceted which means that you search with concepts (facets) and you retrieve all documents which contain mentions of the searched concept. If you search by more than one facet the engine retrieves documents which contain mentions of both concepts but there is no restriction that they occur next to each other.
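The faceted search behaviour described in this answer can be sketched in a few lines. This is a hypothetical illustration of the semantics (all searched concepts must be mentioned, no adjacency required, ranking by frequency of occurrence), not NOW's actual implementation:

```python
from collections import Counter

def facet_search(docs, concepts):
    """docs maps doc-id -> list of concept annotations attached to the
    document.  Return the ids of documents mentioning ALL requested
    concepts (anywhere in the document, not necessarily adjacent),
    ranked by total frequency of occurrence of those concepts."""
    hits = []
    for doc_id, annotations in docs.items():
        counts = Counter(annotations)
        if all(counts[c] > 0 for c in concepts):
            hits.append((doc_id, sum(counts[c] for c in concepts)))
    return [d for d, _ in sorted(hits, key=lambda h: -h[1])]

docs = {
    "d1": ["Obama", "Merkel", "Obama"],   # both concepts, 3 mentions
    "d2": ["Merkel"],                     # missing "Obama" -> filtered out
    "d3": ["Obama", "Merkel"],            # both concepts, 2 mentions
}
# facet_search(docs, ["Obama", "Merkel"]) ranks d1 before d3.
```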
Is the tagging service expandable (say with custom ontologies)? also is it a something you offer as a service? it is unclear to me from the website.
The TAG service is used for demonstration purposes only; the models behind it are trained for annotating news articles. The pipeline is customizable for every concrete scenario, different domains, and entities of interest. You can access several of our pipelines as a service through the S4 platform, or you can have them hosted as an on-premise solution. In some cases our clients want domain adaptation, improvements in a particular area, or tagging with their internal dataset - in such cases we again offer an on-premise deployment, and also a managed service hosted on our hardware.
Does your system accommodate cluster analysis using unsupervised keyword/phrase annotation for knowledge discovery?
In so far as patterns of user behaviour also count as knowledge discovery, we employ these for suggesting related reads. Apart from that, we have experience tailoring custom clustering pipelines which also rely on features such as keywords and named entities.
For topic extraction, how many topics can we extract? What can we infer from a Twitter corpus?
For topic extraction we have determined that we obtain the best results when suggesting 3 categories. These are taken from IPTC, but only from the uppermost levels, which comprise fewer than 20 categories.
The twitter corpus example is from a project Ontotext participates in called Pheme. The goal of the project is to detect rumours and to check their veracity, thus help journalists in their hunt for attractive news.
Do you provide Processing Resources and JAPE rules for GATE framework and that can be used with GATE embedded?
We are contributing to the GATE framework, and everything which has been wrapped up as PRs has been included in the corresponding GATE distributions.
The document describes research on enhancing recommender systems through the use of user profiles and tagging systems. It discusses how user profiles can be used to provide personalized recommendations by describing a user's interests. It presents two research papers that studied how profile similarity and rating overlap between users can improve recommendation accuracy and user confidence. It also discusses how tagging systems can be leveraged by integrating user, tag, and resource dimensions. One paper proposes a personalized recommender model for folksonomies that extends the folksonomy by combining shared tags/resources and recommends tags and resources based on a user's profile and tagging history.
A Machine Learning Approach to SPARQL Query Performance Prediction - Rakebul Hasan
This document discusses a machine learning approach to predicting the performance of SPARQL queries over Linked Data without using statistics about the underlying RDF data. It extracts features from SPARQL queries to represent them for machine learning algorithms. The features include algebra features from the SPARQL query expressions and graph pattern features that model the query pattern. Experiments on DBpedia data show the approach can highly accurately predict execution times for common Linked Data queries by training machine learning models on previously executed queries. Future work may incorporate additional features like bandwidth and optimize queries for Linked Data applications and query processing.
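The feature-based prediction idea can be sketched as follows. The features and the nearest-neighbour predictor below are simplified illustrations of the general approach (the paper's actual algebra and graph-pattern features, and its learning algorithms, are richer):

```python
import re

def query_features(sparql: str):
    """Tiny algebra-feature sketch: counts of a few SPARQL keywords
    plus the number of variable mentions in the query string."""
    q = sparql.upper()
    return [
        q.count("OPTIONAL"),
        q.count("FILTER"),
        q.count("UNION"),
        len(re.findall(r"\?\w+", sparql)),  # variable mentions
    ]

def predict_time(features, history):
    """Predict execution time as that of the nearest previously
    executed query; `history` is a list of (features, seconds)."""
    def dist(u, v):
        return sum((x - y) ** 2 for x, y in zip(u, v))
    return min(history, key=lambda h: dist(h[0], features))[1]
```

For instance, `query_features("SELECT ?s WHERE { ?s ?p ?o . FILTER(?o > 5) }")` yields `[0, 1, 0, 5]`: no OPTIONAL, one FILTER, no UNION, and five variable mentions.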
Leveraging Dynamic Query Subtopics for Time-aware Search Result Diversification - Nattiya Kanhabua
Search result diversification is a common technique for tackling the problem of ambiguous and multi-faceted queries by maximizing query aspects or subtopics in a result list. In some special cases, subtopics associated to such queries can be temporally ambiguous, for instance, the query US Open is more likely to be targeting the tennis open in September, and the golf tournament in June. More precisely, users' search intent can be identified by the popularity of a subtopic with respect to the time where the query is issued. In this paper, we study search result diversification for time-sensitive queries, where the temporal dynamics of query subtopics are explicitly determined and modeled into result diversification. Unlike aforementioned work that, in general, considered only static subtopics, we leverage dynamic subtopics by analyzing two data sources (i.e., query logs and a document collection). By using these data sources, it provides the insights from different perspectives of how query subtopics change over time. Moreover, we propose novel time-aware diversification methods that leverage the identified dynamic subtopics. A key idea is to re-rank search results based on the freshness and popularity of subtopics. To this end, our experimental results show that the proposed methods can significantly improve the diversity and relevance effectiveness for time-sensitive queries in comparison with state-of-the-art methods.
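The key re-ranking idea — mixing relevance with the query-time popularity of each result's subtopic — can be sketched as below. This is a minimal illustration under assumed inputs (per-document relevance scores, a single subtopic label per document, and a popularity table), not the authors' actual diversification model:

```python
def time_aware_rerank(results, subtopic_popularity, lam=0.5):
    """results: list of (doc_id, relevance, subtopic).
    subtopic_popularity: subtopic -> popularity at query time, in [0, 1].
    Score interpolates static relevance with the current popularity of
    the subtopic the document covers; lam controls the trade-off."""
    def score(r):
        doc_id, rel, topic = r
        return (1 - lam) * rel + lam * subtopic_popularity.get(topic, 0.0)
    return sorted(results, key=score, reverse=True)
```

With the US Open example: in September the tennis subtopic is popular, so a tennis document can overtake a slightly more relevant golf document.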
Utilizing Marginal Net Utility for Recommendation in E-commerce - Liangjie Hong
The document discusses utilizing marginal net utility for recommendation in e-commerce. It proposes modeling user behavior based on marginal net utility to maximize the net utility for each user. It introduces a Cobb-Douglas utility function that captures diminishing returns. It then revamps existing SVD recommendation algorithms to estimate utility based on this new function. Experiments on a real e-commerce dataset show the new approach significantly outperforms baselines in recommending re-purchases and new products.
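The diminishing-returns behaviour of a Cobb-Douglas-style utility can be shown in a few lines. The function form and parameter values below are an illustrative assumption (the paper fits its utility parameters from data), but they capture the idea: each additional unit of the same product adds less utility, so the marginal net utility of a repeat purchase falls:

```python
def marginal_net_utility(a, b, price, k):
    """Cobb-Douglas-style utility u(k) = a * k**b with 0 < b < 1, so
    each additional unit adds less utility than the last.  Returns the
    net utility gained by buying the (k+1)-th unit at the given price."""
    def u(n):
        return a * n ** b
    return (u(k + 1) - u(k)) - price
```

A recommender built on this idea would rank candidate items by the marginal net utility of the user's *next* purchase, which naturally demotes items the user already owns many of.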
In this video from the ISC Big Data'14 Conference, Ted Willke from Intel presents: The Analytics Frontier of the Hadoop Eco-System.
"The Hadoop MapReduce framework grew out of an effort to make it easy to express and parallelize simple computations that were routinely performed at Google. It wasn’t long before libraries, like Apache Mahout, were developed to enable matrix factorization, clustering, regression, and other more complex analyses on Hadoop. Now, many of these libraries and their workloads are migrating to Apache Spark because it supports a wider class of applications than MapReduce and is more appropriate for iterative algorithms, interactive processing, and streaming applications. What’s next beyond Spark? Where is big data analytics processing headed? How will data scientists program these systems? In this talk, we will explore the current analytics frontier, the popular debates, and discuss some potentially clever additions. We will also share the emergent data science applications and collaborative university research that inform our thinking."
Learn more:
http://www.isc-events.com/bigdata14/schedule.html
and
http://www.intel.com/content/www/us/en/software/intel-graph-solutions.html
Watch the video presentation: https://www.youtube.com/watch?v=qlfx495Ekw0
Tangram: Distributed Scheduling Framework for Apache Spark at Facebook - Databricks
Tangram is a state-of-the-art resource allocator and distributed scheduling framework for Spark at Facebook, with hierarchical queues and a resource-based container abstraction. We support scheduling and resource management for a significant portion of Facebook's data warehouse and machine learning workloads, which equates to running millions of jobs across several clusters with tens of thousands of machines. In this talk, we will describe Tangram's architecture, discuss Facebook's need for a custom scheduler, and explain how Tangram schedules Spark workloads at scale. We will specifically focus on several important features around improving Spark's efficiency, usability and reliability: 1. IO-rebalancer (Tetris) support 2. User-fairness queueing 3. Heuristic-based backfill scheduling optimizations.
Slides from Enterprise Search & Analytics Meetup @ Cisco Systems - http://www.meetup.com/Enterprise-Search-and-Analytics-Meetup/events/220742081/
Relevancy and Search Quality Analysis - By Mark David and Avi Rappoport
The Manifold Path to Search Quality
To achieve accurate search results, we must come to an understanding of the three pillars involved.
1. Understand your data
2. Understand your customers’ intent
3. Understand your search engine
The first path passes through Data Analysis and Text Processing.
The second passes through Query Processing, Log Analysis, and Result Presentation.
Everything learned from those explorations feeds into the final path of Relevancy Ranking.
Search quality is focused on end users finding what they want -- technical relevance is sometimes irrelevant! Working with the short head (very frequent queries) has the most return on investment for improving the search experience: tuning the results, for example, to emphasize recent documents or de-emphasize archive documents, near-duplicate detection, exposing diverse results in ambiguous situations, using synonyms, and guiding search via best bets and auto-suggest. Long-tail analysis can reveal user intent by detecting patterns, discovering related terms, and identifying the most fruitful results through aggregated behavior. All this feeds back into regression testing, which provides reliable metrics to evaluate the changes.
By merging these insights, you can improve the quality of the search overall, in a scalable and maintainable fashion.
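One metric commonly used for this kind of regression testing is mean reciprocal rank (MRR), the same measure reported in the session-context QAC work earlier in this list. A minimal sketch:

```python
def mean_reciprocal_rank(results_per_query):
    """results_per_query: for each test query, a pair of
    (ranked list of returned doc ids, set of relevant doc ids).
    MRR averages 1/rank of the first relevant result per query
    (0 when no relevant result is returned)."""
    total = 0.0
    for ranked, relevant in results_per_query:
        for i, doc in enumerate(ranked, start=1):
            if doc in relevant:
                total += 1.0 / i
                break
    return total / len(results_per_query)
```

For two test queries where the first relevant results appear at ranks 2 and 1, MRR is (1/2 + 1/1) / 2 = 0.75; tracking this number across ranking changes gives the reliable before/after comparison the deck calls for.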
Enhancing Information Retrieval by Personalization Techniques - veningstonk
This document outlines the research modules proposed for a PhD thesis focused on enhancing information retrieval through personalization techniques. The research will include four modules: 1) enhancing retrieval using term association graph representation, 2) integrating document and user topic models for personalization, 3) using genetic algorithms for document re-ranking, and 4) employing ant colony optimization for query reformulation. Module 1 will represent documents as a term graph and use the graph to re-rank documents based on term associations. The methodology for Module 1 includes preprocessing, frequent itemset mining to construct the term graph, and approaches for ranking documents based on semantic associations in the graph.
This tutorial gives an overview of how search engines and machine learning techniques can be tightly coupled to address the need for building scalable recommender or other prediction-based systems. Typically, most of them architect retrieval and prediction in two phases. In Phase I, a search engine returns the top-k results based on constraints expressed as a query. In Phase II, the top-k results are re-ranked in another system according to an optimization function that uses a supervised trained model. However this approach presents several issues, such as the possibility of returning sub-optimal results due to the top-k limit at query time, as well as the presence of inefficiencies in the system due to the decoupling of retrieval and ranking.
To address this issue the authors created ML-Scoring, an open source framework that tightly integrates machine learning models into Elasticsearch, a popular search engine. ML-Scoring replaces the default information retrieval ranking function with a custom supervised model that is trained through Spark, Weka, or R that is loaded as a plugin in Elasticsearch. This tutorial will not only review basic methods in information retrieval and machine learning, but it will also walk through practical examples from loading a dataset into Elasticsearch to training a model in Spark, Weka, or R, to creating the ML-Scoring plugin for Elasticsearch. No prior experience is required in any system listed (Elasticsearch, Spark, Weka, R), though some programming experience is recommended.
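The sub-optimality of the two-phase architecture can be demonstrated with a tiny sketch (hypothetical scores, not ML-Scoring itself): a document the trained model would rank first can be cut in Phase I because its IR score misses the top-k.

```python
def two_phase(docs, ir_score, model_score, k):
    """Phase I: the engine keeps only the top-k docs by IR score.
    Phase II: an external model re-ranks that shortlist."""
    shortlist = sorted(docs, key=ir_score, reverse=True)[:k]
    return sorted(shortlist, key=model_score, reverse=True)

def integrated(docs, model_score):
    """ML-Scoring-style single phase: the model scores every candidate
    inside the engine, so nothing is cut before being scored."""
    return sorted(docs, key=model_score, reverse=True)
```

With IR scores favouring d1..d4 in that order but the model strongly preferring d4, a two-phase pipeline with k=2 never even sees d4, while the integrated ranker returns it first. This is exactly the failure mode that motivates loading the model as a plugin into the engine's ranking function.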
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning... - S. Diana Hu
Search engines have focused on solving the document retrieval problem, so their scoring functions do not naturally handle non-traditional IR data types, such as numerical or categorical fields. Therefore, on domains beyond traditional search, scores representing strengths of associations or matches may vary widely. As such, the original model doesn't suffice, so relevance ranking is performed as a two-phase approach: 1) regular search, 2) an external model to re-rank the filtered items. Metrics such as click-through and conversion rates are associated with the users' response to items served. The predicted selection rates that arise in real time can be critical for optimal matching. For example, in recommender systems, the predicted performance of a recommended item in a given context, also called response prediction, is often used in determining a set of recommendations to serve in relation to a given serving opportunity. Similar techniques are used in the advertising domain. To address this issue the authors have created ML-Scoring, an open source framework that tightly integrates machine learning models into a popular search engine (Solr/Elasticsearch), replacing the default IR-based ranking function. A custom model is trained through either Weka or Spark and is loaded as a plugin used at query time to compute custom scores.
MUDROD - Mining and Utilizing Dataset Relevancy from Oceanographic Dataset Me... - Yongyao Jiang
MUDROD - Mining and Utilizing Dataset Relevancy from Oceanographic Dataset Metadata, Usage Metrics, and User Feedback to Improve Data Discovery and Access
Building High Available and Scalable Machine Learning Applications - Yalçın Yenigün
This document discusses building highly available and scalable machine learning products. It begins with an introduction to data-driven products and machine learning concepts like supervised and unsupervised learning. It then discusses six key challenges in building machine learning products at iyzico: 1) models need testing on real data before production, 2) response times must be under 0.1 seconds, 3) data is dynamic, 4) high availability and fail-fast behavior are required, 5) continuous delivery of machine learning models, and 6) simulating aggregated features from batch data. It provides examples of techniques used at iyzico to address these challenges like Spark for predictions, schemaless databases, circuit breakers, devops for machine learning, and Redis for
Covers key concepts of clickstream analysis and Markov Chains. Followed by 3 practical applications with the R language:
- Frequent path analysis
- Future click prediction
- Transition probabilities mapping
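The talk's examples are in R, but the core Markov-chain idea — estimating transition probabilities from clickstreams and predicting the next click — can be sketched in a few lines of Python (a generic illustration, not the talk's code):

```python
from collections import Counter, defaultdict

def transition_probs(sessions):
    """Estimate first-order Markov transition probabilities from
    clickstream sessions (each session is a list of page ids)."""
    counts = defaultdict(Counter)
    for session in sessions:
        for a, b in zip(session, session[1:]):
            counts[a][b] += 1
    return {a: {b: n / sum(c.values()) for b, n in c.items()}
            for a, c in counts.items()}

def predict_next(probs, page):
    """Future click prediction: the most likely next page."""
    return max(probs[page], key=probs[page].get)
```

Given three sessions all starting `home -> search`, with `search` followed by `product` twice and `cart` once, the estimated P(search -> product) is 2/3 and the predicted next click after `search` is `product`.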
Comparative analysis between traditional aquaponics and reconstructed aquapon... - bijceesjournal
The aquaponic system of planting is a soilless cultivation method: it needs only water, fish, lava rocks (a substitute for soil), and plants. Aquaponic systems are sustainable and environmentally friendly. They not only make it possible to plant in small spaces but also help reduce artificial chemical use and minimize excess water use, as aquaponics consumes 90% less water than soil-based gardening. The study applied a descriptive and experimental design to assess and compare traditional and reconstructed aquaponic methods for propagating tomatoes. The researchers created an observation checklist to determine the significant factors of the study. The study aims to determine the significant difference between the traditional and reconstructed aquaponics systems in propagating tomatoes in terms of height, weight, girth, and number of fruits. The reconstructed aquaponics system's higher growth yield results in a much more nourished crop than the traditional system: it is superior in number of fruits, height, weight, and girth. Moreover, the reconstructed aquaponics system is proven to eliminate all the hindrances present in the traditional system, namely overcrowding of fish, algae growth, pest problems, contaminated water, and dead fish.
Design and optimization of ion propulsion drone - bjmsejournal
Electric propulsion technology has been widely used in many kinds of vehicles in recent years, and aircraft are no exception. UAVs are typically electrically propelled but tend to produce a significant amount of noise and vibration. Ion propulsion technology for drones is a potential solution to this problem, and it has been proven feasible in the earth's atmosphere. The study presented in this article shows the design of EHD thrusters and the power supply for an ion propulsion drone, along with performance optimization of the high-voltage power supply for endurance in the earth's atmosphere.
Rainfall intensity duration frequency curve statistical analysis and modeling... - bijceesjournal
Using data from 41 years (1981−2020) in Patna, India, the study's goal is to analyze the trends of how often it rains on a weekly, seasonal, and annual basis. First, the historical rainfall data set for Patna over the 41-year period was evaluated for its quality by statistically analyzing rainfall using the intensity-duration-frequency (IDF) curve and its relationships. Changes in the hydrologic cycle as a result of increased greenhouse gas emissions are expected to induce variations in the intensity, length, and frequency of precipitation events. One strategy to lessen vulnerability is to quantify probable changes and adapt to them. Techniques such as the log-normal, normal, and Gumbel (EV-I) distributions are used. Distributions were created with durations of 1, 2, 3, 6, and 24 h and return periods of 2, 5, 10, 25, and 100 years. Mathematical correlations between rainfall and recurrence interval were also derived.
Findings: The Gumbel approach produced the highest intensity values, whereas the other approaches produced values that were close to each other. The data indicate that 461.9 mm of rain fell during the monsoon season's 301st week. However, the 29th week had the greatest average rainfall, at 92.6 mm. With 952.6 mm on average, the monsoon season saw the highest rainfall; yearly rainfall averaged 1171.1 mm. Using Weibull's method, the study was subsequently expanded to examine rainfall distribution at recurrence intervals of 2, 5, 10, and 25 years, and mathematical correlations between rainfall and recurrence interval were developed. Further regression analysis revealed that short-wave irradiation, wind direction, wind speed, pressure, relative humidity, and temperature all had a substantial influence on rainfall.
Originality and value: The results of the rainfall IDF curves can provide useful information to policymakers in making appropriate decisions in managing and minimizing floods in the study area.
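The Gumbel (EV-I) fit used above can be sketched with the standard frequency-factor form x_T = mean + K_T * stdev. The sample values below are hypothetical, not the Patna data; the point is that the estimated depth grows with the return period T:

```python
import math
import statistics

def gumbel_quantile(sample, T):
    """EV-I (Gumbel) estimate of the event magnitude with return
    period T years, using the frequency-factor form
        x_T = mean + K_T * stdev,
        K_T = -(sqrt(6)/pi) * (0.5772 + ln(ln(T / (T - 1)))),
    where 0.5772 is the Euler-Mascheroni constant."""
    k = -(math.sqrt(6) / math.pi) * (0.5772 + math.log(math.log(T / (T - 1))))
    return statistics.mean(sample) + k * statistics.stdev(sample)
```

Evaluating the same sample at T = 2, 10, and 100 years gives strictly increasing estimates, which is what an IDF curve encodes along its return-period axis.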
Batteries: Introduction, types of batteries, discharging and charging of a battery, characteristics of a battery, battery rating, various tests on a battery. Primary battery: silver button cell. Secondary battery: Ni-Cd battery. Modern battery: lithium-ion battery. Maintenance of batteries; choice of batteries for electric vehicle applications.
Fuel Cells: Introduction, importance and classification of fuel cells; description, principle, components, and applications of fuel cells: H2-O2 fuel cell, alkaline fuel cell, molten carbonate fuel cell, and direct methanol fuel cells.
artificial intelligence and data science contents.pptx - GauravCar
What is artificial intelligence? Artificial intelligence is the ability of a computer or computer-controlled robot to perform tasks that are commonly associated with the intellectual processes characteristic of humans, such as the ability to reason.
Discover the latest insights on Data Driven Maintenance with our comprehensive webinar presentation. Learn about traditional maintenance challenges, the right approach to utilizing data, and the benefits of adopting a Data Driven Maintenance strategy. Explore real-world examples, industry best practices, and innovative solutions like FMECA and the D3M model. This presentation, led by expert Jules Oudmans, is essential for asset owners looking to optimize their maintenance processes and leverage digital technologies for improved efficiency and performance. Download now to stay ahead in the evolving maintenance landscape.
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw... - IJECEIAES
Medical image analysis has witnessed significant advancements with deep learning techniques. In the domain of brain tumor segmentation, the ability to precisely delineate tumor boundaries from magnetic resonance imaging (MRI) scans holds profound implications for diagnosis. This study presents an ensemble convolutional neural network (CNN) with transfer learning, integrating the state-of-the-art Deeplabv3+ architecture with the ResNet18 backbone. The model is rigorously trained and evaluated, exhibiting remarkable performance metrics, including an impressive global accuracy of 99.286%, a high class accuracy of 82.191%, a mean intersection over union (IoU) of 79.900%, a weighted IoU of 98.620%, and a Boundary F1 (BF) score of 83.303%. Notably, a detailed comparative analysis with existing methods showcases the superiority of our proposed model. These findings underscore the model's competence in precise brain tumor localization, underscoring its potential to revolutionize medical image analysis and enhance healthcare outcomes. This research paves the way for future exploration and optimization of advanced CNN models in medical imaging, emphasizing addressing false positives and resource efficiency.
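The IoU metric central to the abstract above is simple to state. A minimal sketch for binary masks (flattened to 0/1 sequences; real evaluation runs over 2-D or 3-D MRI masks, per class):

```python
def iou(pred, truth):
    """Intersection over union of two binary masks given as equal-length
    sequences of 0/1 values: |pred AND truth| / |pred OR truth|.
    An empty union (both masks all zero) is scored as a perfect 1.0."""
    inter = sum(p & t for p, t in zip(pred, truth))
    union = sum(p | t for p, t in zip(pred, truth))
    return inter / union if union else 1.0
```

For example, masks [1,1,0,0] and [0,1,1,0] overlap on one pixel out of three covered, giving IoU = 1/3; mean IoU averages this per-class score across classes.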
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024 - Sinan KOZAK
Sinan from the Delivery Hero mobile infrastructure engineering team gives a deep dive into performance acceleration with Gradle build-cache optimizations, sharing the team's journey of solving complex build-cache problems that affect Gradle builds; by walking through the challenges and solutions found along the way, the talk demonstrates what is possible for faster builds. The case study reveals how overlapping outputs and cache misconfigurations led to significant increases in build times, especially as the project scaled up with numerous modules using Paparazzi tests. The journey from diagnosing to defeating cache issues offers invaluable lessons on maintaining cache integrity without sacrificing functionality.
Software Engineering and Project Management - Introduction, Modeling Concepts... - Prakhyath Rai
Introduction, Modeling Concepts and Class Modeling: What is object orientation? What is OO development? OO themes; evidence for usefulness of OO development; OO modeling history. Modeling as design technique: modeling, abstraction, the three models. Class Modeling: object and class concepts, link and association concepts, generalization and inheritance, a sample class model, navigation of class models, and UML diagrams.
Building the Analysis Models: Requirement Analysis, Analysis Model Approaches, Data modeling Concepts, Object Oriented Analysis, Scenario-Based Modeling, Flow-Oriented Modeling, class Based Modeling, Creating a Behavioral Model.
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...shadow0702a
This document serves as a comprehensive step-by-step guide on how to effectively use PyCharm for remote debugging of the Windows Subsystem for Linux (WSL) on a local Windows machine. It meticulously outlines several critical steps in the process, starting with the crucial task of enabling permissions, followed by the installation and configuration of WSL.
The guide then proceeds to explain how to set up the SSH service within the WSL environment, an integral part of the process. Alongside this, it also provides detailed instructions on how to modify the inbound rules of the Windows firewall to facilitate the process, ensuring that there are no connectivity issues that could potentially hinder the debugging process.
The document further emphasizes on the importance of checking the connection between the Windows and WSL environments, providing instructions on how to ensure that the connection is optimal and ready for remote debugging.
It also offers an in-depth guide on how to configure the WSL interpreter and files within the PyCharm environment. This is essential for ensuring that the debugging process is set up correctly and that the program can be run effectively within the WSL terminal.
Additionally, the document provides guidance on how to set up breakpoints for debugging, a fundamental aspect of the debugging process which allows the developer to stop the execution of their code at certain points and inspect their program at those stages.
Finally, the document concludes by providing a link to a reference blog. This blog offers additional information and guidance on configuring the remote Python interpreter in PyCharm, providing the reader with a well-rounded understanding of the process.
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...
Tutorial on query auto-completion
1. Tutorial on Query Auto-Completion
Yichen Feng
feng36 AT illinois DOT edu
University of Illinois at Urbana-Champaign
Prepared as an assignment for CS410: Text Information Systems in Spring 2016
2. Query Auto-Completion
• What is Query Auto-Completion (QAC)?
– Giving search suggestions for a typed prefix by considering the search history log, query popularity, temporal factors, and personal interests.
3. Why QAC Is Important
• Speeds up users' input, improving efficiency
• Suggests possible queries
• Corrects users' typing errors
• Users may not know how to describe the information they need
• Speed and accuracy
• Minimizes users' cognitive and physical effort
5. Most Popular Completion
• Traditional QAC: Most Popular Completion (MPC)
– Queries are suggested based on their past popularity (Mawarkar and Malemath, 2015)
– Ranked by the queries' frequency of occurrence
– Data structure: trie
– $\mathrm{MPC}(\mathcal{P}) = \arg\max_{q \in C(\mathcal{P})} w(q)$, where $w(q) = \frac{f(q)}{\sum_{i \in \mathcal{Q}} f(i)}$
– Usually treated as the baseline
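The MPC ranking above can be sketched in a few lines: store logged queries in a trie, then return the completions of a prefix ordered by their normalized frequency $w(q)$. This is a minimal illustration, not the implementation from any of the cited papers; the query log and counts are made up.

```python
# Minimal sketch of Most Popular Completion (MPC) over a trie.
# Queries and counts below are hypothetical examples.
from collections import defaultdict

class TrieNode:
    def __init__(self):
        self.children = defaultdict(TrieNode)
        self.freq = 0          # > 0 only for nodes that end a logged query

class MPCTrie:
    def __init__(self):
        self.root = TrieNode()
        self.total = 0         # total query count, the normalizer in w(q)

    def add(self, query, count=1):
        node = self.root
        for ch in query:
            node = node.children[ch]
        node.freq += count
        self.total += count

    def complete(self, prefix, k=3):
        """Return the k most popular completions of `prefix`."""
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return []
            node = node.children[ch]
        results = []
        # Depth-first walk of the subtrie rooted at the prefix.
        stack = [(prefix, node)]
        while stack:
            q, n = stack.pop()
            if n.freq:
                results.append((n.freq / self.total, q))
            stack.extend((q + c, child) for c, child in n.children.items())
        return [q for _, q in sorted(results, reverse=True)[:k]]

trie = MPCTrie()
for q, c in [("disney", 50), ("dictionary", 30), ("discount", 20)]:
    trie.add(q, c)
print(trie.complete("di"))   # most popular completion first
```

Production systems typically precompute the top-k list per trie node instead of walking the subtrie at query time, but the ranking criterion is the same.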
6. QAC Challenges
• Cannot catch popular temporal topics
• Cannot treat different users differently
• Cannot interact with users' behaviors (e.g. clicks)
• Poor performance on mobile devices
• Needs to be optimized
7. Solutions
• Time-sensitive QAC
– Robust vs. recent
• Personalized QAC
– User behaviors
– Context-based QAC
• Time-sensitive personalized QAC (hybrid model)
• Optimizing search results presentation
• Term-by-term QAC for mobile search
• QAC for rare prefixes
8. Time-Sensitive QAC (SIGIR ’12)
• Time-sensitive: query popularity changes over time
– "di-": "dictionary" on weekdays, "disney" on weekends
• Key idea:
– Predicting query popularity
• Forecast quality
• Success & failure analysis
• Temporal model selector
– Rely on shorter but frequent aggregations of data; model the overall query trends with time series.
• Method: time-sensitive auto-completion
– $TS(\mathcal{P}, t) = \arg\max_{q \in C(\mathcal{P})} w(q \mid t)$, where $w(q \mid t) = \frac{y_t(q)}{\sum_{i \in \mathcal{Q}} y_t(i)}$
– $y_t(q)$: estimated frequency of query $q$ at time $t$
M. Shokouhi and K. Radinsky. Time-sensitive query auto-completion. In SIGIR ’12, pages 601–610, 2012.
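The scoring rule $w(q \mid t)$ can be illustrated with a toy forecaster: rank completions by a predicted frequency $y_t(q)$ instead of raw historical counts. Single exponential smoothing here is merely a stand-in for the time-series models the paper selects among, and the daily counts are invented.

```python
# Sketch of time-sensitive scoring w(q|t): rank by predicted frequency
# y_t(q). Exponential smoothing is an illustrative stand-in for the
# paper's time-series models; the daily counts are made up.

def forecast(counts, alpha=0.5):
    """One-step-ahead forecast via single exponential smoothing."""
    level = counts[0]
    for c in counts[1:]:
        level = alpha * c + (1 - alpha) * level
    return level

def ts_rank(candidates, history):
    """candidates: queries; history: query -> list of daily counts."""
    y = {q: forecast(history[q]) for q in candidates}
    z = sum(y.values()) or 1.0            # normalizer over all candidates
    return sorted(candidates, key=lambda q: y[q] / z, reverse=True)

history = {
    "dictionary": [40, 38, 36, 35, 33],   # steady weekday decline
    "disney":     [10, 15, 25, 40, 70],   # weekend spike building up
}
print(ts_rank(["dictionary", "disney"], history))
```

A plain MPC ranking over the same five days would still put "dictionary" first (182 vs. 160 total); the forecast flips the order because it weights the recent trend.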
9. TS QAC – Recent vs. Robust (WWW ’14)
• QAC needs to rank both consistently and recently popular queries well
• Motivation: find the optimal trade-off between recency and robustness to achieve better QAC
• Key idea:
– An optimal trade-off can be found
– Each query log scenario has different temporal characteristics
• Approaches:
– Based on past popularity distributions
• Maximum Likelihood Estimation, Recent Maximum Likelihood Estimation, Last N Query Distribution
– Based on short-range predicted query popularity
• Predicted Next N Query Distribution
– Meta approach – optimize the parameters of the above approaches
• Online Parameter Learning
S. Whiting, J. McMinn, and J. Jose. Exploring real-time temporal query auto-completion. In DIR Workshop ’13, pages 12–15.
10. Personalized QAC (SIGIR ’13)
• QAC should suggest differently for different people by considering their own interests
• Motivation: query likelihoods vary drastically between different demographic groups [Weber and Castillo, 2010] and individuals [Teevan et al., 2011]
• Key idea:
– Features based on: user's age, gender, location, and short- and long-term history
– Novel supervised framework for learning to personalize QAC
• Method:
– Similar labelling strategy
• Evaluated using Mean Reciprocal Rank (MRR)
– Learning to rank
• LambdaMART algorithm (boosted decision trees)
• Location is particularly effective
M. Shokouhi. Learning to personalize query auto-completion. In SIGIR ’13, 2013.
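Several of the papers in this tutorial evaluate with Mean Reciprocal Rank (MRR): for each prefix, score 1/rank of the query the user actually submitted (0 if it is missing from the suggestion list), then average over prefixes. A small sketch, with hypothetical rankings:

```python
# Mean Reciprocal Rank (MRR) for QAC evaluation.
# The suggestion lists and submitted queries are hypothetical.

def mrr(ranked_lists, submitted):
    total = 0.0
    for ranking, target in zip(ranked_lists, submitted):
        rr = 0.0
        for rank, q in enumerate(ranking, start=1):
            if q == target:
                rr = 1.0 / rank      # reciprocal rank of the true query
                break                # 0 if the true query never appears
        total += rr
    return total / len(submitted)

rankings = [
    ["facebook", "face swap", "facetime"],  # user submitted "facetime"
    ["gmail", "google maps"],               # user submitted "gmail"
]
print(mrr(rankings, ["facetime", "gmail"]))  # (1/3 + 1/1) / 2
```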
11. Personalized QAC – Context-Based (IJARCET 2015)
• The query auto-completer tries to accurately predict what the user is typing
• Objective: improve search quality by predicting the user's query based on context
• Key idea:
– Context
• Query similarity
• User's recent click-throughs
• Current location and time
• Keywords and sessions
• Method:
– Most Popular Completion
• Works well when the context is empty
– Nearest Completion
• Works well when context exists, terrible when the context is empty
– Hybrid Completion
• Combines both MPC and NC
V. Mawarkar and V. Malemath. Context Based Query Auto-Completion. In IJARCET, Volume 4, Issue 6, June 2015.
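The hybrid idea above can be sketched as a score blend: weight the MPC popularity against a context score (similarity between a candidate and the user's recent queries), falling back to pure MPC when the context is empty. The Jaccard similarity, the blend weight, and the toy data are all assumptions for illustration, not the paper's exact formulation.

```python
# Sketch of Hybrid Completion: blend MPC popularity with a context
# similarity score; with no context it degenerates to plain MPC.
# Similarity measure, weight lam, and data are illustrative assumptions.

def jaccard(a, b):
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def hybrid_rank(candidates, popularity, context, lam=0.5):
    """popularity: query -> MPC weight; context: recent queries (may be empty)."""
    def score(q):
        pop = popularity.get(q, 0.0)
        if not context:
            return pop                       # empty context: plain MPC
        ctx = max(jaccard(q, c) for c in context)
        return lam * pop + (1 - lam) * ctx   # Nearest-Completion-style blend
    return sorted(candidates, key=score, reverse=True)

pop = {"new york times": 0.6, "new york weather": 0.4}
print(hybrid_rank(list(pop), pop, context=["weather forecast"]))
print(hybrid_rank(list(pop), pop, context=[]))
```

With the recent query "weather forecast", the less popular candidate wins; with no context, the MPC order is restored.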
12. Context-Based HCA (IJARCET 2015)
V. Mawarkar and V. Malemath. Context Based Query Auto-Completion. In IJARCET, Volume 4, Issue 6, June 2015.
13. Personalized QAC – User Behaviors (SIGIR ’14)
• Objective: explain users' interaction data to further improve QAC performance
• Contributions:
– First high-resolution QAC query log
• Records every keystroke, enabling further analysis and understanding
– Horizontal skipping bias
• First introduced here; unique to QAC
– Vertical position bias
– Two-dimensional click model
• Models users' behavior on PCs and mobile devices
Y. Li, A. Dong, H. Wang, H. Deng, Y. Chang, and C. Zhai. A Two-dimensional Click Model for Query Auto-completion. In SIGIR ’14, 2014.
14. Two-Dimensional Click Model (SIGIR ’14)
– H model
– D model
Y. Li, A. Dong, H. Wang, H. Deng, Y. Chang, and C. Zhai. A Two-dimensional Click Model for Query Auto-completion. In SIGIR ’14, 2014.
15. Time-Sensitive Personalized QAC (CIKM ’14)
• Key idea:
– Hybrid model
• Time-sensitivity
• Personalization
– Optimal time window
• Achieves better prediction
• Contributions:
– Novel hybrid model
– New query popularity prediction method
• Ranking evaluated with Mean Reciprocal Rank (MRR)
– Effectiveness analysis
• Significantly outperforms state-of-the-art time-sensitive QAC
F. Cai, S. Liang, and M. de Rijke. Time-sensitive Personalized Query Auto-completion. In CIKM ’14, 2014.
16. TSP QAC Performance (CIKM ’14)
• Trade-off between recency and periodicity
– Parameter settings are critical for accuracy
• Baseline checks
– Marginally outperforms baselines when features are not strongly differentiating
– More effective with a longer prefix
– The available evidence matters
• Better QAC ranking with
– Sufficient personal queries
– Time-sensitive popularity
F. Cai, S. Liang, and M. de Rijke. Time-sensitive Personalized Query Auto-completion. In CIKM ’14, 2014.
17. Presenting Optimized Search Results (WSDM ’16)
• Objective:
– Selectively present query suggestions based on a probabilistic model to achieve optimized search results presentation
• Key ideas:
– Too many query suggestions are time-consuming to scan
– Measure the users' time loss
– Patient users benefit more
• Challenges:
– Uncertain factors (e.g. intent, query suggestion click probabilities)
– Unclear how long users spend scanning
M. P. Kato and K. Tanaka. To Suggest, or Not to Suggest for Queries with Diverse Intents: Optimizing Search Result Presentation. In WSDM ’16, 2016.
18. Presenting Optimized Search Results (WSDM ’16)
• Contributions:
– Searcher model
• Interacts with query suggestions
• Accounts for users' multiple intents
– Optimizing Search Results Presentation (OSRP)
• Mainly focuses on ambiguous or underspecified queries
– Examined the effects of query suggestions on search behaviors
• Conducted a user survey
– Effectiveness of OSRP
• Patient users
• Queries with a limited number of intents
M. P. Kato and K. Tanaka. To Suggest, or Not to Suggest for Queries with Diverse Intents: Optimizing Search Result Presentation. In WSDM ’16, 2016.
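The "to suggest, or not to suggest" trade-off can be caricatured as an expected-time calculation: show suggestions only when the expected time saved by a click outweighs the expected cost of scanning the list. This toy decision rule, its probabilities, and its timing constants are all hypothetical; the paper's searcher model is considerably richer.

```python
# Toy decision rule inspired by OSRP: suggest only when the expected
# time saved exceeds the expected scanning cost. All numbers are
# hypothetical assumptions, not values from the paper.

def should_suggest(click_probs, scan_cost=0.4, saving=6.0):
    """click_probs[i]: probability the user clicks suggestion i.
    scan_cost: seconds to scan one suggestion; saving: seconds a click
    saves versus typing and reformulating the query by hand."""
    expected_saving = sum(p * saving for p in click_probs)
    expected_cost = len(click_probs) * scan_cost
    return expected_saving > expected_cost

print(should_suggest([0.30, 0.20, 0.05]))  # likely clicks: worth showing
print(should_suggest([0.02, 0.01, 0.01]))  # scanning cost dominates
```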
19. User Survey (WSDM ’16)
M. P. Kato and K. Tanaka. To Suggest, or Not to Suggest for Queries with Diverse Intents: Optimizing Search Result Presentation. In WSDM ’16, 2016.
SERP (M. P. Kato and K. Tanaka)
20. Term-by-Term QAC for Mobile Search (WSDM ’16)
• Objective:
– Specialized QAC for mobile search
• Mobile input:
– Small screen → term-by-term QAC
– Slower input → high-quality QAC
– Clumsier editing → QAC matters more than on PC
• Key idea:
– Faster exploration of suggestions
– Fits text editing on mobile devices
S. Vargas, R. Blanco, and P. Mika. Term-by-Term Query Auto-Completion for Mobile Search. In WSDM ’16, 2016.
21. Query-Term Graph (WSDM ’16)
– Built from previously submitted queries
– An efficient way of
• Storing
• Retrieving
S. Vargas, R. Blanco, and P. Mika. Term-by-Term Query Auto-Completion for Mobile Search. In WSDM ’16, 2016.
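One simple way to picture a query-term graph: nodes are terms, and a weighted edge t1 → t2 counts how often t2 follows t1 in logged queries; once the user commits a term, the next term is suggested from the strongest outgoing edges. This sketch and its toy log are assumptions for illustration, not the paper's exact construction.

```python
# Sketch of a query-term graph for term-by-term QAC.
# Edge weight (t1 -> t2) = how often t2 follows t1 in the query log.
# The query log is a made-up example.
from collections import defaultdict

def build_term_graph(query_log):
    graph = defaultdict(lambda: defaultdict(int))
    for query in query_log:
        terms = query.split()
        for t1, t2 in zip(terms, terms[1:]):
            graph[t1][t2] += 1
    return graph

def next_terms(graph, term, k=2):
    """Suggest the k most frequent continuations of a committed term."""
    followers = graph.get(term, {})
    return sorted(followers, key=followers.get, reverse=True)[:k]

log = ["new york weather", "new york times", "new york weather radar",
       "weather radar"]
g = build_term_graph(log)
print(next_terms(g, "york"))   # most frequent continuations of "york"
```

Compared with a query-level trie, the graph stores each term once and can compose suggestions for term sequences never seen verbatim in the log, which is the storage/retrieval efficiency the slide alludes to.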
22. QAC for Rare Prefixes (CIKM ’15)
• Motivation: QAC fails when the prefix is sufficiently rare
• Key ideas:
– A supervised model ranks synthetic suggestions
– Candidates generated by mining query suffixes
– Exploring new ranking signals
• Query n-gram statistics
• Deep convolutional latent semantic model (CLSM)
B. Mitra and N. Craswell. Query Auto-Completion for Rare Prefixes. In CIKM ’15, 2015.
23. Model and Features (CIKM ’15)
• LambdaMART model:
– Ranking using the features below
• N-gram based features
– Model the likelihood that a candidate suggestion is generated by the same language model as the queries in the search logs
• CLSM based features
– Based on click-through data
– Effective for modelling query-document relevance
– Trained on a dataset of prefix-suffix pairs
B. Mitra and N. Craswell. Query Auto-Completion for Rare Prefixes. In CIKM ’15, 2015.
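The n-gram feature idea can be sketched as scoring a synthetic suggestion by the log-probability a query-log language model assigns it. A bigram model with add-one smoothing is an assumption here for compactness (the paper uses a richer set of n-gram statistics), and the two-query log is a toy example.

```python
# Sketch of an n-gram feature for ranking synthetic suggestions:
# log-probability under a bigram LM estimated from the query log.
# Add-one smoothing and the toy log are illustrative assumptions.
import math
from collections import Counter

def bigram_lm(query_log):
    unigrams, bigrams = Counter(), Counter()
    for q in query_log:
        terms = ["<s>"] + q.split()         # <s> marks the query start
        unigrams.update(terms)
        bigrams.update(zip(terms, terms[1:]))
    vocab = len(unigrams)
    def logprob(query):
        terms = ["<s>"] + query.split()
        return sum(
            math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab))
            for a, b in zip(terms, terms[1:])
        )
    return logprob

lm = bigram_lm(["cheap flights to rome", "cheap hotels in rome"])
# A candidate built from frequent suffixes outscores an implausible one.
print(lm("cheap flights") > lm("rome cheap"))
```

In the full system this log-probability would be one feature among many fed to the LambdaMART ranker, alongside the CLSM similarity features.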
25. Future Work
• Short-range query popularity prediction
• Complex relationships between users' behavior at different keystrokes
• More complex click models
• Modelling personalized temporal patterns for active users (e.g. professional searchers)
• Online user behavior studies on mobile
• Other language models for rare prefixes
27. References
1. M. Shokouhi and K. Radinsky. Time-sensitive query auto-completion. In SIGIR ’12, pages 601–610, 2012.
2. S. Whiting, J. McMinn, and J. Jose. Exploring real-time temporal query auto-completion. In DIR Workshop ’13, pages 12–15.
3. M. Shokouhi. Learning to personalize query auto-completion. In SIGIR ’13, 2013.
4. V. Mawarkar and V. Malemath. Context Based Query Auto-Completion. In IJARCET, Volume 4, Issue 6, June 2015.
5. Y. Li, A. Dong, H. Wang, H. Deng, Y. Chang, and C. Zhai. A Two-dimensional Click Model for Query Auto-completion. In SIGIR ’14, 2014.
6. F. Cai, S. Liang, and M. de Rijke. Time-sensitive Personalized Query Auto-completion. In CIKM ’14, 2014.
7. M. P. Kato and K. Tanaka. To Suggest, or Not to Suggest for Queries with Diverse Intents: Optimizing Search Result Presentation. In WSDM ’16, 2016.
8. S. Vargas, R. Blanco, and P. Mika. Term-by-Term Query Auto-Completion for Mobile Search. In WSDM ’16, 2016.
9. B. Mitra and N. Craswell. Query Auto-Completion for Rare Prefixes. In CIKM ’15, 2015.
10. L. Li, H. Deng, A. Dong, Y. Chang, H. Zha, and R. Baeza-Yates. Analyzing User’s Sequential Behavior in Query Auto-Completion via Markov Processes. In SIGIR ’15, 2015.
11. M. Shokouhi. Detecting seasonal queries by time-series analysis. In SIGIR ’11, pages 1171–1172, 2011.
12. R. W. White and G. Marchionini. Examining the effectiveness of real-time query expansion. Information Processing & Management, 43:685–704, May 2007.
13. Z. Bar-Yossef and N. Kraus. Context-sensitive query auto-completion. In WWW ’11, pages 107–116, 2011.