Query relaxation - A rewriting technique between search and recommendations

Slides of my talk on query relaxation at 'Haystack - The Search Relevance Conference'. The first part gives a brief overview of strategies for helping users out of zero-results situations. The second part focuses on query relaxation: I compare several algorithms that try to find the best term to drop from a multi-term zero-results query so that it produces results. The best solution uses a multi-layer neural network with Word2vec embeddings as input to find this term.


  1. Query relaxation: A rewriting technique between search and recommendations. René Kriegler, @renekrie. Haystack - The Search Relevance Conference, 24 April 2019
  2. Query relaxation - a rewriting technique between search and recommendations, Haystack, 24 April 2019, © René Kriegler (@renekrie)
     About me
     More than 10 years' experience as a freelance search consultant, often in a role for OpenSource Connections
     Focus:
     - Search relevance optimisation
     - E-commerce search
     - Solr
     - Coaching teams to establish search within their organisation
     Organiser of MICES - Mix-Camp E-commerce Search (Berlin, 19 June, mices.co, right after Berlin Buzzwords)
     Maintainer of Querqy (OSS query rewriting library - github.com/renekrie/querqy)
  3. No results
  4. No results - strategies
     - Apply synonyms and hyponyms (laptop = notebook; shoes => trainers)
     - Spelling correction (Did you mean ...? / We’ve searched for ...)
     - Also search in low-quality data fields
     - Loosen boolean constraints (AND -> OR, mm<100%)
     - Apply hypernyms (boots => shoes)
     - Use more distant semantic relation (beard balm => trimmer)
     - Show more general recommendations (related to user’s shopping history, popular items)
  5. No results - strategies
     - Apply synonyms and hyponyms
     - Spelling correction
     - Also search in low-quality data fields
     - Loosen boolean constraints
     - Apply hypernyms
     - Use more distant semantic relation
     - Show more general recommendations
     Explainable? (in e-commerce search)
     - Don’t want to tell
     - mm: no; AND/OR: yes, but bad UX
     - Don’t need to tell
     - Can be hard
  6. No results - Query relaxation: Explainable! (& conversational!)
  7. Query relaxation: Which query term should be removed?
  8. Query relaxation - intuition
     iphone 9 => iphone 9
     (*) iphone 9 => iphone 9
  9. Query relaxation - intuition
     iphone 9 plus => iphone 9 plus
     (?) iphone 9 plus => iphone 9 plus
     (?) iphone 9 plus => iphone 9 plus
  10. Query relaxation - intuition
      black boots => black boots
      (*) black boots => black boots
  11. Query relaxation - intuition
      purple boots => purple boots
      (?) purple boots => purple boots
  12. Query relaxation - intuition
      (?) usb charger 12v => usb charger 12v
      (?) usb charger 12v => usb charger 12v
      (?) usb charger 12v => usb charger 12v
  13. Query intent & information need
      - Apply synonyms and hyponyms
      - Spelling correction
      - Also search in low-quality data fields
      - Loosen boolean constraints
      - Apply hypernyms
      - Use more distant semantic relation
      - Show more general recommendations
      [Figure labels: Trying to match original information need; Remotely related to user intent; Query relaxation]
  14. Query relaxation
      “A popular approach to cope with empty-answers is query relaxation, which attempts to reformulate the original query into a new query, by removing or relaxing conditions, so that the result of the new query is likely to contain the items of interest for that user.” (Mottin et al., 2013)
      “We present a method which we call relaxation for expanding deductive database and logic programming queries. The set of answers obtained with the relaxation method includes both answers deduced traditionally and answers related in some way with the original query. The relaxation method expands the scope of a query by relaxing the constraints implicit in the query.” (Gaasterland et al., 1992)
      “An extended query-document matching system is described in this study that relaxes the stringent requirements of the conventional Boolean retrieval operations.” (Salton et al., 1983)
  15. Query relaxation
      => How can we find the best query term to be removed from the query so that
         “... the result of the new query is likely to contain the items of interest for that user”
         “... answers [are] related in some way with the original query”?
      => How can we test, compare and optimise solutions?
  16. Online testing
      - Click-through-rate / hit rate
      - Exit rate / time spent on site
      => Do we manage to keep the user interacting with our site?
      => similar to recommendations / exploratory search
  17. Finding the term to be dropped: data sets
      Data sets for training and evaluation
      Find pairs:
      - a long query having 0 results
      - a corresponding relaxed query having results
  18. Finding the term to be dropped: data sets
      FREQ: Query frequencies - Have we observed the original and the relaxed query before? (We want to make sure that we produce a meaningful query.)
      COOC: Query cooccurrences per session - Have the original and rewritten query occurred together in a session?
      => Can we find the original/rewritten query pair in tracking data? How often? (more often is better)
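
A minimal sketch of how the FREQ and COOC counts described on the two data-set slides might be collected. The log format (a session as a list of `(query, number_of_results)` tuples) and the function names are assumptions made for illustration; the slides do not describe the actual tracking data or pipeline.

```python
from collections import Counter


def one_term_relaxations(query):
    """All queries obtained by removing exactly one term."""
    terms = query.split()
    return [" ".join(terms[:i] + terms[i + 1:]) for i in range(len(terms))]


def mine_pairs(sessions):
    """Collect FREQ and COOC counts from a hypothetical session log.

    sessions: list of sessions, each a list of (query, num_results) tuples.
    Returns (freq, cooc): freq counts every observed query, cooc counts the
    sessions in which a zero-results query and a one-term relaxation of it
    that returned results occur together.
    """
    freq = Counter()
    cooc = Counter()
    for session in sessions:
        freq.update(query for query, _ in session)
        hits = dict(session)  # query -> num_results (last observation wins)
        zero_result_queries = {
            q for q, n in session if n == 0 and len(q.split()) >= 2
        }
        for query in zero_result_queries:
            for relaxed in set(one_term_relaxations(query)):
                if hits.get(relaxed, 0) > 0:
                    cooc[(query, relaxed)] += 1
    return freq, cooc
```

Pairs with a high COOC count could then serve as training and evaluation examples, with the removed term as the label.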
  19. 0 - Drop random term (baseline): Remove a random term from the query
  20. 1 - Drop shortest term: Remove the shortest term from the query
  21. 2 - Drop shortest non-alphabetical term: Remove the shortest term that doesn’t contain any alphabetical character
  22. 3 - Combined 1 and 2: Remove the shortest term that doesn’t contain any alphabetical character; fall back to removing the shortest term if all terms have >=1 alphabetical character
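
The heuristics 0 to 3 above can be sketched in a few lines. This is an illustration, not the code used for the talk: queries are assumed to be pre-tokenised lists of terms, the function names are made up, and ties are broken by taking the first matching term, which the slides do not specify.

```python
import random


def drop_at(terms, i):
    """Return a copy of terms without the term at position i."""
    return terms[:i] + terms[i + 1:]


def drop_random(terms, rng=random):
    """0 - baseline: remove a random term."""
    return drop_at(terms, rng.randrange(len(terms)))


def drop_shortest(terms):
    """1 - remove the shortest term (first one wins on ties)."""
    i = min(range(len(terms)), key=lambda k: len(terms[k]))
    return drop_at(terms, i)


def has_alpha(term):
    return any(ch.isalpha() for ch in term)


def drop_shortest_non_alpha(terms):
    """2 - remove the shortest term without any alphabetical character.

    Returns None when every term contains at least one such character.
    """
    candidates = [k for k in range(len(terms)) if not has_alpha(terms[k])]
    if not candidates:
        return None
    i = min(candidates, key=lambda k: len(terms[k]))
    return drop_at(terms, i)


def drop_combined(terms):
    """3 - heuristic 2, falling back to heuristic 1."""
    relaxed = drop_shortest_non_alpha(terms)
    return relaxed if relaxed is not None else drop_shortest(terms)
```

For the query `nike boots 11`, the combined heuristic removes `11`; for `purple boots`, where both terms are alphabetical, it falls back to removing the shorter term `boots`.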
  23. 4/5 - Drop most/least frequent term: Remove the term with the highest/lowest index frequency
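
A short sketch of heuristics 4 and 5, assuming the index frequencies are available as a plain dictionary from term to count. The dictionary, the function name and the choice to treat unknown terms as frequency 0 are assumptions for illustration; in practice the counts would come from the search index.

```python
def drop_by_frequency(terms, index_freq, drop="most"):
    """Remove the term with the highest ("most") or lowest ("least")
    index frequency. Terms missing from index_freq count as 0."""
    if drop not in ("most", "least"):
        raise ValueError("drop must be 'most' or 'least'")
    pick = max if drop == "most" else min
    i = pick(range(len(terms)), key=lambda k: index_freq.get(terms[k], 0))
    return terms[:i] + terms[i + 1:]
```

With made-up counts such as `{"purple": 120, "boots": 5400}`, variant 4 turns `purple boots` into `purple`, and variant 5 turns it into `boots`.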
  24. 6/7 - Drop term with highest/lowest entropy: Remove the term with the highest/lowest entropy across navigational categories
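
The slide does not spell out how the entropy is computed. A plausible reading, sketched below, is the Shannon entropy of a term's distribution over navigational categories: a term spread evenly over many categories (such as a colour) has high entropy, a term concentrated in one category has low entropy. The per-category counts and all names here are hypothetical.

```python
import math


def category_entropy(counts):
    """Shannon entropy (in bits) of a term's distribution over categories.

    counts: dict mapping category -> number of matching documents.
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum(
        (c / total) * math.log2(c / total) for c in counts.values() if c > 0
    )


def drop_by_entropy(terms, term_category_counts, drop="highest"):
    """Remove the term with the highest or lowest category entropy."""
    if drop not in ("highest", "lowest"):
        raise ValueError("drop must be 'highest' or 'lowest'")
    pick = max if drop == "highest" else min
    i = pick(
        range(len(terms)),
        key=lambda k: category_entropy(term_category_counts.get(terms[k], {})),
    )
    return terms[:i] + terms[i + 1:]
```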
  25. 8 - Keep most similar query (Word2vec): Use the rewritten query that is most similar to the original query based on Word2vec embeddings [as mentioned in D. Tunkelang, Query relaxation, https://bit.ly/2ItxF3Z]
  26. Word2vec (CBOW) [Figure: CBOW architecture over a sequence of words (pepe jeans london slim cut). Input: the context words w(t-2) = pepe, w(t-1) = jeans, w(t+1) = slim, w(t+2) = cut; projection; output: w(t) = london]
  27. 8 - Keep most similar query (Word2vec): Use the rewritten query that is most similar to the original query based on Word2vec embeddings
      - Train Word2Vec embeddings (word = query term, window = query, 300 dimensions)
      - Use sum of word (= term) vectors to represent the queries (original/rewritten)
      - Calculate cosine similarity between original query and each rewritten query
      - Use rewritten query that is most similar to the original query
  28. 8 - Keep most similar query (Word2vec): Use the rewritten query that is most similar to the original query based on Word2vec embeddings
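
The scoring steps of approach 8 (sum the term vectors, compare by cosine similarity, keep the closest rewritten query) can be sketched as follows. The embeddings are assumed to be already trained and are passed in as a plain dictionary; the tiny vectors in the usage note are invented, and treating unknown terms as zero vectors is an assumption.

```python
import math


def query_vector(terms, embeddings, dim):
    """Represent a query as the sum of its term vectors."""
    vec = [0.0] * dim
    for term in terms:
        for d, value in enumerate(embeddings.get(term, [0.0] * dim)):
            vec[d] += value
    return vec


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_relaxation(terms, embeddings, dim):
    """Return the one-term-shorter query closest to the original query."""
    original = query_vector(terms, embeddings, dim)
    candidates = [terms[:i] + terms[i + 1:] for i in range(len(terms))]
    return max(
        candidates,
        key=lambda c: cosine(original, query_vector(c, embeddings, dim)),
    )
```

With toy 2-dimensional vectors `{"iphone": [1.0, 0.0], "9": [0.0, 0.2]}`, the query `iphone 9` is relaxed to `iphone`, because the summed vector of the original query points almost entirely in the direction of `iphone`.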
  29. 9 - Keep most similar query (Query2vec): Use the rewritten query that is most similar to the original query based on query embeddings [Grbovic et al., Scalable Semantic Matching of Queries to Ads in Sponsored Search Advertising. SIGIR 2016]
  30. ‘Query2vec’ (CBOW) [Figure: CBOW architecture over the queries in a session (smartphone; smartphone 64g; iphone; iphone 64g; galaxy 64g). Input: the surrounding queries q(t-2), q(t-1), q(t+1), q(t+2); projection; output: q(t)]
  31. 9 - Keep most similar query (Query2vec): Use the rewritten query that is most similar to the original query based on query embeddings
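
In the ‘Query2vec’ setup the session plays the role of the sentence and each whole query plays the role of a word. The sketch below only shows how CBOW training examples (context queries, target query) could be derived from one session; the window of 2 follows the q(t-2) ... q(t+2) labels in the figure, while the function name and the order of queries in the example session are assumptions.

```python
def cbow_examples(session, window=2):
    """Build (context_queries, target_query) pairs from one session.

    session: list of query strings in the order they were issued.
    """
    examples = []
    for t, target in enumerate(session):
        context = session[max(0, t - window):t] + session[t + 1:t + 1 + window]
        if context:
            examples.append((context, target))
    return examples


# Illustrative session; the true order on the slide is not recoverable.
session = ["smartphone", "smartphone 64g", "galaxy 64g", "iphone", "iphone 64g"]
for context, target in cbow_examples(session):
    print(context, "->", target)
```

Once query embeddings are trained this way, the rewritten query is selected as in approach 8, except that each query is looked up as a single vector instead of being summed from term vectors.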
  32. 10 - MNN with Word2vec input: Predict the term to be dropped using a multi-layer neural network (MNN) with Word2vec embeddings as input.
  33. 10 - MNN with Word2vec input [Figure: input and output for the query ‘nike boots 11’. Input: eight term slots, each holding a Word2vec vector with components labelled 0 to 300: nike = (0: 0.01, 1: -0.94, ..., 300: 0.18), boots = (0: 0.63, 1: 0.56, ..., 300: 0.04), 11 = (0: -0.59, 1: 0.02, ..., 300: 0.77); the remaining five slots are all 0.00. 2 hidden layers. Output: one value per slot, (0, 1, 0, 0, 0, 0, 0, 0)]
  34. 10 - MNN with Word2vec input: Predict the term to be dropped using a multi-layer neural network (MNN) with Word2vec embeddings as input
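
A sketch of the input and output encoding suggested by the figure for approach 10: a fixed number of term slots, zero-padded, with a one-hot output over the slots. The slot count of 8 is read off the figure, and interpreting the output as "1 marks the term to drop" is my reading of it. The network itself (two hidden layers) is not shown, the toy embeddings use 3 dimensions instead of 300, and all names are hypothetical.

```python
MAX_TERMS = 8  # number of term slots, as read off the slide figure


def encode_query(terms, embeddings, dim, max_terms=MAX_TERMS):
    """Concatenate one embedding per slot; unused slots are zero-padded."""
    if len(terms) > max_terms:
        raise ValueError("query has more terms than input slots")
    flat = []
    for slot in range(max_terms):
        if slot < len(terms):
            # Unknown terms are treated as zero vectors (an assumption).
            flat.extend(embeddings.get(terms[slot], [0.0] * dim))
        else:
            flat.extend([0.0] * dim)
    return flat


def encode_target(drop_index, max_terms=MAX_TERMS):
    """One-hot training target: 1 at the position of the term to drop."""
    return [1 if i == drop_index else 0 for i in range(max_terms)]


def decode_prediction(scores, num_terms):
    """Pick the highest-scoring slot among the slots that hold a term."""
    return max(range(num_terms), key=lambda i: scores[i])
```

Restricting the decoding step to the first `num_terms` slots keeps the network from "dropping" a padding slot.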
  35. 11 - MNN / Word2vec plus wordshape: Predict the term to be dropped using a multi-layer neural network (MNN) with Word2vec embeddings and wordshape features as input.
      Add additional dimensions to the input vector:
      - Word length
      - Number of digits
      - Does the word have an ‘e’ in the penultimate or ultimate position?
  36. 11 - MNN / Word2vec plus wordshape [Figure: input and output for the query ‘nike boots 11’. Each of the eight term slots gets three extra components 301 to 303 after the Word2vec components: nike = (301: 4.00, 302: 0.00, 303: 1.00), boots = (301: 5.00, 302: 0.00, 303: 0.00), 11 = (301: 2.00, 302: 2.00, 303: 0.00); the remaining five slots are all 0.00. 2 hidden layers. Output: one value per slot, (0, 1, 0, 0, 0, 0, 0, 0)]
  37. 11 - MNN / Word2vec plus wordshape: Predict the term to be dropped using a multi-layer neural network (MNN) with Word2vec embeddings and wordshape features as input.
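
The three wordshape features are simple to compute. The sketch below reproduces the values shown for ‘nike boots 11’ in the figure (4, 0, 1 / 5, 0, 0 / 2, 2, 0); the function name and the case-insensitive check for ‘e’ are assumptions.

```python
def wordshape(term):
    """Return [word length, number of digits, 'e' in one of the last two positions]."""
    length = float(len(term))
    digits = float(sum(ch.isdigit() for ch in term))
    e_near_end = 1.0 if "e" in term[-2:].lower() else 0.0
    return [length, digits, e_near_end]


for term in ["nike", "boots", "11"]:
    print(term, wordshape(term))
```

These three values would be appended to each term's Word2vec vector before the slots are concatenated into the network input.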
  38. 11/12 - MNN / Word2vec plus term stats: Predict the term to be dropped using a multi-layer neural network (MNN) with Word2vec embeddings and per-field DF or index frequency.
  39. Conclusion
      Query relaxation:
      - best understood as a query recommendation
      - information need not necessarily matched but relaxed query still related to user intent
      - can be communicated nicely to the user (‘conversational’)
      Best approach to find term to be dropped:
      - Multi-layer neural network with Word2Vec plus wordshape features as inputs. It can be extended to incorporate further features and optimisation targets.
  40. Thank you! http://www.rene-kriegler.com @renekrie
