
A picture is worth a thousand words

Slides of my 'Haystack - The search relevance conference' talk on approaches to relevance scoring based on product data. The first part gives an overview of scoring in e-commerce search; the second part explains a new approach to relevance scoring based on image recognition.

Published in: Data & Analytics
  • Hi René - sorry for not following up on your detailed response, thanks for taking the time. Yes, I am using LSHs in Solr as a fast way to get candidates that are then re-ranked based on the vector similarity. Lots of tuning, unfortunately, to hit the right balance between size of hashes and number of hashes. A chat would be great, what works best for you?
  • @kkrugler I'm sorry I didn't see your question earlier. I used Querqy with uq.similarityScore=off as a request parameter. You will find further (experimental) settings in the source code comments here: https://github.com/renekrie/querqy/blob/master/querqy-for-lucene/querqy-solr/src/main/java/querqy/solr/QuerqyDismaxQParser.java. For this specific setting, the idea is that in e-commerce normalised tf and idf don't make much sense. The tie param (cf. edismax) and other ranking factors (like popularity), which I didn't use in the experiments for the talk, will become more important. Re using full feature vectors: this didn't make much sense here, as I needed the overlap of the query result with the hash values (similar to clusters?) as a scoring factor. However, I've played with storing vectors as binary fields and unpacking the vectors felt reasonably fast - though I didn't explore this systematically. Are you experimenting in this domain as well? - Feel free to get in touch for a chat!
  • Hi René - thanks for sharing, this is really good stuff! I had two questions. First, what was the querqy configuration for the second-to-last row on slide 36? And second, did you ever try using the hashes as an initial query for candidates, and then use the actual (stored) feature vector to calculate a more accurate similarity for ranking? Thanks again!

A picture is worth a thousand words

  1. A picture is worth a thousand words
     Approaches to relevance scoring based on product data, including image recognition
     René Kriegler, @renekrie
     Haystack - The Search Relevance Conference, 11 April 2018
  2. About me
     (Slide footer: A picture is worth a thousand words - relevance scoring based on product data, Haystack, 11 April 2018, René Kriegler (@renekrie))
     More than 10 years' experience as a freelance search consultant, often in a role for OpenSource Connections
     Focus:
     - Search relevance optimisation
     - E-commerce search
     - Solr
     - Coaching teams to establish search within their organisation
     Organiser of MICES - Mix-Camp E-commerce Search (Berlin, 13 June, mices.co, call for talks open until 22 April)
     Maintainer of Querqy (OSS query rewriting library - github.com/renekrie/querqy)
  3. E-commerce search
     E-commerce search as part of the ‘buying decision process’:
     - Search can/should be optimised towards the different stages of the buying decision process
     - Purchase as one signal of a successful search
     Stages: Problem recognition, Information search, Evaluation of alternatives, Purchase decision, Post-purchase behaviour
     Philip Kotler, Kevin Lane, Marketing Management, 1997; Peter Morville, Ambient Search, 2005
  4. Relevance in e-commerce search
     Unlike in other search domains, documents in e-commerce search describe a single item - each document is a ‘proxy’ for a concrete thing that we could touch/examine in a shop
  5. Relevance in e-commerce search
     Unlike in other search domains, documents in e-commerce search describe a single item - each document is a ‘proxy’ for a concrete thing that we could touch/examine in a shop
     Consumer interests become part of relevance criteria:
     - Product specification (Does the SSD drive of that laptop have enough capacity for me?)
     - Value / price
     - Availability (Wait three weeks for a pair of shoes?)
     - Brand reputation
     - Seasonality / freshness
     - Reviews / ratings
     - ...
  6. Relevance in e-commerce search
     O. Alonso, S. Mizzaro: Relevance Criteria for E-Commerce: A Crowdsourcing-based Experimental Analysis, SIGIR ‘09, 2009.
  7. The seller perspective
     How can search result ranking maximise profit?
     - Show results most relevant to the user
     - Maximise margin
     - Sales, stock clearance
     - Sell search result placements (see Amazon’s ‘Sponsored by ...’)
  8. Ranking factors
     Search result ranking factors in e-commerce search:
     - Topicality - identify the product (type) that the user is searching for (‘laptop’ vs ‘laptop backpack’)
     - User’s relevance criteria (e-commerce/non-ecommerce)
     - Seller’s interests (maximise profit)
     - Personalisation & individualisation
  9. Ranking factors
     Search result ranking factors in e-commerce search:
     - Topicality - identify the product (type) that the user is searching for (‘laptop’ vs ‘laptop backpack’)
     - User’s relevance criteria (e-commerce/non-ecommerce)
     - Seller’s interests (maximise profit)
     - Personalisation & individualisation
     I will focus on topicality for the rest of my talk
  10. Standard scoring models
      Standard scoring models evolved with enterprise search/general web search in mind. Typically:
      - Long documents
      - Unstructured/semi-structured
      - Mixture of many, often abstract topics
  11. Standard scoring models
      Standard scoring models evolved with enterprise search/general web search in mind. Typically:
      - Long documents
      - Unstructured/semi-structured
      - Mixture of many, often abstract topics
      Compare with e-commerce search. Typically:
      - Short documents
      - Fields
      - About a single, concrete thing (‘proxy’)
  12. Standard scoring models
      Often based on a language model that tries to predict the query likelihood given document/index term distributions:
      Score = f(tf, df)
      (See tf*idf, BM25(F))
  13. Standard scoring models
      Score = f(tf, df)
      Both tf and df have problems in e-commerce search:
      - Unclear - often adverse - interaction of tf and df with fields
      - tf often equals 1
  14. Standard scoring models
      Score = f(tf, df)
      Both tf and df have problems in e-commerce search:
      - Unclear - often adverse - interaction of tf and df with fields
      - tf often equals 1
      - Doc length normalisation of tf often doesn’t work:
        - Acer Aspire E5-523-962Z - Laptop 2.9GHz A9-9410 15.6" 1366 x 768pixels Black
  15. Standard scoring models
      Score = f(tf, df)
      Both tf and df have problems in e-commerce search:
      - Unclear - often adverse - interaction of tf and df with fields
      - tf often equals 1
      - Doc length normalisation of tf often doesn’t work:
        - Acer Aspire E5-523-962Z - Laptop 2.9GHz A9-9410 15.6" 1366 x 768pixels Black
        - Laptop
  16. Standard scoring models
      Score = f(tf, df)
      Both tf and df have problems in e-commerce search:
      - Unclear - often adverse - interaction of tf and df with fields
      - tf often equals 1
      - Doc length normalisation of tf often doesn’t work
      - Counter-intuitive:
        - If two documents describe a laptop, they should both have the same topicality score regardless of the distribution of the terms in their description
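The length-normalisation problem from slides 14 to 16 can be made concrete with a small sketch. It uses the widely used BM25 term-frequency component with the common defaults k1 = 1.2 and b = 0.75; the average title length of 6 tokens is an assumption invented for this example, and the idf factor is left out because it would be identical for both titles.

```python
def bm25_tf(tf, doc_len, avg_len, k1=1.2, b=0.75):
    """BM25 term-frequency component with length normalisation (idf omitted)."""
    norm = k1 * (1 - b + b * doc_len / avg_len)
    return tf * (k1 + 1) / (tf + norm)


# The two product titles from the slides, split on whitespace.
long_title = 'Acer Aspire E5-523-962Z - Laptop 2.9GHz A9-9410 15.6" 1366 x 768pixels Black'.split()
short_title = "Laptop".split()

AVG_LEN = 6  # hypothetical average title length, not taken from the talk


def score(title, term):
    # tf is 1 for 'laptop' in both titles; only the title length differs.
    tf = sum(1 for token in title if token.lower() == term)
    return bm25_tf(tf, len(title), AVG_LEN)


long_score = score(long_title, "laptop")
short_score = score(short_title, "laptop")
print(f"long title:  {long_score:.3f}")
print(f"short title: {short_score:.3f}")
```

Both titles describe a laptop and both contain the term exactly once, yet the one-word title gets the higher score purely because it is shorter.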
  17. E-commerce scoring models
      Few scoring models were designed specifically for e-commerce search
      Predict product type and product properties from indexed product data and from the query and match at query time:
      - SEMKNOX search engine (based on ontology)
      - Product type prediction from query at Amazon (D. Sorokina, E. Cantú-Paz, The Joy of Ranking Products, SIGIR ‘16, 2016)
      => Score tends to become binary (match vs no match)
      - Great intuition (one laptop shouldn’t be more ‘laptopish’ than another)
      - Less noisy input for combination with other ranking factors
  18. A picture is worth a thousand words
      Query: laptop
  19. A picture is worth a thousand words
      Query: laptop
  20. A picture is worth a thousand words
      Query: laptop
      Pictured: Laptop, Laptop backpack
  21. A picture is worth a thousand words
      Product pictures fit the ‘proxy’ metaphor nicely - they visually represent the real-world product that the document stands for
      Image recognition needed to explore product pictures for search -> model product type (and properties)
      Image recognition already being explored for e-commerce search:
      - nyris.io: known-item search
      - cerebel.io and Han Xiao, Zalando research (https://bit.ly/2EdQwtc): joint visual/textual search model
  22. A picture is worth a thousand words
      Can image recognition be used for search in a simpler way?
  23. A picture is worth a thousand words
      Can image recognition be used for search in a simpler way?
      Maybe just for scoring?
  24. Image-based relevance scoring
      Image recognition: Inception 3 (TensorFlow)
      Document: Acer Aspire E5-523-962Z - Laptop 2.9GHz A9-9410 15.6" 1366 x 768pixels Black
      Recognize image -> Output vector (Softmax):
        x000: 0.00145
        x001: 0.00030
        ...
        x711: 0.79200 (laptop)
        ...
        x999: 0.00801
      Enriched document: Acer Aspire E5-523-962Z - Laptop 2.9GHz A9-9410 15.6" 1366 x 768pixels Black + image recognition output vector [...]
      Enrich documents with image recognition output vectors during indexing
  25. Intuition for scoring
      Likelihood of query ‘notebook’ in vector subspaces
      [Figure: space of indexed Inception output vectors, with images marked + + + + + + + - -]
  26. Intuition for scoring
      Likelihood of query ‘notebook’ in vector subspaces
      [Figure: space of indexed Inception output vectors, with images marked + + + + + + + - -]
      Higher score for query ‘notebook’ for documents having these images (5/5 vs 2/4)
  27. Towards a scoring formula
      Score ~ Likelihood of query given an image recognition vector subspace
      - Likelihood could be estimated but would assign too high a score to specific product subtypes (such as ‘running shoes’ for query ‘shoes’)
      - Better: Score ~ Jaccard similarity(products in vector subspace, products that match the query)
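The difference between the two candidate scores on slide 27 can be sketched with plain sets. The product IDs and subspace contents below are hypothetical and chosen only to mirror the ‘running shoes’ vs ‘shoes’ remark; they are not data from the talk.

```python
def likelihood(subspace, query_matches):
    """Share of products in the subspace that match the query."""
    return len(subspace & query_matches) / len(subspace)


def jaccard(subspace, query_matches):
    """Jaccard similarity: size of intersection divided by size of union."""
    union = subspace | query_matches
    return len(subspace & query_matches) / len(union) if union else 0.0


# Hypothetical product IDs: six products match the query 'shoes'.
query_matches = {"p1", "p2", "p3", "p4", "p5", "p6"}

# A narrow subspace that only holds two running shoes ...
running_shoes_subspace = {"p1", "p2"}
# ... and a broad subspace holding most shoes plus one non-matching product.
shoes_subspace = {"p1", "p2", "p3", "p4", "p5", "p7"}

for name, subspace in [("running shoes", running_shoes_subspace), ("shoes", shoes_subspace)]:
    print(name, round(likelihood(subspace, query_matches), 3), round(jaccard(subspace, query_matches), 3))
```

With these sets, the likelihood favours the narrow subspace (2/2 vs 5/6), whereas the Jaccard similarity favours the subspace that covers most of the query result (5/7 vs 2/6).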
  28. Towards a scoring formula
      Defining vector subspaces:
      - Split space by random hyperplanes -> Random Projection Tree
      - Use more than one tree to reduce impact of hyperplanes that run through a group of closely related images -> Random Projection Forest
      - Per document: index few random projections instead of high-dimensional vector
  29. Using random projection forests
  30. Using random projection forests
  31. Using random projection forests
  32. Using random projection forests
      V1 => “11” (or “3”)
      V2 => “11” (or “3”)
      V3 => “00” (or “0”)
      V4 => “01” (or “1”)
      Great video: Maciej Kula - Speeding up search with locality sensitive hashing: https://www.youtube.com/watch?v=NtAKQIrIU7w
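A minimal sketch of how a vector could be turned into bit codes like those on slide 32. It makes a simplifying assumption: each ‘tree’ is a fixed list of hyperplanes through the origin that is applied to every vector (sign random projection), which may differ from the plugin shown in the talk. The 2-D vectors and the two demo hyperplanes are hand-picked so that the codes match the labels on the slide; they are not the geometry of the slide images.

```python
import random


def make_tree(num_planes, dim, rng):
    """One 'tree': a list of random hyperplane normal vectors (assumed to pass through the origin)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_planes)]


def hash_vector(vector, tree):
    """One bit per hyperplane: '1' if the vector is on the non-negative side, else '0'."""
    bits = []
    for normal in tree:
        dot = sum(n * v for n, v in zip(normal, vector))
        bits.append("1" if dot >= 0 else "0")
    return "".join(bits)


# Hand-picked 2-D example with two hyperplanes (hypothetical values).
demo_tree = [[1.0, -1.0], [1.0, 1.0]]
vectors = {"V1": (2.0, 1.0), "V2": (3.0, 1.0), "V3": (-2.0, -1.0), "V4": (1.0, 2.0)}
codes = {name: hash_vector(vec, demo_tree) for name, vec in vectors.items()}
for name, code in codes.items():
    print(f'{name} => "{code}" (or "{int(code, 2)}")')

# A forest: several independent trees, giving one code per tree for each vector.
rng = random.Random(42)
forest = [make_tree(num_planes=24, dim=1000, rng=rng) for _ in range(16)]
uniform_vector = [1.0 / 1000] * 1000  # placeholder for a 1000-class softmax output
forest_codes = [hash_vector(uniform_vector, tree) for tree in forest]
```

Vectors that fall on the same side of every hyperplane in a tree get the same code, so each distinct code identifies one vector subspace for that tree.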
  33. Demo
      Solr plugin demo
      Many thanks to Profitmax (http://testit.de & http://preisvergleich.ch) for letting me use their product data for this demo
  34. Using random projection forests
      A forest of 16 trees with 24 hyperplanes each in Solr. We can work with fewer trees and hyperplanes at query time (for example, use p_tree_2:010* to query 3 hyperplanes in tree p_tree_2)
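The prefix trick on slide 34 works because the first k characters of a tree's code are the bits of its first k hyperplanes. A small sketch of building such clauses follows; the field naming pattern p_tree_N is taken from the slide's example, while the helper names, the tree indices used, and the idea of joining several trees with OR are assumptions made for illustration.

```python
def prefix_clause(tree_index, code, num_planes):
    """Build a prefix clause that uses only the first num_planes hyperplanes of one tree."""
    if not 0 < num_planes <= len(code):
        raise ValueError("num_planes must be between 1 and the code length")
    return f"p_tree_{tree_index}:{code[:num_planes]}*"


def forest_query(codes_by_tree, num_planes):
    """Combine prefix clauses for several trees (OR is an assumed combination)."""
    clauses = [prefix_clause(i, code, num_planes) for i, code in sorted(codes_by_tree.items())]
    return " OR ".join(clauses)


# A 24-bit code as it might be indexed for one document in tree 2 (hypothetical value).
full_code = "010" + "1" * 21
clause = prefix_clause(2, full_code, 3)
print(clause)

# Using fewer trees and fewer hyperplanes at query time than were indexed.
print(forest_query({1: "1100", 2: "0101"}, num_planes=2))
```

Shorter prefixes describe larger subspaces, so the number of hyperplanes used at query time can be tuned without re-indexing.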
  35. Search quality comparison
      Experiment:
      - Solr plugin to implement scoring based on image recognition
      - Index product data
      - Calculate search quality metrics for 100 queries, based on judgments derived from live traffic
      - Compare with other scoring algorithms
      A great ‘Thank you’ to otto.de for letting me use their product data and search judgment data!
  36. Search quality comparison
  37. Search quality comparison
      Scoring based on image recognition:
      - Implemented using a random projection forest of 16 trees with 5 hyperplanes each
      - Scored by sum of Jaccard Similarities between documents in vector subspaces and documents that match category query tokens only - no additional tf*idf scoring
      => Image-recognition based scoring on a par with best language model based scoring in experiment
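The ‘sum of Jaccard Similarities’ on slide 37 could be computed along the following lines. This is a sketch under assumptions, not the Solr plugin from the talk: documents are represented by one hash code per tree, a subspace is the set of documents sharing a code in a tree, and the document IDs, codes and query result are hypothetical.

```python
from collections import defaultdict


def build_buckets(doc_codes):
    """Group document IDs by hash code, separately for each tree."""
    num_trees = len(next(iter(doc_codes.values())))
    buckets = [defaultdict(set) for _ in range(num_trees)]
    for doc_id, codes in doc_codes.items():
        for tree_index, code in enumerate(codes):
            buckets[tree_index][code].add(doc_id)
    return buckets


def image_score(doc_id, doc_codes, buckets, query_matches):
    """Sum over trees of Jaccard(documents in the doc's subspace, documents matching the query)."""
    total = 0.0
    for tree_index, code in enumerate(doc_codes[doc_id]):
        subspace = buckets[tree_index][code]  # never empty: it contains doc_id itself
        total += len(subspace & query_matches) / len(subspace | query_matches)
    return total


# Hypothetical index: two trees, one code per tree and document.
doc_codes = {
    "laptop_a": ["11", "01"],
    "laptop_b": ["11", "01"],
    "laptop_c": ["11", "00"],
    "backpack": ["00", "00"],
}
# All four documents match the query token 'laptop' ('laptop backpack' included).
query_matches = {"laptop_a", "laptop_b", "laptop_c", "backpack"}

buckets = build_buckets(doc_codes)
scores = {doc_id: image_score(doc_id, doc_codes, buckets, query_matches) for doc_id in doc_codes}
for doc_id, value in sorted(scores.items(), key=lambda item: -item[1]):
    print(doc_id, round(value, 2))
```

In this toy index the backpack sits alone in its subspace for the first tree, so its summed score is lower than that of the laptops even though it matches the query text.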
  38. Further improvements
      Future improvements/experiments:
      - Use language model based scoring as tie-breaker for documents that yield the same score based on image recognition
      - Combine with Jaccard Similarity of further query fields (beyond category)
      - Retrain image recognition for product properties, combine with model for product types
      - Tag documents ‘offline’: weight document terms using the same intuition (= term likelihood given the image recognition vector)
  39. A picture is worth a thousand words
      W. Di, N. Sundaresan, R. Piramuthu, A. Bhardwaj: Is a Picture Really Worth a Thousand Words? - On the Role of Images in E-commerce. WSDM ‘14, 2014.
  40. A picture is worth a thousand words
      W. Di, N. Sundaresan, R. Piramuthu, A. Bhardwaj: Is a Picture Really Worth a Thousand Words? - On the Role of Images in E-commerce. WSDM ‘14, 2014.
      It’s at least worth a language model! ;-)
  41. Thank you!
      http://www.rene-kriegler.com
      @renekrie
      Product images taken from Icecat open catalogue (icecat.biz) and preisvergleich.ch product data
