A picture is worth a thousand words

  1. A picture is worth a thousand words
     Approaches to relevance scoring based on product data, including image recognition
     René Kriegler, @renekrie
     Haystack - The Search Relevance Conference, 11 April 2018
  2. A picture is worth a thousand words - relevance scoring based on product data, Haystack, 11 April 2018, René Kriegler (@renekrie)
     About me
     More than 10 years' experience as a freelance search consultant, often in a role for OpenSource Connections
     Focus:
     - Search relevance optimisation
     - E-commerce search
     - Solr
     - Coaching teams to establish search within their organisation
     Organiser of MICES - Mix-Camp E-commerce Search (Berlin, 13 June, mices.co, call for talks open until 22 April)
     Maintainer of Querqy (OSS query rewriting library - github.com/renekrie/querqy)
  3. E-commerce search
     E-commerce search as part of the ‘buying decision process’:
     - Search can/should be optimised towards the different stages of the buying decision process
     - Purchase as one signal of a successful search
     Stages: Problem recognition -> Information search -> Evaluation of alternatives -> Purchase decision -> Post-purchase behaviour
     Sources: Philip Kotler, Kevin Lane: Marketing Management, 1997; Peter Morville: Ambient Search, 2005
  5. Relevance in e-commerce search
     Unlike in other search domains, documents in e-commerce search describe a single item: each document is a ‘proxy’ for a concrete thing that we could touch or examine in a shop.
     Consumer interests become part of the relevance criteria:
     - Product specification (Does the SSD of that laptop have enough capacity for me?)
     - Value / price
     - Availability (Wait three weeks for a pair of shoes?)
     - Brand reputation
     - Seasonality / freshness
     - Reviews / ratings
     - ...
  6. Relevance in e-commerce search
     O. Alonso, S. Mizzaro: Relevance Criteria for E-Commerce: A Crowdsourcing-based Experimental Analysis, SIGIR ’09, 2009.
  7. The seller perspective
     How can search result ranking maximise profit?
     - Show results most relevant to the user
     - Maximise margin
     - Sales, stock clearance
     - Sell search result placements (see Amazon’s ‘Sponsored by ...’)
  9. Ranking factors
     Search result ranking factors in e-commerce search:
     - Topicality: identify the product (type) that the user is searching for (‘laptop’ vs ‘laptop backpack’)
     - User’s relevance criteria (e-commerce/non-e-commerce)
     - Seller’s interests (maximise profit)
     - Personalisation & individualisation
     I will focus on topicality for the rest of my talk.
  11. Standard scoring models
      Standard scoring models evolved with enterprise search/general web search in mind. Typically:
      - Long documents
      - Unstructured/semi-structured
      - Mixture of many, often abstract topics
      Compare with e-commerce search. Typically:
      - Short documents
      - Fields
      - About a single, concrete thing (‘proxy’)
  12. Standard scoring models
      Often based on a language model that tries to predict the query likelihood given document/index term distributions:
      Score = f(tf, df), with tf = term frequency and df = document frequency (see tf*idf, BM25(F))
  15. Standard scoring models
      Score = f(tf, df)
      Both tf and df have problems in e-commerce search:
      - Unclear, often adverse, interaction of tf and df with fields
      - tf often equals 1
      - Document length normalisation of tf often doesn’t work. Example titles:
        - Acer Aspire E5-523-962Z - Laptop 2.9GHz A9-9410 15.6" 1366 x 768pixels Black
        - Laptop
  16. Standard scoring models
      Score = f(tf, df)
      Both tf and df have problems in e-commerce search:
      - Unclear, often adverse, interaction of tf and df with fields
      - tf often equals 1
      - Document length normalisation of tf often doesn’t work
      - Counter-intuitive: if two documents describe a laptop, they should both have the same topicality score, regardless of the distribution of the terms in their descriptions
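The length normalisation problem can be made concrete with a small sketch. It scores the two example titles from the slides for the query ‘laptop’ using a common BM25 formulation. The two-document corpus, the whitespace tokenisation and the parameter values k1 = 1.2 and b = 0.75 are assumptions for illustration, not the setup used in the talk.

```python
import math

def bm25_term_score(tf, doc_len, avg_doc_len, df, num_docs, k1=1.2, b=0.75):
    """BM25 contribution of a single query term to one document's score."""
    idf = math.log(1 + (num_docs - df + 0.5) / (df + 0.5))
    norm = tf + k1 * (1 - b + b * doc_len / avg_doc_len)
    return idf * tf * (k1 + 1) / norm

# Two product titles that both describe a laptop (titles taken from the slides).
long_title = 'Acer Aspire E5-523-962Z - Laptop 2.9GHz A9-9410 15.6" 1366 x 768pixels Black'
short_title = "Laptop"

# Naive whitespace tokenisation (an assumption; a real analyser would differ).
docs = [long_title.lower().split(), short_title.lower().split()]
avg_len = sum(len(d) for d in docs) / len(docs)

# Both documents contain the term, so df == num_docs == 2 in this toy corpus.
scores = [
    bm25_term_score(d.count("laptop"), len(d), avg_len, df=2, num_docs=2)
    for d in docs
]

for title, score in zip((long_title, short_title), scores):
    print(f"{score:.4f}  {title}")
```

Both titles contain ‘laptop’ exactly once, yet the short title scores higher purely because it is shorter, which is the counter-intuitive behaviour described above.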
  17. E-commerce scoring models
      Few scoring models were designed specifically for e-commerce search.
      Approach: predict product type and product properties from indexed product data and from the query, then match them at query time:
      - SEMKNOX search engine (based on ontology)
      - Product type prediction from query at Amazon (D. Sorokina, E. Cantú-Paz: The Joy of Ranking Products, SIGIR ’16, 2016)
      => Score tends to become binary (match vs no match)
      - Great intuition (one laptop shouldn’t be more ‘laptopish’ than another)
      - Less noisy input for combination with other ranking factors
  20. A picture is worth a thousand words
      Query: laptop
      [Product images: Laptop, Laptop backpack]
  21. A picture is worth a thousand words
      Product pictures fit the ‘proxy’ metaphor nicely: they visually represent the real-world product that the document stands for.
      Image recognition is needed to make use of product pictures for search -> model product type (and properties)
      Image recognition is already being explored for e-commerce search:
      - nyris.io: known-item search
      - cerebel.io and Han Xiao, Zalando research (https://bit.ly/2EdQwtc): joint visual/textual search model
  23. A picture is worth a thousand words
      Can image recognition be used for search in a simpler way? Maybe just for scoring?
  24. Image-based relevance scoring
      Pipeline: product image -> Inception v3 image recognition (TensorFlow) -> output vector (softmax)
      Input document: Acer Aspire E5-523-962Z - Laptop 2.9GHz A9-9410 15.6" 1366 x 768pixels Black
      Output vector (softmax):
        x000: 0.00145
        x001: 0.00030
        ...
        x711: 0.79200 (laptop)
        ...
        x999: 0.00801
      Enriched document: Acer Aspire E5-523-962Z - Laptop 2.9GHz A9-9410 15.6" 1366 x 768pixels Black + image recognition output vector [...]
      Enrich documents with image recognition output vectors during indexing.
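A minimal sketch of the enrichment step, assuming the classifier's raw outputs are already available. The image recognition call itself is left out on purpose; `fake_logits` stands in for the 1000 class outputs, and the field name `image_vector` is hypothetical.

```python
import math

def softmax(logits):
    """Turn raw classifier outputs into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def enrich(doc, logits, field="image_vector"):
    """Return a copy of the document with the softmax vector attached."""
    enriched = dict(doc)
    enriched[field] = softmax(logits)
    return enriched

doc = {"title": "Acer Aspire E5-523-962Z - Laptop"}
fake_logits = [0.1, -1.5, 6.0, 1.4]  # stand-in for the 1000 class outputs
enriched = enrich(doc, fake_logits)

best_class = max(range(len(fake_logits)), key=lambda i: enriched["image_vector"][i])
print(best_class, enriched["image_vector"])
```

The original document is left untouched, so the same step could be applied to each product right before it is sent to the index.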
  26. Intuition for scoring
      Likelihood of query ‘notebook’ in vector subspaces
      [Figure: space of indexed Inception output vectors, split into subspaces; images are marked ‘+’ or ‘-’]
      Higher score for query ‘notebook’ for documents having these images (5/5 vs 2/4)
  27. Towards a scoring formula
      Score ~ likelihood of the query given an image recognition vector subspace
      - The likelihood could be estimated, but it would assign too high a score to specific product subtypes (such as ‘running shoes’ for the query ‘shoes’)
      - Better: Score ~ Jaccard similarity(products in vector subspace, products that match the query)
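The difference between the two candidate scores can be sketched with sets of product ids. The ids and subspace contents below are made up, and reading ‘likelihood’ as the share of products in a subspace that match the query is an assumption about the slide.

```python
def likelihood(subspace, matches):
    """Share of products in the subspace that match the query."""
    return len(subspace & matches) / len(subspace) if subspace else 0.0

def jaccard(subspace, matches):
    """Intersection over union of the two product sets."""
    union = subspace | matches
    return len(subspace & matches) / len(union) if union else 0.0

# Hypothetical product ids: 10 products match the query 'shoes'.
shoes_matches = set(range(10))
running_shoes_subspace = {0, 1}             # narrow subspace: running shoes only
all_shoes_subspace = set(range(9)) | {99}   # broad subspace with one non-shoe product

results = {
    "narrow": (likelihood(running_shoes_subspace, shoes_matches),
               jaccard(running_shoes_subspace, shoes_matches)),
    "broad": (likelihood(all_shoes_subspace, shoes_matches),
              jaccard(all_shoes_subspace, shoes_matches)),
}
for name, (lik, jac) in results.items():
    print(f"{name}: likelihood={lik:.2f} jaccard={jac:.2f}")
```

With these numbers the likelihood prefers the narrow running-shoes subspace (1.0 vs 0.9), while the Jaccard similarity prefers the subspace that covers most of what matches ‘shoes’ (0.2 vs about 0.82).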
  28. Towards a scoring formula
      Defining vector subspaces:
      - Split the space by random hyperplanes -> Random Projection Tree
      - Use more than one tree to reduce the impact of hyperplanes that run through a group of closely related images -> Random Projection Forest
      - Per document: index a few random projections instead of the high-dimensional vector
  32. Using random projection forests
      Bit string per vector (decimal value in brackets):
      V1 => “11” (or “3”)
      V2 => “11” (or “3”)
      V3 => “00” (or “0”)
      V4 => “01” (or “1”)
      Great video: Maciej Kula - Speeding up search with locality sensitive hashing: https://www.youtube.com/watch?v=NtAKQIrIU7w
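A rough sketch of how such bit strings can be produced. It assumes hyperplanes through the origin and one shared list of hyperplanes per tree (the locality-sensitive-hashing style from the video), which is a simplification of a real random projection tree. The 2D vectors and the hand-picked hyperplanes are invented so that the result reproduces the codes on the slide.

```python
import random

def random_hyperplanes(num_planes, dim, seed):
    """Draw normal vectors for hyperplanes through the origin."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_planes)]

def project(vector, hyperplanes):
    """One bit per hyperplane: '1' if the vector lies on its positive side."""
    bits = []
    for normal in hyperplanes:
        dot = sum(v * n for v, n in zip(vector, normal))
        bits.append("1" if dot >= 0 else "0")
    return "".join(bits)

# Two hand-picked hyperplanes in 2D so the result is easy to check by eye.
planes = [[1.0, 0.0], [0.0, 1.0]]
vectors = {"V1": [0.9, 0.8], "V2": [0.7, 0.9], "V3": [-0.5, -0.4], "V4": [-0.6, 0.7]}
codes = {name: project(vec, planes) for name, vec in vectors.items()}
print(codes)  # {'V1': '11', 'V2': '11', 'V3': '00', 'V4': '01'}

# A forest is several independent trees; each yields its own code per vector.
forest = [random_hyperplanes(num_planes=2, dim=2, seed=s) for s in range(3)]
forest_codes = [project(vectors["V1"], tree) for tree in forest]
print(forest_codes)
```

V1 and V2 share the code “11”, so they fall into the same subspace of this tree, and `int(code, 2)` gives the decimal form shown in brackets on the slide.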
  33. Demo
      Solr plugin demo
      Many thanks to Profitmax (http://testit.de & http://preisvergleich.ch) for letting me use their product data for this demo.
  34. Using random projection forests
      A forest of 16 trees with 24 hyperplanes each in Solr. We can work with fewer trees and hyperplanes at query time (for example, use p_tree_2:010* to query 3 hyperplanes in tree p_tree_2).
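Truncating a code at query time amounts to a prefix query on the tree's field. The sketch below builds such a query string. The field naming pattern `p_tree_<n>` is inferred from the single example on the slide, and the 24-bit code is made up.

```python
def prefix_query(tree_index, code, num_planes):
    """Build a prefix query on one tree's field using only the first bits."""
    if not 0 < num_planes <= len(code):
        raise ValueError("num_planes must be between 1 and len(code)")
    return f"p_tree_{tree_index}:{code[:num_planes]}*"

full_code = "010110011010111000101101"  # hypothetical 24-bit code of one document
query = prefix_query(2, full_code, 3)
print(query)  # p_tree_2:010*
```

Fewer bits mean larger subspaces, so the number of hyperplanes used at query time trades precision of the subspace against the number of products that fall into it.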
  35. Search quality comparison
      Experiment:
      - Solr plugin to implement scoring based on image recognition
      - Index product data
      - Calculate search quality metrics for 100 queries, based on judgments derived from live traffic
      - Compare with other scoring algorithms
      A great ‘Thank you’ to otto.de for letting me use their product data and search judgment data!
  36. Search quality comparison
  37. Search quality comparison
      Scoring based on image recognition:
      - Implemented using a random projection forest of 16 trees with 5 hyperplanes each
      - Scored by the sum of Jaccard similarities between documents in vector subspaces and documents that match category query tokens only; no additional tf*idf scoring
      => Image-recognition-based scoring was on a par with the best language-model-based scoring in the experiment
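One way to read ‘sum of Jaccard similarities’ is sketched below: for each tree, take the subspace that the document falls into, compare its members with the documents matching the query, and add up the similarities. The in-memory index layout, the two-tree forest and the document ids are assumptions for illustration and not the Solr plugin's actual data structures.

```python
def jaccard(a, b):
    """Intersection over union of two sets of document ids."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def image_score(doc_codes, index, matches):
    """Sum, over all trees, of the Jaccard similarity between the documents
    sharing this document's subspace and the documents matching the query."""
    total = 0.0
    for tree, code in enumerate(doc_codes):
        subspace = index[tree].get(code, set())
        total += jaccard(subspace, matches)
    return total

# Hypothetical index for a 2-tree forest: tree -> code -> document ids.
index = [
    {"11": {1, 2, 3}, "00": {4, 5}},
    {"10": {1, 2}, "01": {3, 4, 5}},
]
matches = {1, 2, 3}  # documents whose category matches the query tokens

score_doc1 = image_score(["11", "10"], index, matches)
score_doc4 = image_score(["00", "01"], index, matches)
print(score_doc1, score_doc4)
```

Document 1 sits in subspaces that overlap strongly with the matching set and scores well above document 4, whose subspaces contain mostly non-matching documents.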
  38. Further improvements
      Future improvements/experiments:
      - Use language-model-based scoring as a tie-breaker for documents that yield the same score based on image recognition
      - Combine with Jaccard similarity of further query fields (beyond category)
      - Retrain image recognition for product properties, combine with the model for product types
      - Tag documents ‘offline’: weight document terms using the same intuition (= term likelihood given the image recognition vector)
  40. A picture is worth a thousand words
      W. Di, N. Sundaresan, R. Piramuthu, A. Bhardwaj: Is a Picture Really Worth a Thousand Words? - On the Role of Images in E-commerce. WSDM ’14, 2014.
      It’s at least worth a language model! ;-)
  41. Thank you!
      http://www.rene-kriegler.com
      @renekrie
      Product images taken from Icecat open catalogue (icecat.biz) and preisvergleich.ch product data