Learning to Rank with Apache Solr and Fusion

When faced with complex queries, datasets, or user behavior, basic search algorithms aren’t always enough to render the results we need. Learning to Rank (LTR) is a machine learning technique in Apache Solr for improving search results based on user behavior.

Presented by Andy Liu, Senior Data Scientist, Lucidworks and Trey Grainger, Chief Algorithms Officer

  1. Learning to Rank with Apache Solr and Fusion. Trey Grainger, Chief Algorithms Officer; Andy Liu, Senior Data Scientist
  2. The Problem: Classic Similarity. Keyword search isn’t always enough for relevance.
  3. The Problem: User searches for “outdoor rock speaker”. Should see this: [image] Sees this: [image]
  4. The Problem • Improving search relevance is hard. • TF-IDF and BM25 are good for text keyword matching, but what about other models of relevance? • Text matching is sometimes not the best solution. • Users don’t always say what they mean.
  5. The Solution: Fusion 4 Signals + Solr 7 LTR
  6. The Solution: Learning to Rank Overview • Learning to Rank lets you pick the “features” of a document that “matter” and teach the machine how to rank a set of items. • One possible source of ordering is user behavior (e.g. the only clicks were on the speaker shaped like a rock). • Solr provides a Learning to Rank implementation. • Fusion provides a way of capturing user behavior through signals.
  7. The Solution: Learning to Rank Overview
  8. The Solution: Learning to Rank Overview
  9. The Solution: Fusion Signals Overview
  10. The Solution • Define features (relevancy factors) • Derive Ground Truth using Fusion’s signals • Use Solr’s Learning to Rank implementation
  11. Some notes • Fusion’s standard click boosting is a good alternative • The two can be used together, or one can be used where the other doesn’t work • Do the simpler things first; Learning to Rank without an adequate schema won’t accomplish much.
  12. Some notes • Using click signals for ground truth • Pros: • Voluminous • Cheap • Reflects a captive user’s intent (especially when supplemented with purchase and add-to-cart events) • Implicitly labeled data is the key to an out-of-the-box (OOTB) “self-learning” system • Cons: • Noisy • Potential for reinforcing the existing ranking
  13. Putting it together
  14. Building an LTR Pipeline
  15. …but is it better? • Models compared: • Solr out-of-the-box BM25 ranking using textual features only • Logistic Regression using all features except the signals feature • Logistic Regression using all features
  16. Why is it better? • Summary of Benefits: • LTR offers automated relevancy tuning • Using Fusion to implement LTR greatly reduces the time and complexity required to train and deploy LTR models in production • Leveraging Fusion’s signals as features in an LTR model offers an easy way of boosting search relevance performance beyond what is possible using textual features alone
  17. A/B and experiments • Do this carefully. • A/B testing is the safest way to make sure you don’t ruin different user experiences. • Stay tuned for a future webinar on Experiments and A/B testing
  18. Where to learn more? • Grab the technical paper (with step by step instructions): https://lucidworks.com/ebook/learning-to-rank/ • Grab the code: https://github.com/lucidworks/fusion-ltr-webinar#fusionsolr-setup
  19. Thank you
  20. Join Andy and Trey at Activate, the Search and AI Conference: September 9-12, 2019, Washington DC. Register by Sep 6 to save $200. Check out the site here: https://activate-conf.com/ • Productionizing Python ML Models Using Fusion 5, Sanket Shahane, Andy Liu • Natural Language Search with Knowledge Graphs, Trey Grainger • Closing Keynote: The Next Generation of AI-powered Search, Trey Grainger. AI, ML & Data Science Track: • Supporting Query Tagging/Suggestion in Fusion 4.2, Uber • Building a Health QA Chatbot with Solr, Healthwise Incorporated • Tackling a “Small Data” Search Challenge at Airbnb Experiences, Airbnb • Using Deep Learning and Customized Solr Components to Improve Search Relevancy at Target, Target
  21. Live Q&A: Enter your questions in the chat box now for Trey to answer live
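
Slides 10, 12 and 15 describe a pipeline: define features, derive ground truth from click signals, and train a logistic regression model to rank. The following is a minimal sketch of that idea in plain Python, not the presenters' code and not the Fusion or Solr API. The document IDs, click counts, feature values and the click-through-rate threshold are all invented for illustration.

```python
import math

# Hypothetical aggregated click signals for the query "outdoor rock speaker":
# (doc_id, impressions, clicks). All values are invented for illustration.
SIGNALS = [
    ("rock_shaped_speaker", 100, 40),
    ("rock_music_cd", 100, 2),
    ("garden_stone", 100, 1),
    ("patio_speaker", 100, 25),
]

# Hypothetical per-document features: [text_match_score, in_speaker_category].
FEATURES = {
    "rock_shaped_speaker": [0.6, 1.0],
    "rock_music_cd": [0.9, 0.0],
    "garden_stone": [0.7, 0.0],
    "patio_speaker": [0.5, 1.0],
}


def label_from_clicks(impressions, clicks, min_ctr=0.1):
    """Derive a binary relevance label from click-through rate (an assumed rule)."""
    return 1 if impressions > 0 and clicks / impressions >= min_ctr else 0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Probability that a document with feature vector x is relevant."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


def train_logistic_regression(rows, labels, learning_rate=0.5, epochs=2000):
    """Fit weights with plain stochastic gradient descent on log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            error = predict(weights, bias, x) - y
            weights = [w - learning_rate * error * v for w, v in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


doc_ids = [doc_id for doc_id, _, _ in SIGNALS]
labels = [label_from_clicks(imp, clk) for _, imp, clk in SIGNALS]
rows = [FEATURES[doc_id] for doc_id in doc_ids]

weights, bias = train_logistic_regression(rows, labels)

# Rerank: highest predicted relevance first.
ranked = sorted(doc_ids, key=lambda d: predict(weights, bias, FEATURES[d]), reverse=True)
print(ranked)
```

In this toy data the CD has the highest text-match score, so ranking on text alone would put it first; the model trained on click-derived labels moves the two speakers to the top instead. In a real deployment, feature extraction and reranking would be handled by Solr's Learning to Rank implementation rather than a script like this.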
