Haystack 2019 - Evolution of Yelp search to a generalized ranking platform - Umesh Dangat

Elasticsearch forms the backbone of Yelp's core search.

The Learning to Rank Elasticsearch plugin is one of the key tools that has transformed the Yelp Search team from serving linear ranking models only on the search page to powering a business ranking platform that serves all business recommendation applications across Yelp.

This talk will detail how Yelp's search engineers enhanced the LTR plugin so that it would not only solve Yelp's current search needs but also enable future ranking use cases at Yelp.

  1. Yelp Search and Learning to Rank
     Umesh Dangat
  2. Yelp’s Mission
     Connecting people with great local businesses.
  3. Yelp by the Numbers
     ● Our users have written more than 177 million reviews by the end of Q4 2018
     ● Monthly average of unique visitors who visited Yelp in Q4 2018
       ○ 33 million via the Yelp app and
       ○ 69 million via mobile web
     ● Billions of queries served per year
  4. Yelp Core Search
  5. Yelp Core Search SLAs
     ● Search query latencies: two-digit milliseconds
     ● Real-time indexing: no more than a few seconds of delay for indexed data to be searchable
  6. Search Backend
  7. Yelp Custom Scoring
     ● Documents are recalled/retrieved as per the “filters” in the query
     ● Each document is scored using a heuristic
     ● Typical heuristics are
       ○ Document features
       ○ Query features
       ○ Some derivatives of the two above
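
The recall-then-score flow on slide 7 can be sketched in a few lines of Python. This is a toy illustration only, not Yelp's scoring code: the `Business` fields, the weights, and the specific heuristic are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Business:
    # Hypothetical document shape, invented for this sketch.
    name: str
    category: str
    rating: float


def heuristic_score(doc: Business, query_terms: list) -> float:
    """Combine a query feature, a document feature, and a derivative of both."""
    name_tokens = set(doc.name.lower().split())
    # Query feature: fraction of query terms that appear in the business name.
    match = (
        sum(t in name_tokens for t in query_terms) / len(query_terms)
        if query_terms
        else 0.0
    )
    # Document feature: rating normalized to [0, 1].
    quality = doc.rating / 5.0
    # Derived feature: reward documents that both match and are highly rated.
    derived = match * quality
    # The weights are arbitrary placeholders.
    return 2.0 * match + quality + derived


def search(docs: list, query: str, category: str) -> list:
    terms = query.lower().split()
    # Recall: only documents passing the filter are scored at all.
    recalled = [d for d in docs if d.category == category]
    # Score and order the recalled documents.
    return sorted(recalled, key=lambda d: heuristic_score(d, terms), reverse=True)
```

Filtering first keeps the scoring function from running on documents that could never be returned, which matters when the latency budget is tens of milliseconds.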
  8. ML and search scoring
     ● Search relevance engineers train models “offline”
       ○ Iterate on models using data for logged feature score components
     ● Serialized format of model is deployed “online”
       ○ Elasticsearch uses this model at query time to rank documents
  9. Scoring Before LTR
  10. Issues with this approach
      ● Scoring logic API leak
        ○ No single point of contained relevance logic
        ○ Hard to understand and iterate
  11. Issues with this approach
      ● Issues with code pushes
        ○ Not always rollback safe
        ○ Easy to push code with missing/new features
      ● Difficult to extend for other types of models
      ● Size of queries gets longer and ser-de becomes more expensive
      ● More teams at Yelp wanting to solve similar ranking problems backed by Elasticsearch
  12. Looking for an alternative
      ● Solution should scale to more than one specific team/use case
      ● Decouple model and feature training from online deployment
      ● Should allow for iterations without Elasticsearch cluster restarts.
      ● Hosted model server aka scoring outside Elasticsearch was ruled out due to latency constraints
      ● Allow for tiered scoring
  13. This plugin:
      ● Allows you to store features (Elasticsearch query templates) in Elasticsearch
      ● Logs feature scores (relevance scores) to create a training set for offline model development
      ● Stores linear, xgboost, or ranklib ranking models in Elasticsearch that use features you've stored
      ● Ranks search results using a stored model
      Elasticsearch Learning to Rank Plugin
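
To make the plugin's capabilities concrete, the sketch below builds the kind of JSON request bodies the open-source Elasticsearch Learning to Rank plugin accepts, based on its public documentation. No requests are sent. The feature set name `business_features`, the model name `business_linear_model`, the field names `title` and `rating`, and the window size are assumptions chosen for illustration; check the endpoints and body shapes against the plugin version you run.

```python
import json

# Assumed endpoint: POST _ltr/_featureset/business_features
# Each feature is a templated Elasticsearch query stored under a name.
featureset_body = {
    "featureset": {
        "features": [
            {
                "name": "title_query",
                "params": ["keywords"],
                "template_language": "mustache",
                "template": {"match": {"title": "{{keywords}}"}},
            },
            {
                "name": "user_rating",
                "params": [],
                "template_language": "mustache",
                # Hypothetical numeric field "rating" on the document.
                "template": {
                    "function_score": {
                        "field_value_factor": {"field": "rating", "missing": 0}
                    }
                },
            },
        ]
    }
}

# Assumed endpoint: POST _ltr/_featureset/business_features/_createmodel
# A linear model definition is a JSON string mapping feature name to weight.
model_body = {
    "model": {
        "name": "business_linear_model",
        "model": {
            "type": "model/linear",
            "definition": json.dumps({"title_query": 0.3, "user_rating": 0.5}),
        },
    }
}

# Search request: a plain match query recalls documents, then the stored
# model re-ranks the top of the result list through the `sltr` query.
search_body = {
    "query": {"match": {"title": "pizza"}},
    "rescore": {
        "window_size": 100,
        "query": {
            "rescore_query": {
                "sltr": {
                    "params": {"keywords": "pizza"},
                    "model": "business_linear_model",
                }
            }
        },
    },
}

if __name__ == "__main__":
    for body in (featureset_body, model_body, search_body):
        print(json.dumps(body, indent=2))
```

Because features and models live in Elasticsearch as data, a new model can be uploaded and referenced by name without changing the calling code.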
  14. Looking for an alternative
      ● Solution should scale to more than one specific team/use case (yes)
      ● Decouple model and feature training from online deployment (maybe)
      ● Should allow for iterations without Elasticsearch cluster restarts. (yes generally speaking)
      ● Hosted model server aka scoring outside Elasticsearch was ruled out due to latency constraints (NA)
      ● Allow for tiered scoring (yes)
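
The slides do not define tiered scoring in detail. A common reading is that a cheap scorer orders every recalled document and a more expensive model re-ranks only the top window. The sketch below assumes that interpretation; `tiered_rank` and its parameters are hypothetical names, not part of any Yelp or Elasticsearch API.

```python
from typing import Callable


def tiered_rank(
    docs: list,
    cheap_score: Callable[[dict], float],
    expensive_score: Callable[[dict], float],
    window_size: int,
) -> list:
    """Two-tier ranking: cheap scorer on everything, costly scorer on the head."""
    # Tier 1: order all recalled documents with the inexpensive scorer.
    first_pass = sorted(docs, key=cheap_score, reverse=True)
    head, tail = first_pass[:window_size], first_pass[window_size:]
    # Tier 2: re-rank only the top window with the expensive scorer.
    head = sorted(head, key=expensive_score, reverse=True)
    # Documents outside the window keep their tier-1 order.
    return head + tail
```

The window size bounds how many times the expensive model runs per query, which is how this pattern stays within a fixed latency budget.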
  15. Scoring with LTR
  16. Learning to Rank Plugin and Yelp
      ● Yelp had to make some changes to the then existing LTR plugin functionality in order to make it workable for our use cases.
      ● Let us look at a couple of the most important ones.
  17. Learning to Rank Plugin and Yelp
      ● Selective feature selection
      A linear model for these features might resemble:
      { "title_query" : 0.3, "user_rating" : 0.5 }
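
The slide gives only the model weights, so the following is one possible reading of "selective feature selection": a stored feature set may contain more features than a given model references, and only the referenced ones need to be evaluated. The feature functions, the extra `review_count` feature, and the document shape are assumptions for this sketch; only the two weights come from the slide.

```python
# Hypothetical feature set: name -> function of (document, query).
FEATURE_SET = {
    "title_query": lambda doc, q: float(q.lower() in doc["title"].lower()),
    "user_rating": lambda doc, q: float(doc["rating"]),
    # Present in the feature set but not used by the model below.
    "review_count": lambda doc, q: float(doc["review_count"]),
}

# Weights taken from the slide's linear model example.
LINEAR_MODEL = {"title_query": 0.3, "user_rating": 0.5}


def score(doc: dict, query: str, model: dict, feature_set: dict):
    """Weighted sum over only the features the model references."""
    evaluated = []
    total = 0.0
    for name, weight in model.items():
        value = feature_set[name](doc, query)
        evaluated.append(name)
        total += weight * value
    return total, evaluated
```

For a document titled "Best Pizza" with a rating of 4.0 and the query "pizza", the score is 0.3 * 1.0 + 0.5 * 4.0, and `review_count` is never computed.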
  18. Learning to Rank Plugin and Yelp
      ● Selective feature selection
  19. Learning to Rank Plugin and Yelp
      ● Passing feature vector between LTR and native java plugins so that features do not have to be recomputed
      ● Consider a scenario where you have one base feature potentially used as a seed value in multiple derived features
      ● Example
        ○ Base feature: document field value look up e.g. rating
        ○ Derived features: derived computation
          ■ feature A: log(rating) + log(word_score)
          ■ feature B: log(rating) + log(popularity)
      ● In the above example we don’t want to end up re-computing the rating multiple times.
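
The base/derived feature example on slide 19 can be illustrated with a small per-document cache. This shows the general idea of looking up a base feature once and reusing it; it is not the mechanism Yelp built, which passes a feature vector between the LTR plugin and native Java plugins. `FeatureContext` is a hypothetical helper invented for the sketch.

```python
import math


class FeatureContext:
    """Per-document cache so each base feature is looked up at most once."""

    def __init__(self, doc: dict):
        self.doc = doc
        self._cache = {}
        self.lookups = 0  # counts real field lookups, for demonstration

    def base(self, field: str) -> float:
        if field not in self._cache:
            self.lookups += 1
            self._cache[field] = float(self.doc[field])
        return self._cache[field]


def feature_a(ctx: FeatureContext) -> float:
    # feature A: log(rating) + log(word_score)
    return math.log(ctx.base("rating")) + math.log(ctx.base("word_score"))


def feature_b(ctx: FeatureContext) -> float:
    # feature B: log(rating) + log(popularity)
    return math.log(ctx.base("rating")) + math.log(ctx.base("popularity"))
```

Computing both derived features through the same context reads `rating` from the document once, so two derived features over three distinct base fields cost three lookups instead of four.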
  20. Learning to Rank Plugin and Yelp
  21. LTR @Yelp today
      ● Enabled Yelp search to do tiered scoring
      ● Newer ranking use cases at Yelp solved using LTR
        ○ E.g. painless scripts, ES query features, custom native plugins.
  22. LTR @Yelp challenges
      ● Decouple model and feature training from online deployment (maybe)
      ● Potential to solve for NeuralNetwork, vector embeddings.
  23. Thank you
      ● Doug Turnbull for being accessible to answer our questions!
      ● David Causse for all the code reviews!
  24. www.yelp.com/careers/
      We're Hiring!
  25. @YelpEngineering
      fb.com/YelpEngineers
      engineeringblog.yelp.com
      github.com/yelp
  26. Questions?
  27. Thank you.