
Yelp Ad Targeting at Scale with Apache Spark with Inaz Alaei-Novin and Joe Malicki

From training billions of ad impressions to scaling gradient boosted trees with more than three million nodes, Ad Targeting at Yelp uses Apache Spark in many stages of its large-scale machine learning pipeline.

This session will explore examples of how Yelp employed and tweaked Spark to support big data feature engineering, visualizations and machine learning model training, evaluation and diagnostics. You’ll also hear about the challenges in building and deploying such a large-scale intelligent system in a production environment.



  1. 1. Joseph Malicki, Inaz Alaei-Novin Yelp Ad Targeting at Scale with Apache Spark 1
  2. 2. 2 Background - Yelp Ad Targeting Intro Model Training Tools Deployment to Production Wrap-up
  3. 3. About us • Joseph Malicki, Inaz Alaei-Novin • Data mining engineers at Yelp • Ad delivery team 3
  4. 4. Yelp’s Mission Connecting people with great local businesses. 4
  5. 5. Yelp Stats As of Q1 2017 • 127M monthly unique desktop users • 99M monthly unique mobile users • 76% of searches via mobile app 5
  6. 6. 6 Background - Yelp Ad Targeting Intro Model Training Tools Deployment to Production Wrap-up
  7. 7. Yelp Ads 7
  8. 8. Yelp Ad Targeting • Majority of Yelp ads are cost-per-click - Yelp only gets paid if user clicks on an ad • Native advertisements - Advertisers and content within Yelp platform 8
  9. 9. Cost-per-Click Ad Auction Maximize expected revenue: 1. Order by advertiser bid × predicted click-through-rate (pCTR) 2. Pay second price. Expected[Revenue] = Bid × Expected[CTR]. Because of the multiplication, predicted CTR must be well-calibrated, not only well-ordered 9
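The auction logic above can be sketched in plain Python. This is a simplified single-machine illustration; the generalized-second-price charge shown here (runner-up's score divided by the winner's pCTR) is one common formulation and an assumption, not necessarily Yelp's exact pricing rule:

```python
def rank_ads(candidates):
    """Order candidates by expected revenue per impression: bid * pCTR.

    candidates: list of (ad_id, bid, pctr) tuples.
    """
    return sorted(candidates, key=lambda c: c[1] * c[2], reverse=True)


def second_price(ranked):
    """Winner pays the minimum cost-per-click that would still win:
    the runner-up's score divided by the winner's pCTR (an assumed
    generalized-second-price formulation)."""
    if not ranked:
        return 0.0
    if len(ranked) < 2:
        return ranked[0][1]  # no competition: pay own bid (simplification)
    (_, _, winner_pctr), (_, bid2, pctr2) = ranked[0], ranked[1]
    return bid2 * pctr2 / winner_pctr
```

Note how a low bid with a high pCTR (ad "c" below) can outrank a high bid with a low pCTR, which is why pCTR must be well-calibrated and not merely well-ordered.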
  10. 10. Yelp Ad Targeting Ad Delivery Context • Each request is different • Context matters • Location • Search query • User attributes • etc. How to generalize? Use machine learning to estimate CTR and show relevant ads 10
  11. 11. 11 Background - Yelp Ad Targeting Intro Model Training Tools Deployment to Production Wrap-up
  12. 12. CTR Prediction System Overview (diagram: online ad delivery writes logs; offline training pipeline produces the model) 12
  13. 13. Offline Training at Yelp Ad Event Logs Sampling Feature Extraction Model Training Evaluating New Features 13
  14. 14. Ad Event Logs Sampling Feature Extraction Model Training Evaluating New Features JSON 14
  15. 15. Sampling Feature Extraction Model Training Evaluating New Features JSON Train Samples 15 Test Samples
  16. 16. Feature Extraction Model Training Evaluating New Features JSON Sampling is 4X faster with Spark! Sampling 16 Train Samples Test Samples
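The train/test sampling step might look like the following plain-Python sketch. The hash-based split is an assumption for illustration; in the real pipeline this runs as a Spark job over the JSON ad event logs, which is where the 4X speedup was measured:

```python
import hashlib


def assign_split(event_id, test_fraction=0.2):
    """Deterministically assign an ad event to train or test by hashing
    its id, so the same event lands in the same set across reruns.

    The hashing scheme is illustrative, not Yelp's actual one.
    """
    digest = int(hashlib.md5(event_id.encode("utf-8")).hexdigest(), 16)
    return "test" if (digest % 100) < test_fraction * 100 else "train"
```

A deterministic split keeps train and test samples disjoint and reproducible, which matters when the pipeline is rerun to evaluate new features against the status quo.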
  17. 17. Feature Extraction Model Training Evaluating New Features JSON Sampling 17 Train Samples Test Samples
  18. 18. Feature Extraction Model Training Evaluating New Features JSON Ad event id Number of features Label Feature indices and values 18 Train Samples Test Samples
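Each extracted sample carries an ad event id, a label, and sparse feature indices and values. Here is a minimal parser for a hypothetical libsvm-like line layout; the exact on-disk format is an assumption, not the one Yelp uses:

```python
def parse_sample(line):
    """Parse a sparse sample line of the assumed form:
    'event_id label idx:value idx:value ...'

    Returns (event_id, label, {feature_index: value}).
    """
    parts = line.split()
    event_id, label = parts[0], int(parts[1])
    features = {}
    for pair in parts[2:]:
        idx, value = pair.split(":")
        features[int(idx)] = float(value)
    return event_id, label, features
```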
  19. 19. Model Training Evaluating New Features JSON Model Training 19 Train Samples Test Samples
  20. 20. Evaluating New Features JSON Model Model Training 20 Train Samples Test Samples
  21. 21. Evaluation Compare MXE with status quo, and visualizations 21 New Features JSON Train Samples Test Samples Model
  22. 22. Feature Contribution in a Model 22 Feature contribution(i) = σᵢωᵢ (standard deviation × model coefficient)
  23. 23. Compare Feature Importance in Multiple Models 23 Feature contribution(i) = μᵢωᵢ (feature mean × model coefficient). Use colStats from pyspark.mllib.stat.Statistics to compute column summary statistics
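Both contribution formulas reduce to element-wise products of column statistics and model coefficients. A plain-Python sketch of that computation follows; in the pipeline, colStats supplies the column means and variances over a Spark RDD rather than a local list:

```python
import math


def feature_contributions(rows, coefficients):
    """Compute per-feature contributions from column statistics.

    rows: list of equal-length dense feature vectors.
    coefficients: model weights, one per feature.

    Returns (sigma_i * w_i, mu_i * w_i) per feature: the first for
    contribution within one model, the second for comparing models.
    """
    n = len(rows)
    d = len(coefficients)
    mean = [sum(row[i] for row in rows) / n for i in range(d)]
    variance = [sum((row[i] - mean[i]) ** 2 for row in rows) / n
                for i in range(d)]
    by_std = [math.sqrt(variance[i]) * coefficients[i] for i in range(d)]
    by_mean = [mean[i] * coefficients[i] for i in range(d)]
    return by_std, by_mean
```

Scaling the coefficient by the feature's spread (or mean) makes contributions comparable across features measured in different units, which a raw coefficient is not.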
  24. 24. Compare Feature Contributions in Models 24 Compare feature contribution in 2 models: - How much would status quo MXE change if we change the coefficient of one feature from status quo to challenger?
  25. 25. New Features JSON New Features 25 Train Samples Test Samples Model
  26. 26. New Features JSON 26 Train Samples Test Samples Model
  27. 27. Visualizations 27 Use RDD’s Histogram method and some RDD mappings to generate the plots
  28. 28. Visualizations Pair of features against oCTR or ad events 28 Use the RDD histogram method and some RDD mappings to generate the plots
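Spark's RDD histogram method returns evenly spaced bucket edges and per-bucket counts. The plain-Python stand-in below mimics that behavior for illustration, without a Spark cluster:

```python
def histogram(values, num_buckets):
    """Mimic RDD.histogram(num_buckets): evenly spaced buckets between
    min and max, returning (bucket_edges, counts). The last bucket is
    right-inclusive so the maximum value is counted."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_buckets
    edges = [lo + i * width for i in range(num_buckets + 1)]
    counts = [0] * num_buckets
    for v in values:
        index = min(int((v - lo) / width), num_buckets - 1)
        counts[index] += 1
    return edges, counts
```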
  29. 29. Training Pipeline JSON 29 Train Samples Test Samples Model
  30. 30. 30 Background - Yelp Ad Targeting Intro Model Training Tools Deployment to Production Wrap-up
  31. 31. Spark related tools • Zeppelin Notebook • mrjob 31
  32. 32. Zeppelin Notebook • Web-based notebook • Interactive data analytics • Supports multiple languages • Supports Spark • At Yelp we use it for: ad-hoc analysis, testing new training algorithms, debugging 32
  33. 33. mrjob • One of Yelp’s contributions to open source! • Lets you write multi-step MapReduce jobs in Python • Test on your local machine • Run on a Hadoop cluster • Run in the cloud using EMR • Run in the cloud using Google Cloud Dataproc • Easily run Spark jobs on EMR or your own Hadoop cluster 33
  34. 34. 34 Background - Yelp Ad Targeting Intro Model Training Tools Deployment to Production Wrap-up
  35. 35. Production Concerns Offline Batch • Overnight or developer-initiated jobs • Millions to billions of datapoints • Batch-oriented (hours) • Apache Spark. Online Ad Serving • User hits button on app, needs quick response • Smaller number of locally and contextually relevant candidates • Real-time (milliseconds) • Java servlet. Shared code (libraries) between the two 35
  36. 36. Monitoring • If CTR prediction model stops being accurate, could lead to loss of revenue • How do we know models are working properly? • Need to check model predictions are accurate over time 36
  37. 37. Monitoring • Large batch jobs check actual user ad click-through-rate against predicted CTR • Model accuracy is far more sensitive than overall metrics: traffic mix is accounted for • Spark Streaming allows real-time alerts - A practical approach to building a streaming processing pipeline for an online advertising platform - Spark Summit 2017 37
  38. 38. Monitoring - Examples • Misspelled header in an API call refactor • Change in HTTPS caching behavior affects CTR (plots: Normal vs. Problem) 38
  39. 39. Monitoring - Calibration Plot • Recall the ad auction orders by advertiser bid × predicted click-through-rate (pCTR) • Because of the multiplication, predicted probabilities need to be well-calibrated • Goal: P(clicked | pCTR = y) = y 39
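A calibration check buckets events by predicted CTR and compares the observed click rate per bucket; a calibrated model gives observed ≈ predicted in every bucket. Here is a single-machine sketch of that logic (the even bucketing scheme is an assumption; the production check runs as a batch job):

```python
def calibration_table(predictions, clicks, num_buckets=10):
    """Bucket events by predicted CTR in [0, 1) and compare observed CTR.

    predictions: predicted click probabilities.
    clicks: 0/1 click outcomes, aligned with predictions.

    Returns (count, mean_predicted_ctr, observed_ctr) per non-empty
    bucket; a well-calibrated model has the last two roughly equal.
    """
    buckets = [[0, 0.0, 0.0] for _ in range(num_buckets)]
    for p, clicked in zip(predictions, clicks):
        index = min(int(p * num_buckets), num_buckets - 1)
        buckets[index][0] += 1
        buckets[index][1] += p
        buckets[index][2] += clicked
    return [(n, pred_sum / n, click_sum / n)
            for n, pred_sum, click_sum in buckets if n]
```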
  40. 40. Monitoring - Calibration Plot (plots: fraction of ads vs. predicted CTR, and observed minus predicted CTR) 40
  41. 41. Monitoring - Calibration Plot • Logistic regression loss is a proper scoring rule - Generates models that are well-calibrated on average • Feature engineering problems can cause poor calibration • Probability distribution drifting over time will cause loss of calibration - e.g. changes to user interface affecting behavior 41
  42. 42. 42 Background - Yelp Ad Targeting Intro Model Training Tools and Visualizations Deployment to Production Wrap-up
  43. 43. Spark at Yelp • Spark increasingly used throughout Yelp • Streaming • Iteration • Easy specification of job flows • Want to work with Spark? We’re hiring - stop by Yelp booth in exhibition area, until 4:30pm 43
  44. 44. www.yelp.com/careers/ We're Hiring! 44
  45. 45. @YelpEngineering fb.com/YelpEngineers engineeringblog.yelp.com github.com/yelp 45
  46. 46. Thank You. Questions? 46
