Boosting Ad Revenue Using Reinforcement Learning (Robin Schuil Technology Stream)

Lviv IT Arena is a conference for programmers, designers, developers, top managers, investors, entrepreneurs, and startup founders. It takes place annually at the beginning of October at the Arena Lviv stadium in Lviv. In 2016 the conference gathered more than 1,800 participants and over 100 speakers from companies such as Microsoft, Philips, Twitter, Uber, and IBM. More details about the conference are available at itarena.lviv.ua.

Boosting Ad Revenue Using Reinforcement Learning (Robin Schuil Technology Stream)

  1. @schuilr
  2. Case Study: Marktplaats.nl
  3. Marktplaats.nl
      • Largest classifieds site in the Netherlands
      • One of the most visited websites in the Netherlands
      • Founded in 1999, acquired by eBay in 2004
      • Now headquarters of eBay Classifieds Group: 12 brands in 17 countries
  4. Facts & Figures
      • 1.3 million visitors / day (desktop: 34%, mobile: 49%, tablet: 18%)
      • 9 million live listings (350,000 new items / day)
      • 6 million unique search requests / day (70 searches per second on average)
  5. Data & Trends @ Marktplaats
  6. Seasonal trends: winter sports! [Chart: demand ("Vraag") as a percentage per week of the year for the keywords "skibroek" (ski pants), "ski", "skipak" (ski suit), and "snowboard"]
  7. Seasonal trends: camping! [Chart: demand as a percentage per week of the year for the keywords "caravans", "campers", and "vouwwagen" (folding camper)]
  8. Seasonal trends: Saint Nicholas & Christmas! [Chart: demand as a percentage per week of the year for the keywords "sinterklaas" and "kerst" (Christmas)]
  9. Weather, temperature, etc.: fly curtains! [Chart: weekly demand for the keyword "vliegengordijn" (fly curtain) plotted together with temperature ("Temperatuur")]
  10. Weather, temperature, etc.: heaters! [Chart: weekly demand for the keyword "kachel" (heater) plotted together with temperature; the relationship is reversed]
  11. Special events: orange (“oranje”)! [Chart: demand as a percentage per week of the year for the keyword "oranje", annotated with King’s Day and the football World Cup]
  12. During a football game [Chart: time series from 20:45 to 23:15 comparing "Last Friday" with "This Friday", annotated with kick-off, the goals (1-0, 1-1, 1-2, 1-3, 1-4, 1-5), the break, and the end of the match]
  13. “Juichpakken” (cheering suits) [Chart: demand as a percentage per week of the year for the keywords "roy donders" and "juichpak"]
  14. Exploiting trends
  15. “Nieuw & populair” (“New & popular”)
      • “Nieuw & populair” = trending products
      • Pay-per-click advertising model
      • Advertisers bid for clicks, similar to Google AdWords
      • Metric to optimize: Revenue Per Mille (RPM) = CTR × bid × 1,000
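The RPM metric on this slide can be worked through in a few lines. This is a minimal sketch, assuming CTR is measured as clicks divided by impressions; the function name, the keywords' click counts, and the bids are made up for illustration and do not come from the talk.

```python
def rpm(clicks: int, impressions: int, bid: float) -> float:
    """Revenue per mille: CTR * bid * 1,000, with CTR = clicks / impressions."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    ctr = clicks / impressions
    return ctr * bid * 1000


# Hypothetical candidates: keyword -> (clicks, impressions, bid per click).
candidates = {
    "ski": (30, 1000, 0.20),
    "caravans": (12, 1000, 0.80),
}

# Rank keywords by RPM, highest first.
ranked = sorted(candidates, key=lambda k: rpm(*candidates[k]), reverse=True)
print(ranked)
```

In this made-up example the keyword with the lower CTR still ranks first because its bid is higher, which is why the metric combines both factors.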
  16. First (minimal) version
      • Find top 100 “trending” keywords using Spark
      • Randomly pick one of those keywords
      • Display top 4 results for the selected keyword
  17. Can we do better?
      • CTR and bid vary per keyword, so random selection gives only average performance
      • Random selection doesn’t consider the user’s personal preferences
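To see why random selection gives only average performance, compare the expected RPM of a uniform random pick with the RPM of the best keyword. The RPM values below are invented for illustration; only the reasoning follows the slide.

```python
from statistics import mean

# Hypothetical RPM per trending keyword (made-up numbers).
keyword_rpm = {"ski": 6.0, "caravans": 9.6, "kachel": 2.4}

# A uniform random pick earns the average RPM in expectation.
expected_random = mean(keyword_rpm.values())

# Always showing the best keyword would earn the maximum RPM.
best_keyword = max(keyword_rpm, key=keyword_rpm.get)
best_rpm = keyword_rpm[best_keyword]

print(f"random: {expected_random:.1f}, best ({best_keyword}): {best_rpm:.1f}")
```

The catch is that the per-keyword RPM is not known up front, which is what the bandit approaches on the following slides try to learn while earning.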
  18. Part I: Global Optimization
  19. One-armed bandit = slot machine. Problem: how do you pick between slot machines so that you maximize profit?
  20. Exploration – Exploitation
      • Explore (learn): try out different candidates to learn how they perform over time
      • Exploit (earn): take advantage of what you’ve learned to maximize payoff (your current best guess)
  21. Many different approaches
      • Epsilon First
      • Epsilon Greedy
      • Upper Confidence Bound
      • Thompson Sampling
      • LinUCB
  22. Epsilon First [Diagram: timeline split into a learning phase followed by an earning phase]
      • Learn: show random candidates to collect data for each candidate (split testing, A/B testing)
      • Earn: show the best performer
  23. Epsilon First
      • Simple and intuitive
      • Lots of tools available (VWO, Optimizely, …)
      • Only average reward until exploration is finished
      • What if the best candidate is no longer the best?
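The two phases of Epsilon First can be sketched as a small simulation. This is an illustrative sketch only: the function name, the simulated click model, and the CTR values are assumptions, not the speaker's implementation.

```python
import random


def epsilon_first(true_ctrs, total_rounds, explore_rounds, rng):
    """Explore uniformly at random for `explore_rounds`, then commit to the best arm."""
    n = len(true_ctrs)
    clicks = [0] * n
    shows = [0] * n
    choices = []
    best = None
    for t in range(total_rounds):
        if t < explore_rounds:
            arm = rng.randrange(n)  # learning phase: pure random selection
        else:
            if best is None:
                # Commit once, based on the observed click-through rate.
                best = max(
                    range(n),
                    key=lambda a: clicks[a] / shows[a] if shows[a] else 0.0,
                )
            arm = best  # earning phase: always the same arm
        reward = 1 if rng.random() < true_ctrs[arm] else 0  # simulated click
        shows[arm] += 1
        clicks[arm] += reward
        choices.append(arm)
    return choices


choices = epsilon_first([0.039, 0.033, 0.028], 2000, 500, random.Random(0))
print("arm shown after exploration:", choices[-1])
```

Because the choice is frozen after the learning phase, the sketch also shows the weakness named on the slide: if the committed arm stops being the best, nothing corrects it.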
  24. Epsilon Greedy [Diagram: timeline] Show the best performer 90% of the time and a random candidate 10% of the time (continuous exploration)
  25. Epsilon Greedy
      • Very simple to implement and surprisingly effective
      • Can deal with nonstationary problems
      • How to determine the optimal value for ε?
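A minimal Epsilon Greedy sketch, assuming ε = 0.1 as in the 90%/10% split on the slide. The class and method names are hypothetical, and the value estimate is a plain running mean.

```python
import random


class EpsilonGreedy:
    """Show the best-known arm most of the time; explore at random with probability epsilon."""

    def __init__(self, n_arms, epsilon=0.1, rng=None):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # how often each arm was shown
        self.values = [0.0] * n_arms  # running mean reward per arm
        self.rng = rng or random.Random()

    def select(self):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))  # explore
        return max(range(len(self.values)), key=self.values.__getitem__)  # exploit

    def update(self, arm, reward):
        self.counts[arm] += 1
        # Incremental mean: new_mean = old_mean + (reward - old_mean) / n
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


bandit = EpsilonGreedy(n_arms=3, epsilon=0.1, rng=random.Random(0))
arm = bandit.select()
bandit.update(arm, reward=1.0)
```

For strongly nonstationary rewards, one common variation is to replace `1 / counts[arm]` with a small constant step size so that recent observations weigh more; that is a design choice, not something stated in the talk.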
  26. Upper Confidence Bound. Basic idea:
      • Calculate mean and a measure of uncertainty (variance) for each candidate
      • Pick current best performer based on mean + uncertainty bonus
  27. Measuring uncertainty. Observed mean: 0.50; 95% certain that true mean ≤ 0.76; uncertainty bonus: 0.26
  28. More data = less uncertainty. 95% certain that true mean ≤ 0.63; uncertainty bonus: 0.13
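The slides do not say how the 95% bound was computed or how many observations were behind it. As one possible illustration, the sketch below assumes a one-sided normal approximation for a proportion; the sample sizes 10 and 40 are assumptions chosen because they give bonuses close to the 0.26 and 0.13 shown on the slides.

```python
from math import sqrt
from statistics import NormalDist


def uncertainty_bonus(mean, n, confidence=0.95):
    """One-sided upper-bound margin for a proportion (normal approximation)."""
    z = NormalDist().inv_cdf(confidence)  # about 1.645 for 95%
    return z * sqrt(mean * (1.0 - mean) / n)


# Assumed sample sizes; the slides only give the resulting bonuses.
for n in (10, 40):
    bonus = uncertainty_bonus(0.50, n)
    print(f"n={n}: bonus={bonus:.2f}, upper bound={0.50 + bonus:.2f}")
```

Under this approximation the bonus shrinks with the square root of the number of observations, so four times as much data halves the bonus.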
  29. Upper Confidence Bound: mean + uncertainty bonus [Chart: estimated reward for candidates A, B, and C] Pick “A”!
  30. Upper Confidence Bound [Chart: estimated reward for candidates A, B, and C] Pick “C”!
      • Selecting “A” reduces uncertainty
      • Candidate “C” now has the highest score
  31. Upper Confidence Bound
      • Uses variance measure to automatically balance exploration with exploitation
      • Deterministic; requires online learning (not suited for small-batch mode)
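The talk does not name a specific UCB variant. The sketch below uses UCB1, a common textbook variant whose bonus is sqrt(2 ln t / n) rather than the variance-based bonus described on the slides; the class and method names are hypothetical.

```python
import math


class UCB1:
    """Pick the arm with the highest mean + uncertainty bonus (UCB1 variant)."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms
        self.means = [0.0] * n_arms

    def select(self):
        # Show every arm once before the bonus formula is defined.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        total = sum(self.counts)
        scores = [
            mean + math.sqrt(2.0 * math.log(total) / count)
            for mean, count in zip(self.means, self.counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


ucb = UCB1(n_arms=3)
for _ in range(3):
    ucb.update(ucb.select(), reward=0.0)
```

Selection involves no randomness, so the same statistics always yield the same arm. This matches the "deterministic" remark on the slide: if updates arrive only in batches, every request within a batch gets the same arm.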
  32. Thompson Sampling. Basic idea:
      • The number of pulls for a given lever should match its actual probability of being the optimal lever
      • Sample from the posterior for the mean of each lever: p(λ|X) = Gamma(conv + prior_conv, impr + prior_impr)
  33. Few conversions
      • A (3.9%): 11 conversions, 282 impressions, 42% chance of being winner
      • B (3.3%): 2 conversions, 61 impressions, 39% chance of being winner
      • C (2.8%): 4 conversions, 143 impressions, 19% chance of being winner
  34. More conversions
      • A (3.9%): 93 conversions, 2,382 impressions, 82% chance of being winner
      • B (3.3%): 66 conversions, 2,011 impressions, 13% chance of being winner
      • C (2.8%): 31 conversions, 1,093 impressions, 5% chance of being winner
  35. Many conversions
      • A (3.9%): 892 conversions, 22,882 impressions, 97% chance of being winner
      • B (3.3%): 174 conversions, 5,261 impressions, 2% chance of being winner
      • C (2.8%): 66 conversions, 2,343 impressions, 1% chance of being winner
  36. Lots of conversions
      • A (3.9%): 5,621 conversions, 144,132 impressions, > 99% chance of being winner
      • B (3.3%): 256 conversions, 7,761 impressions, < 1% chance of being winner
      • C (2.8%): 101 conversions, 3,593 impressions, < 1% chance of being winner
  37. Thompson Sampling
      • Weighted random sampling
      • Works well in small-batch mode
      • Doesn’t consider context (e.g. user’s personal preferences)
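The Gamma posterior from the "Basic idea" slide can be sampled with Python's standard library. This is a sketch under assumptions: the prior values (1 conversion, 1 impression) are invented, the second Gamma parameter is read as a rate, and the function names are hypothetical. It is not expected to reproduce the exact percentages on the slides, since the speaker's priors are not given.

```python
import random


def sample_rate(conv, impr, prior_conv=1.0, prior_impr=1.0, rng=random):
    """Draw one conversion rate from Gamma(conv + prior_conv, rate = impr + prior_impr)."""
    # random.gammavariate takes (shape, scale), so the rate is inverted to a scale.
    return rng.gammavariate(conv + prior_conv, 1.0 / (impr + prior_impr))


def thompson_select(stats, rng=random):
    """Sample once per candidate and show the candidate with the highest draw."""
    draws = {name: sample_rate(c, n, rng=rng) for name, (c, n) in stats.items()}
    return max(draws, key=draws.get)


def chance_of_winning(stats, n_draws=5000, rng=random):
    """Monte Carlo estimate of how often each candidate has the highest sampled rate."""
    wins = dict.fromkeys(stats, 0)
    for _ in range(n_draws):
        wins[thompson_select(stats, rng)] += 1
    return {name: w / n_draws for name, w in wins.items()}


# (conversions, impressions) from the "Few conversions" slide.
few = {"A": (11, 282), "B": (2, 61), "C": (4, 143)}
print(chance_of_winning(few, rng=random.Random(0)))
```

Because each request draws fresh samples, traffic is split roughly in proportion to each candidate's chance of being the winner, which is why the method keeps working when statistics are refreshed only in batches.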
  38. Part II: Personalization
  39. LinUCB. Basic idea:
      • Define a “context” of information about the user
      • Fit a per-candidate logistic regression model
      • Apply the concept of Upper Confidence Bound (UCB): mean + uncertainty bonus
  40. Context
      • Gender
      • Recently viewed categories
      • Current date
      • Weather forecast
      • …
      Principal Component Analysis (PCA) is used to reduce sparsity and computational complexity
  41. LinUCB. Mean + uncertainty bonus: μα(t) + σα(t)
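The sketch below illustrates the per-candidate "mean + uncertainty bonus" score using the commonly published form of LinUCB, which fits a ridge (linear) regression per arm. That is a swap from the logistic regression mentioned on the slide, made to keep the example short and dependency-free. The class name, the `alpha` parameter, and the two-feature context are assumptions.

```python
import math


class LinUCBArm:
    """One candidate: ridge regression on the context plus an uncertainty bonus."""

    def __init__(self, dim, alpha=1.0):
        self.alpha = alpha  # width of the uncertainty bonus
        # Inverse of A = I + sum(x x^T), kept up to date incrementally.
        self.a_inv = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
        self.b = [0.0] * dim  # sum(reward * x)

    def _a_inv_times(self, v):
        return [sum(row[j] * v[j] for j in range(len(v))) for row in self.a_inv]

    def score(self, x):
        theta = self._a_inv_times(self.b)  # ridge regression weights
        mean = sum(t * xi for t, xi in zip(theta, x))
        a_inv_x = self._a_inv_times(x)
        variance = max(0.0, sum(xi * v for xi, v in zip(x, a_inv_x)))
        return mean + self.alpha * math.sqrt(variance)

    def update(self, x, reward):
        # Sherman-Morrison rank-one update of the inverse (A is symmetric).
        a_inv_x = self._a_inv_times(x)
        denom = 1.0 + sum(xi * v for xi, v in zip(x, a_inv_x))
        for i in range(len(x)):
            for j in range(len(x)):
                self.a_inv[i][j] -= a_inv_x[i] * a_inv_x[j] / denom
        self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]


def choose_arm(arms, x):
    """Return the index of the arm with the highest upper confidence bound."""
    return max(range(len(arms)), key=lambda i: arms[i].score(x))


arms = [LinUCBArm(dim=2) for _ in range(3)]
context = [1.0, 0.0]  # hypothetical, already reduced context vector
chosen = choose_arm(arms, context)
arms[chosen].update(context, reward=1.0)
```

The bonus depends on the context: an arm that has been shown often to similar users gets a small bonus for them but keeps a large bonus for users it has rarely seen.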
  42. Pruning
      • Periodically remove weakest performers
      • Replace with new, unexplored “trending keywords”
      • Rinse and repeat
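One pruning pass could look like the sketch below. The function name, the `min_impressions` guard, and the use of observed CTR as the performance measure are assumptions; the talk does not describe how the weakest performers are identified.

```python
def prune_and_refill(stats, fresh_keywords, n_drop, min_impressions=100):
    """Drop the weakest explored keywords and add unexplored ones in their place.

    stats maps keyword -> (clicks, impressions).
    """
    threshold = max(1, min_impressions)  # also avoids division by zero
    # Only keywords with enough data are eligible for removal.
    explored = [k for k, (_, shown) in stats.items() if shown >= threshold]
    weakest = sorted(explored, key=lambda k: stats[k][0] / stats[k][1])[:n_drop]

    kept = {k: v for k, v in stats.items() if k not in weakest}
    # Refill with keywords that have not been tried yet.
    newcomers = [k for k in fresh_keywords if k not in stats][: len(weakest)]
    for keyword in newcomers:
        kept[keyword] = (0, 0)
    return kept


# Hypothetical statistics for three keywords.
stats = {"ski": (30, 1000), "kachel": (5, 1000), "oranje": (1, 10)}
stats = prune_and_refill(stats, ["juichpak", "ski"], n_drop=1)
print(sorted(stats))
```

Keywords below the impression threshold are left alone so that a candidate is not discarded before it has had a fair number of impressions.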
  43. Results [Chart: random vs. optimized selection] Optimized: × 2.8 compared with random!
  44. Endless possibilities
      • News homepage
      • Online advertising
      • Deciding which thumbnail to show on the search engine results page (SERP)
      • Etc.
  45. Reading List
      • “Bandit Algorithms for Website Optimization”: http://bit.ly/bandits-book
      • “Reinforcement Learning”: http://bit.ly/rl-book
  46. Thank you! @SCHUILR, LINKEDIN.COM/IN/ROBINSCHUIL
  47. References
      • https://en.wikipedia.org/wiki/Multi-armed_bandit
      • http://shop.oreilly.com/product/0636920027393.do
      • https://webdocs.cs.ualberta.ca/~sutton/book/the-book.html
      • http://www.slideshare.net/chucheng/efficient-approximate-thompson-sampling-for-search-query-recommendation
      • http://www.slideshare.net/iliasfl/multiarmed-bandits-intro-examples-and-tricks
      • http://www.slideshare.net/mgershoff/conductrics-bandit-basicsemetrics1016
      • http://www.slideshare.net/MarkusOjala1/multi-armed-bandits-and-optimized-online-marketing-54679491
