An Adaptive Proportional Value-per-Click Agent for Bidding in Ad Auctions


  1. An Adaptive Proportional Value-per-Click Agent for Bidding in Ad Auctions
     Trading Agent Design and Analysis Workshop 2011
     Kyriakos C. Chatzidimitriou (AUTH/CERTH)
     Lampros C. Stavrogiannis (Univ. of Southampton)
     Andreas L. Symeonidis (AUTH/CERTH)
     Pericles A. Mitkas (AUTH/CERTH)
  2. Introduction
     • Basic idea: working paper by Dr. Yevgeniy Vorobeychik on his QuakTAC 2009 entry
     • Since this initial work, we have:
       – Conducted more game-theoretic experiments
       – Improved conversions estimation
       – Improved user distribution estimation
       – Included an adaptive component
     • Ended up with (more or less) the same "Ultimate Answer to the Ultimate Question of Life, The Universe, and Everything" for the TAC Ad Auctions Game: 0.3
     TADA@IJCAI 2011 Mertacor
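  3. Basic Strategy: VPC
     $bid^{q}_{d+1} = \alpha \cdot \hat{v}^{q}_{d+1}$   (D: $\alpha$)
     $\hat{v}^{q} = \widehat{\Pr}^{q}\{\text{conversion} \mid \text{click}\} \cdot E[\text{revenue} \mid \text{conversion}]^{q}$   (A: expected revenue)
     $\widehat{\Pr}^{q}\{\text{conversion} \mid \text{click}\} = \text{focusedPercentage}^{q} \cdot \Pr^{q}\{\text{conversion} \mid \text{focused}\}(\hat{I}_{d+1})$   (B: focusedPercentage, C: $\hat{I}_{d+1}$)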
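The three relations on this slide compose into a single bid. A minimal Python sketch follows, assuming the reading of the slide in which the bid is α times the estimated value per click; the function and parameter names are hypothetical and the numbers in the usage line are made up for illustration.

```python
def conversion_probability(focused_fraction: float,
                           pr_conv_given_focused: float) -> float:
    """Pr{conversion | click} = focusedPercentage * Pr{conversion | focused}."""
    return focused_fraction * pr_conv_given_focused


def value_per_click(pr_conv_given_click: float, expected_revenue: float) -> float:
    """v = Pr{conversion | click} * E[revenue | conversion]."""
    return pr_conv_given_click * expected_revenue


def next_day_bid(alpha: float,
                 focused_fraction: float,
                 pr_conv_given_focused: float,
                 expected_revenue: float) -> float:
    """Shade the estimated value per click by the factor alpha."""
    pr_click = conversion_probability(focused_fraction, pr_conv_given_focused)
    return alpha * value_per_click(pr_click, expected_revenue)


# Illustrative inputs only (not taken from the slides).
bid = next_day_bid(alpha=0.3, focused_fraction=0.5,
                   pr_conv_given_focused=0.2, expected_revenue=10.0)
```

Keeping the three factors as separate functions mirrors the A/B/C/D labelling of the slide, so each estimate can be swapped out on its own.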
  4. A) Expected Revenue
     • Solely depends on Manufacturer's Specialty (MS)
     $$E[\text{revenue} \mid \text{conversion}]^{q} = \begin{cases} USP \cdot (3 + MSB) / 3 & \text{MS not defined in } q \\ USP \cdot (1 + MSB) & \text{MS matched in } q \\ USP & \text{MS not matched in } q \end{cases}$$
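The three cases can be written as one small function. This is a sketch under two assumptions: the operators lost in the slide text are plus signs, and the first case is the average over three manufacturers of which exactly one is the specialty, since (USP(1 + MSB) + 2 USP) / 3 = USP(3 + MSB) / 3. The names and the sample values of USP and MSB are hypothetical.

```python
from typing import Optional


def expected_revenue(usp: float, msb: float,
                     query_manufacturer: Optional[str],
                     specialty: str,
                     n_manufacturers: int = 3) -> float:
    """Expected revenue per conversion for a query.

    usp: unit sales profit; msb: manufacturer specialty bonus.
    query_manufacturer is None when the query names no manufacturer.
    """
    if query_manufacturer is None:
        # Average over manufacturers: one gets the bonus, the rest do not.
        return usp * (n_manufacturers + msb) / n_manufacturers
    if query_manufacturer == specialty:
        return usp * (1 + msb)
    return usp


# Illustrative values only.
print(expected_revenue(10.0, 0.4, None, "pg"))
print(expected_revenue(10.0, 0.4, "pg", "pg"))
print(expected_revenue(10.0, 0.4, "flat", "pg"))
```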
  5. B) Focused Percentage
     • Monte Carlo simulations
     • First method (Vorobeychik)
       – $\text{focusedPercentage}^{query} = \text{conversions}^{query} / [\text{clicks}^{query} \cdot \Pr(\text{conversion}^{query})]$
       – Average over query class (F0, F1, F2)
     • Second method (2011)
       – Use server source files
       – MC states (NS, IS, F0, F1, F2, T) per product (x9)
       – $\text{focusedPercentage}^{query} = F_i^{query} / (F_i^{query} + IS^{query})$
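Both estimators are simple ratios. The sketch below writes them out in Python; the function names are hypothetical, the inputs are assumed to be counts already aggregated per query, and returning `None` on an empty denominator is a choice made here, not something the slides specify.

```python
from typing import Optional


def focused_fraction_from_reports(conversions: float, clicks: float,
                                  pr_conversion: float) -> Optional[float]:
    """First method: conversions / (clicks * Pr(conversion))."""
    denominator = clicks * pr_conversion
    if denominator == 0:
        return None  # no clicks observed, nothing to estimate from
    return conversions / denominator


def focused_fraction_from_states(focused_users: float,
                                 is_users: float) -> Optional[float]:
    """Second method: F_i / (F_i + IS), from simulated user-state counts."""
    total = focused_users + is_users
    if total == 0:
        return None
    return focused_users / total


# Illustrative counts only.
print(focused_fraction_from_reports(conversions=6, clicks=100, pr_conversion=0.2))
print(focused_fraction_from_states(focused_users=30, is_users=70))
```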
  6. Graph for query (pg, null)
  7. C) $I_d$ Estimation
     $\hat{I}_{d+1} = g(c_{d-3} + c_{d-2} + c_{d-1} + \hat{c}_{d} + \hat{c}_{d+1} - C^{cap})$
     • kNN
       – Inspired by periodic conversions behavior
       – Time series matching using Euclidean distance as a similarity criterion
       – k = 5, t = 5, N = 600
     • Heuristic baseline
       – Underestimate for bidding higher: $c_d = (c_{d-1} + c_{d-2} + c_{d-3}) / 4$
     • Aggregate
       – $c_d = (\text{kNN} + \text{Baseline}) / 2$
       – $c_{d+1} = ((\text{kNN} + \text{Baseline}) / 2) / 2$
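The conversion forecasts on this slide can be sketched as follows. Several things here are assumptions: the kNN step is read as "find the k past t-day windows closest to the most recent window and average the values that followed them", the library of past windows is a hypothetical list of `(window, next_value)` pairs, and how N = 600 enters is not stated, so it is not modelled. The function g is left out; only its argument is computed.

```python
import math
from typing import List, Sequence, Tuple

Window = Sequence[float]


def knn_forecast(recent: Window,
                 library: List[Tuple[Window, float]],
                 k: int = 5) -> float:
    """Average the follow-up values of the k windows nearest to `recent`."""
    if not library:
        raise ValueError("library of past windows is empty")
    ranked = sorted(library, key=lambda item: math.dist(recent, item[0]))
    nearest = ranked[:k]
    return sum(next_value for _, next_value in nearest) / len(nearest)


def baseline_forecast(c_d1: float, c_d2: float, c_d3: float) -> float:
    """Deliberate underestimate: sum of the last three days divided by 4."""
    return (c_d1 + c_d2 + c_d3) / 4


def aggregate_forecasts(knn: float, baseline: float) -> Tuple[float, float]:
    """Return (c_d, c_{d+1}) as on the slide."""
    c_d = (knn + baseline) / 2
    return c_d, c_d / 2


def capacity_argument(last_three_known: Sequence[float], c_d_hat: float,
                      c_d1_hat: float, capacity: float) -> float:
    """Argument passed to g: five-day conversions minus capacity."""
    return sum(last_three_known) + c_d_hat + c_d1_hat - capacity
```

Dividing by 4 rather than 3 makes the baseline understate recent conversions, which matches the slide's stated intent of bidding higher.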
  8. kNN example
  9. Cyclic behavior
     • 5-day long pulses
     • Pulse height and width related to factors like user distribution at the time and competition
     • Large peaks in daily profits come from "catching the wave"
     [Diagram: cycle with node labels High conversion prob., High VPC, High bid, High ranking, Conversions, Low conversion prob., Low VPC, Low bid, No ad display, No conversions]
  10. Rest of the strategy
     • Budget unconstrained
     • Hard-coded ad selection strategy
       – F0 => generic
       – F2 => if user preference matched => targeted
       – F1 => if one of the preferences is matched => targeted, else generic
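The ad selection rule is compact enough to write down directly. The sketch below rests on assumptions the slide does not settle: "matched" is read as "the field named in the query equals the agent's own specialty", an F2 query needs both fields to match, and an unmatched F2 query falls back to a generic ad. All names are hypothetical.

```python
from typing import Optional


def choose_ad(query_manufacturer: Optional[str],
              query_component: Optional[str],
              my_manufacturer: str,
              my_component: str) -> str:
    """Return "generic" or "targeted" for a query.

    A field is None when the query does not specify it, so the number of
    non-None fields gives the query class (F0, F1 or F2).
    """
    pairs = [(query_manufacturer, my_manufacturer),
             (query_component, my_component)]
    specified = [(asked, mine) for asked, mine in pairs if asked is not None]
    if not specified:  # F0: nothing revealed about the user
        return "generic"
    matches = [asked == mine for asked, mine in specified]
    if len(specified) == 2:  # F2: assumed to require a full match
        return "targeted" if all(matches) else "generic"
    return "targeted" if any(matches) else "generic"  # F1
```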
  11. Simulation-based Game Theoretical Analysis
     • One-shot Bayesian game
     • Myopic linear strategies b = α ∙ vpc -> find optimal shading, α
     • Iterative best response to find a symmetric Bayes-Nash equilibrium
     • Most profitable single deviation from a homogeneous set of opponents until self-play is best response -> BNE
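The iterative best-response loop can be sketched in a few lines of Python. In the actual analysis the payoff of a deviation would be estimated by running simulated games; here `toy_profit` is an invented stand-in whose fixed point was placed at 0.3 only so the loop has something to converge to. It is not evidence for any value of α, and all names are hypothetical.

```python
from typing import Callable, List, Sequence


def iterative_best_response(profit: Callable[[float, float], float],
                            candidates: Sequence[float],
                            start: float,
                            max_rounds: int = 50) -> List[float]:
    """Follow the most profitable single deviation until none remains.

    profit(a_dev, a_others): payoff of one agent playing a_dev while every
    opponent plays a_others. Returns the sequence of profiles visited.
    """
    path = [start]
    current = start
    for _ in range(max_rounds):
        best = max(candidates, key=lambda a: profit(a, current))
        if best == current:  # self-play is a best response: stop
            break
        path.append(best)
        current = best
    return path


def toy_profit(alpha_dev: float, alpha_others: float) -> float:
    """Invented payoff: the best reply drifts towards 0.3 (illustration only)."""
    target = 0.3 + 0.1 * (alpha_others - 0.3)
    return -(alpha_dev - target) ** 2


grid = [round(0.1 * i, 1) for i in range(1, 11)]
path = iterative_best_response(toy_profit, grid, start=1.0)
print(path)
```

The loop only checks single deviations against a homogeneous population, so what it returns is a candidate symmetric equilibrium on the chosen grid, not a proof of one.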
  12. D) alpha
     • Vorobeychik
       – "α = 0.2, 0.3 more robust to aggressive opponents"
       – The previous best values found, α = 0.1, 0.2 (2009), are not profitable on the 2010 platform
     • We have re-run the algorithm under the 2010 specs
       – α = 0.3 is the optimal value (1 -> 0.4 -> 0.3)
  13. Simulation-based Game Theoretical Analysis
     • Instead of α -> $(\alpha_{F0}, \alpha_{F1}, \alpha_{F2}) \times (\alpha_{C_{LOW}}, \alpha_{C_{MED}}, \alpha_{C_{HIGH}})$
     • Start from optimal α = 0.3, explore all possible deviations for each α, first for query levels then capacity levels
     • 0.3 seems to be optimal in all cases
     • Points in between do not yield different results (0.3 still the best)
  14. Simulation-based Game Theoretical Analysis
  15. Simulation-based Game Theoretical Analysis
  16. Adaptive component
     • Problem statement: we want to capture the cases where, given the current environment (competition conditions), an α other than 0.3 yields a competitive advantage
     • GT analysis: "a good starting point"
     • Model it as an associative k-armed bandit problem with optimistic initial values and an ε-greedy action selection strategy
  17. State, Action, Reward
     • State
       – Quantized VPC (x11)
       – Capacity (x3)
       – Query Type (x3)
       – Manufacturer Specialty Bonus (x2)
       – Component Specialty Bonus (x2)
     • α = {0.28, 0.29, 0.3, 0.31, 0.32}
     • r = daily profits
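With the listed state variables there are 11 · 3 · 3 · 2 · 2 = 396 states and 5 actions, so a plain lookup table is enough. The sketch below shows one way to set up an associative bandit with optimistic initial values and ε-greedy selection. The class name, the constant step size, the optimistic starting value and the ε default are all assumptions made here; the slides do not give them.

```python
import random
from collections import defaultdict
from typing import Hashable, Optional, Sequence


class EpsilonGreedySelector:
    """Tabular associative bandit: one value estimate per (state, action)."""

    def __init__(self, actions: Sequence[float], epsilon: float = 0.1,
                 optimistic_value: float = 100.0, step_size: float = 0.1,
                 seed: Optional[int] = None) -> None:
        self.actions = tuple(actions)
        self.epsilon = epsilon
        self.step_size = step_size
        # Unvisited pairs start high, so every action looks worth trying.
        self.q = defaultdict(lambda: optimistic_value)
        self.rng = random.Random(seed)

    def select(self, state: Hashable) -> float:
        """Explore with probability epsilon, otherwise pick the greedy action."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state: Hashable, action: float, reward: float) -> None:
        """Move the estimate a fixed fraction of the way towards the reward."""
        key = (state, action)
        self.q[key] += self.step_size * (reward - self.q[key])


selector = EpsilonGreedySelector([0.28, 0.29, 0.30, 0.31, 0.32], seed=0)
state = (5, "MEDIUM", "F1", True, False)  # hypothetical encoding of the state
alpha = selector.select(state)
selector.update(state, alpha, reward=50.0)  # reward value is made up
```

The optimistic value only encourages exploration if it exceeds realistic rewards, so it would need to be scaled to the range of daily profits actually observed.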
  18. Experiment (1/2)
     • Self-play
       – 210 games
       – All capacities to 450 (MEDIUM)
     • The standard agent is unbeatable since it is created that way

     Agent Name         Score
     Mertacor-Std-1     53.042
     Mertacor-Std-2     52.763
     Mertacor-kNN-1     52.673
     Mertacor-kNN-2     52.703
     Mertacor-RL-1      52.270
     Mertacor-RL-2      52.233
     Mertacor-Full-1    51.673
     Mertacor-Full-2    51.899
  19. Experiment (2/2)
     • Mix things up, include more agents with different strategies
       – 250 games
       – All capacities to 450 (MEDIUM)
     • Better estimation leads to better performance
     • Adaptiveness is suited for even more complicated environments (capacity and strategy wise)

     Agent Name          Score
     Mertacor-kNN        53.223
     Mertacor-Std        52.245
     Schlemazl (2010)    51.975
     Mertacor-Full       51.796
     Mertacor-RL         51.790
     Epflagent (2010)    49.232
     Tau (2010)          45.987
     Crocodile (2010)    45.858
  20. Also tested/under development (2011)
     • Daily Campaign Budget Threshold algorithms
       – Estimation
       – Simulation
     • Particle Filtering for user state estimation
       – TacTex
  21. Conclusions & Future Work
     • α = 0.3 is a very powerful conclusion/hard to beat
     • Better estimates for B) user state and C) $I_d$ could further improve performance
     • On-line learning still in very crude form
       – Not yet satisfied with it, but it seems a reasonable thing to do
     • Competition-wise: fitted-Q learning from data logs
  22. Thank you for your attention
     Questions?
