An Adaptive Proportional Value-per-Click Agent for Bidding in Ad Auctions
Transcript

  • 1. An Adaptive Proportional Value-per-Click Agent for Bidding in Ad Auctions
      Trading Agent Design and Analysis Workshop 2011
      Kyriakos C. Chatzidimitriou (AUTH/CERTH), Lampros C. Stavrogiannis (Univ. of Southampton), Andreas L. Symeonidis (AUTH/CERTH), Pericles A. Mitkas (AUTH/CERTH)
  • 2. Introduction
      • Basic idea: a working paper by Dr. Yevgeniy Vorobeychik on his QuakTAC 2009 entry
      • Since this initial work, we have:
          – Conducted more game-theoretic experiments
          – Improved conversions estimation
          – Improved user distribution estimation
          – Included an adaptive component
      • Ended up with (more or less) the same “Ultimate Answer to the Ultimate Question of Life, The Universe, and Everything” for the TAC Ad Auctions Game: 0.3
      TADA@IJCAI 2011 Mertacor
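  • 3. Basic Strategy: VPC
      $\mathrm{bid}^{q}_{d+1} = \alpha \cdot \hat{v}^{q}_{d+1}$
      $\hat{v}^{q} = \widehat{\Pr}^{q}\{\text{conversion} \mid \text{click}\} \cdot E^{q}[\text{revenue} \mid \text{conversion}]$
      $\widehat{\Pr}^{q}\{\text{conversion} \mid \text{click}\} = \widehat{\text{focusedPercentage}}^{q} \cdot \Pr^{q}\{\text{conversion} \mid \text{focused}\}(\hat{I}_{d+1})$
      Labels on the slide: A = $E^{q}[\text{revenue} \mid \text{conversion}]$, B = focusedPercentage, C = $\hat{I}_{d+1}$, D = $\alpha$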
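The VPC decomposition on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the agent's actual code: the function and parameter names are hypothetical, and the inputs (focused percentage, conversion probability for focused users, expected revenue) are assumed to come from the estimators described on the following slides.

```python
def conversion_prob_given_click(focused_pct, pr_conv_given_focused):
    """Pr{conversion | click} = focusedPercentage * Pr{conversion | focused}."""
    return focused_pct * pr_conv_given_focused


def value_per_click(focused_pct, pr_conv_given_focused, expected_revenue):
    """v = Pr{conversion | click} * E[revenue | conversion]."""
    return conversion_prob_given_click(focused_pct, pr_conv_given_focused) * expected_revenue


def bid(alpha, vpc):
    """Shade the value per click by the factor alpha (label D on the slide)."""
    return alpha * vpc


# Illustrative numbers only (not taken from the slides, apart from alpha = 0.3).
vpc = value_per_click(focused_pct=0.5, pr_conv_given_focused=0.2, expected_revenue=10.0)
next_bid = bid(alpha=0.3, vpc=vpc)
```

With these made-up inputs the value per click is about 1.0, so a shading factor of 0.3 gives a bid of about 0.3.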
  • 4. A) Expected Revenue
      • Solely depends on Manufacturer’s Specialty (MS)
      $E^{q}[\text{revenue} \mid \text{conversion}] = \begin{cases} USP \cdot (3 + MSB) / 3 & \text{MS not defined in } q \\ USP \cdot (1 + MSB) & \text{MS matched in } q \\ USP & \text{MS not matched in } q \end{cases}$
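The three cases above translate directly into a small function. The sketch below is an illustration under assumptions: the names are hypothetical, a query with no manufacturer is represented by `None`, and the division by 3 is read as averaging over three equally likely manufacturers.

```python
def expected_revenue(usp, msb, query_manufacturer, specialty):
    """Expected revenue per conversion for a query.

    usp: unit sales profit; msb: manufacturer specialty bonus.
    query_manufacturer is None when the query does not name a manufacturer.
    """
    if query_manufacturer is None:
        # One of three manufacturers earns the bonus: (USP*(1+MSB) + 2*USP) / 3.
        return usp * (3 + msb) / 3
    if query_manufacturer == specialty:
        return usp * (1 + msb)
    return usp


# Illustrative values: USP = 10, MSB = 0.5, specialty "pg".
matched = expected_revenue(10.0, 0.5, "pg", "pg")        # 15.0
unmatched = expected_revenue(10.0, 0.5, "other", "pg")   # 10.0
undefined = expected_revenue(10.0, 0.5, None, "pg")      # (15 + 10 + 10) / 3
```

The undefined case equals the average of one matched and two unmatched outcomes, which is a quick consistency check on the formula.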
  • 5. B) Focused Percentage
      • Monte Carlo simulations
      • First Method (Vorobeychik)
          – focusedPercentage_query = conversions_query / [clicks_query * Pr(conversion_query)]
          – Average over query class (F0, F1, F2)
      • Second Method (2011)
          – Use server source files
          – MC states (NS, IS, F0, F1, F2, T) per product (x9)
          – focusedPercentage_query = Fi_query / (Fi_query + IS_query)
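Both estimators on this slide are simple ratios. The sketch below is a hedged illustration: the function names are hypothetical, the inputs are assumed to be per-query counts from Monte Carlo runs, and returning `None` for empty denominators is a design choice of this sketch, not something stated in the slides.

```python
def focused_pct_from_reports(conversions, clicks, pr_conversion):
    """First method: conversions / (clicks * Pr(conversion))."""
    denominator = clicks * pr_conversion
    if denominator == 0:
        return None  # no clicks observed, so no estimate
    return conversions / denominator


def class_average(estimates):
    """Average the per-query estimates of one query class (F0, F1 or F2)."""
    valid = [e for e in estimates if e is not None]
    return sum(valid) / len(valid) if valid else None


def focused_pct_from_states(focused_users, is_users):
    """Second method: Fi / (Fi + IS), using simulated user-state counts."""
    total = focused_users + is_users
    return focused_users / total if total else None


# Illustrative counts only.
first = focused_pct_from_reports(conversions=6, clicks=100, pr_conversion=0.2)
second = focused_pct_from_states(focused_users=30, is_users=70)
```

Both calls give an estimate of about 0.3 for these made-up counts.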
  • 6. Graph for query (pg, null)
  • 7. C) I_d Estimation
      $\hat{I}_{d+1} = g(c_{d-3} + c_{d-2} + c_{d-1} + \hat{c}_{d} + \hat{c}_{d+1} - C^{cap})$
      • kNN
          – Inspired by periodic conversions behavior
          – Time series matching using Euclidean distance as a similarity criterion
          – k = 5, t = 5, N = 600
      • Heuristic Baseline
          – Underestimate for bidding higher: $c_d = (c_{d-1} + c_{d-2} + c_{d-3}) / 4$
      • Aggregate
          – $c_d$ = (kNN + Baseline) / 2
          – $c_{d+1}$ = ((kNN + Baseline) / 2) / 2
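The estimation steps on this slide can be sketched as follows. Several details are assumptions made for illustration: the slide does not define t and N, so the sketch treats the history as a hypothetical list of (window, next value) pairs, and all function names are invented here. The function g is not given on the slide, so only its argument is computed.

```python
import math


def knn_estimate(recent_window, history, k=5):
    """Average the values that followed the k stored windows closest
    (in Euclidean distance) to the most recent window of conversions."""
    if not history:
        raise ValueError("history must contain at least one (window, next_value) pair")
    ranked = sorted(history, key=lambda item: math.dist(recent_window, item[0]))
    nearest = ranked[:k]
    return sum(next_value for _, next_value in nearest) / len(nearest)


def baseline_estimate(c_d1, c_d2, c_d3):
    """Heuristic baseline: three past days summed and divided by 4,
    which deliberately underestimates conversions."""
    return (c_d1 + c_d2 + c_d3) / 4


def aggregate_estimates(knn, baseline):
    """Return (c_d, c_{d+1}) as combined on the slide."""
    c_d = (knn + baseline) / 2
    c_d_plus_1 = c_d / 2
    return c_d, c_d_plus_1


def capacity_argument(c_d3, c_d2, c_d1, c_d_hat, c_d_plus_1_hat, capacity):
    """Argument passed to g: five-day conversions minus the capacity."""
    return c_d3 + c_d2 + c_d1 + c_d_hat + c_d_plus_1_hat - capacity


# Tiny illustrative history with windows of length 2 (real windows would be longer).
history = [([1, 1], 10), ([2, 2], 20), ([9, 9], 90)]
knn = knn_estimate([1, 1], history, k=2)          # mean of 10 and 20
base = baseline_estimate(80, 90, 100)             # 270 / 4
c_d, c_next = aggregate_estimates(100.0, 60.0)    # (80.0, 40.0)
```

Sorting the whole history is adequate for a few hundred stored windows; a larger library would call for a partial sort or an index.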
  • 8. kNN example
  • 9. Cyclic behavior
      • 5-day long pulses
      • Pulse height and width are related to factors like the user distribution at the time and the competition
      • Large peaks in daily profits come from “catching the wave”
      [Cycle diagram; node labels: High conversion prob., High VPC, High bid, High conversions, ranking, Low conversion prob., Low VPC, Low bid, No ad display, No conversions]
  • 10. Rest of the strategy
      • Budget unconstrained
      • Hard-coded ad selection strategy
          – F0 => generic
          – F2 => if user preference matched => targeted
          – F1 => if one of the preferences is matched => targeted, else generic
  • 11. Simulation-based Game Theoretical Analysis
      • One-shot Bayesian game
      • Myopic linear strategies b = α ∙ vpc -> find optimal shading, α
      • Iterative best response to find a symmetric Bayes-Nash equilibrium
      • Most profitable single deviation from a homogeneous set of opponents until self-play is best response -> BNE
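The iterative best-response loop described on this slide can be sketched as below. This is only an outline under strong assumptions: in the actual analysis each payoff would come from running simulated games, whereas here `toy_payoff` is an invented placeholder whose fixed point is rigged to 0.3 purely so the loop has something to converge to. The function names and the grid are hypothetical.

```python
def iterative_best_response(payoff, grid, start, max_iters=50):
    """Repeat: find the most profitable single deviation against a homogeneous
    field playing `current`; stop when self-play is already the best response."""
    current = start
    path = [current]
    for _ in range(max_iters):
        best = max(grid, key=lambda alpha: payoff(alpha, current))
        if best == current:
            return current, path
        current = best
        path.append(current)
    raise RuntimeError("no symmetric fixed point found within max_iters")


def toy_payoff(alpha, opponents_alpha):
    """Placeholder payoff, NOT derived from the game: peaks at 0.1 + 0.6 * opponents_alpha."""
    target = 0.1 + 0.6 * opponents_alpha
    return -(alpha - target) ** 2


grid = [round(0.1 * i, 1) for i in range(1, 11)]  # 0.1, 0.2, ..., 1.0
alpha_star, path = iterative_best_response(toy_payoff, grid, start=1.0)
```

The returned value is a symmetric fixed point on the grid: no single deviation in `grid` earns more against opponents who all play `alpha_star`.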
  • 12. D) alpha
      • Vorobeychik
          – “a = 0.2, 0.3 more robust to aggressive opponents”
          – The previous best values (a = 0.1, 0.2, found in 2009) are not profitable on the 2010 platform
      • We have re-run the algorithm under the 2010 specs
          – a = 0.3 is the optimal value (1 -> 0.4 -> 0.3)
  • 13. Simulation-based Game Theoretical Analysis
      • Instead of a single α, use (α_F0, α_F1, α_F2) x (α_C_LOW, α_C_MED, α_C_HIGH)
      • Start from the optimal α = 0.3 and explore all possible deviations for each α, first for query levels, then for capacity levels
      • 0.3 seems to be optimal in all cases
      • Points in between do not yield different results (0.3 is still the best)
  • 14. Simulation-based Game Theoretical Analysis
  • 15. Simulation-based Game Theoretical Analysis
  • 16. Adaptive component
      • Problem statement: we want to capture the case where, given the current environment (competition conditions), an α other than 0.3 yields a competitive advantage
      • GT analysis: “a good starting point”
      • Model it as an associative k-armed bandit problem with optimistic initial values and an ε-greedy action selection strategy
  • 17. State, Action, Reward
      • State
          – Quantized VPC (x11)
          – Capacity (x3)
          – Query Type (x3)
          – Manufacturer Specialty Bonus (x2)
          – Component Specialty Bonus (x2)
      • Action: a ∈ {0.28, 0.29, 0.3, 0.31, 0.32} (candidate values of α)
      • Reward: r = daily profits
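An associative (contextual) bandit with optimistic initial values and ε-greedy selection over the five α values could look like the sketch below. The state components listed above give 11 x 3 x 3 x 2 x 2 = 396 distinct states. Everything else is an assumption for illustration: the class name, the values of ε and the optimistic initial value, the constant step size, and the way a state is encoded as a tuple are not specified in the slides.

```python
import random
from collections import defaultdict

ACTIONS = (0.28, 0.29, 0.30, 0.31, 0.32)  # candidate alpha values from the slide


class AlphaBandit:
    """Per-state action values, started optimistically and chosen epsilon-greedily."""

    def __init__(self, epsilon=0.1, optimistic_value=100.0, step_size=0.1, seed=None):
        self.epsilon = epsilon
        self.step_size = step_size
        self.rng = random.Random(seed)
        # Every unseen state starts with the same optimistic value for each action.
        self.q = defaultdict(lambda: {a: optimistic_value for a in ACTIONS})

    def select(self, state):
        """Explore with probability epsilon, otherwise pick the best-valued alpha."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        values = self.q[state]
        return max(ACTIONS, key=values.get)

    def update(self, state, action, reward):
        """Move the action value towards the observed reward (daily profit)."""
        values = self.q[state]
        values[action] += self.step_size * (reward - values[action])


# Hypothetical state: (quantized VPC bin, capacity, query type, MS bonus, CS bonus).
state = (5, "MEDIUM", "F1", True, False)
bandit = AlphaBandit(epsilon=0.0, optimistic_value=10.0, step_size=0.5, seed=0)
first_choice = bandit.select(state)
bandit.update(state, first_choice, reward=0.0)
second_choice = bandit.select(state)
```

With ε set to 0 the example is deterministic: a disappointing reward lowers the value of the first α, so the still-optimistic next α is tried, which is how optimistic initial values drive early exploration.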
  • 18. Experiment (1/2)
      • Self-play
          – 210 games
          – All capacities to 450 (MEDIUM)
      • The standard agent is unbeatable since it is created that way

      Agent Name        Score
      Mertacor-Std-1    53.042
      Mertacor-Std-2    52.763
      Mertacor-kNN-1    52.673
      Mertacor-kNN-2    52.703
      Mertacor-RL-1     52.270
      Mertacor-RL-2     52.233
      Mertacor-Full-1   51.673
      Mertacor-Full-2   51.899
  • 19. Experiment (2/2)
      • Mix things up: include more agents with different strategies
          – 250 games
          – All capacities to 450 (MEDIUM)
      • Better estimation leads to better performance
      • Adaptiveness is suited to even more complicated environments (capacity- and strategy-wise)

      Agent Name          Score
      Mertacor-kNN        53.223
      Mertacor-Std        52.245
      Schlemazl (2010)    51.975
      Mertacor-Full       51.796
      Mertacor-RL         51.790
      Epflagent (2010)    49.232
      Tau (2010)          45.987
      Crocodile (2010)    45.858
  • 20. Also tested/under development (2011)
      • Daily Campaign Budget Threshold algorithms
          – Estimation
          – Simulation
      • Particle Filtering for user state estimation
          – TacTex
  • 21. Conclusions & Future Work
      • α = 0.3 is a very powerful conclusion and hard to beat
      • Better estimates for B) user state and C) I_d could further improve performance
      • On-line learning is still in a very crude form
          – We are not yet satisfied with it, but it seems a reasonable thing to do
      • Competition-wise: fitted-Q learning from data logs
  • 22. Thank you for your attention Questions?