This document summarizes a research paper that proposes a framework for sequential ad selection to optimize online advertising revenue. It formulates the problem as a partially observable Markov decision process (POMDP) and provides exact and approximate solutions. The paper evaluates the proposed approach on a public dataset and finds that considering dependencies between ads and using a POMDP formulation outperforms baseline methods like selecting ads randomly or based only on immediate reward.
Sequential Selection of Correlated Ads by POMDPs
1. Sequential Selection of Correlated Ads by
POMDPs
Shuai Yuan, Jun Wang
University College London
October 29, 2012
2. Motivations and contributions
Motivations,
• help publishers gain more profit by displaying ads;
• go further than offline, content-based matching of
webpages and ads;
Contributions,
• a framework of ad selection for revenue optimisation;
• formulating the sequential selection problem by Partially
observable Markov decision process and providing exact
and approximate solutions;
• a public keyword-bid-ad-webpage dataset for reproducible
research (http://www.computational-advertising.org).
3. Related works
Contextual advertising,
• A semantic approach to contextual advertising [Broder 2007]
• Impedance coupling in content-targeted advertising [Ribeiro 2005]
• Contextual advertising by combining relevance with click feedback [Chakrabarti
2008]
Inventory management (contracts),
• Targeted advertising on the Web with inventory management [Chickering 2003]
• Revenue management for online advertising: Impatient advertisers
[Fridgeirsdottir 2007]
• Dynamic revenue management for online display advertising [Roels 2009]
Optimal pricing model,
• Pricing of Online Advertising: Cost-Per-Click-Through Vs. Cost-Per-Action [Hu
2010]
• Online advertising: Pay-per-view versus pay-per-click [Mangani 2004]
• Online advertising: Pay-per-view versus pay-per-click A comment [Fjell 2009]
• Single period balancing of pay-per-click and pay-per-view online display
advertisements [Kwon 2011]
4. Related works (cont.)
Ad scheduling,
• Scheduling advertisements on a web page to maximize revenue [Kumar 2006]
• Scheduling of dynamic in-game advertising [Turner 2011]
Multi-armed bandits,
• Using confidence bounds for exploitation-exploration trade-offs [Auer 2003]
• Multi-armed bandit problems with dependent arms [Pandey 2007]
POMDPs,
• A survey of POMDP applications [Cassandra 1998]
• Monte Carlo POMDPs [Thrun 2000]
• Perseus: Randomized point-based value iteration for POMDPs [Spaan 2005]
5. Problem statement - setup
Figure : 1 webpage, 1 ad slot, M impressions at each time step.
Payoff of ads follows $X \sim \mathcal{N}(\mu, I \cdot \sigma_0^2)$. $\mu$ is generated by $\mu \sim \mathcal{N}(\theta, \Sigma)$.
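The two-level payoff model above can be simulated with a short sketch. This is only an illustration under stated assumptions: the helper names (`cholesky`, `sample_means`, `sample_payoff`) and the example numbers are hypothetical and do not come from the slides, and `sigma0` is passed as a standard deviation.

```python
import math
import random


def cholesky(cov):
    """Return lower-triangular L with L * L^T == cov (cov must be positive definite)."""
    n = len(cov)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(cov[i][i] - s)
            else:
                low[i][j] = (cov[i][j] - s) / low[j][j]
    return low


def sample_means(theta, cov, rng):
    """Draw the hidden mean payoffs mu ~ N(theta, cov), one entry per ad."""
    low = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in theta]
    return [theta[i] + sum(low[i][k] * z[k] for k in range(i + 1))
            for i in range(len(theta))]


def sample_payoff(mu, ad, sigma0, rng):
    """Draw one observed payoff x ~ N(mu[ad], sigma0^2) for the displayed ad."""
    return rng.gauss(mu[ad], sigma0)


# Hypothetical example: two positively correlated ads.
rng = random.Random(0)
mu = sample_means([1.0, 0.5], [[4.0, 2.0], [2.0, 3.0]], rng)
x = sample_payoff(mu, 0, 2.0, rng)
```

The Cholesky factor is used only to turn independent standard normal draws into a correlated draw of the means; any other multivariate normal sampler would serve the same purpose.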
6. Problem statement - graphical model
Figure : The payoff model illustrated by an influence diagram
representation with generative processes of a finite horizon POMDP.
s(t) is the selection action. θ(t), Σ(t) is the belief at some stage.
(Diagram nodes, for t = 1, ..., T: the belief θ(t), Σ(t) with T − t steps remaining, the action s(t), the mean payoff μ(t) and the observation x(t); θ, Σ and σ0² are shared by all steps.)
7. Problem statement - objective function
To maximise the expected cumulative payoff over time,
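$$
\begin{aligned}
\pi^* &= \arg\max_{\pi} E\left[R^{\pi}(T)\right]
       = \arg\max_{\pi} E\left[\sum_{t=1}^{T} X_{s(t)}(t)\right]
       = \arg\max_{\pi} \sum_{t=1}^{T} E\left[X_{s(t)}(t)\right] \\
      &= \arg\max_{\pi} \sum_{t=1}^{T} \int_{x} x_{s(t)}(t)\, p\left(x_{s(t)}(t) \mid \Psi(t)\right) dx
       = \arg\max_{\pi} \sum_{t=1}^{T} \theta_{s(t)}(t)
\end{aligned}
\tag{1}
$$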
where,
• s(t) is the selection decision;
• Ψ(t) is the available information;
• π is a selection policy and π ∗ is the optimal one;
• “M impressions” is dropped from the objective function.
8. Belief update
Figure : Updating belief on ads’ performance over time (t=1, t=2, ...).
9. Belief update - the selected ad
We update the belief using Bayes’ theorem.
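$$
p\left(x_1 \mid x_1(t), \Psi(t)\right)
= \int p\left(x_1 \mid x_1(t), \Psi(t), \mu_1\right)\, p\left(\mu_1 \mid x_1(t), \Psi(t)\right) d\mu_1
\tag{2}
$$
by “completing squares”,
$$
p\left(\mu_1 \mid x_1(t), \Psi(t)\right) \propto p\left(x_1(t) \mid \mu_1, \Psi(t)\right) p\left(\mu_1 \mid \Psi(t)\right)
\propto \exp\left\{ -\frac{\left(x_1(t) - \mu_1\right)^2}{2\sigma_0^2} - \frac{\left(\mu_1 - \theta_1(t)\right)^2}{2\sigma_1^2(t)} \right\}
\tag{3}
$$
we obtain the new belief,
$$
\mu_1 \mid x_1(t) \sim \mathcal{N}\left(\theta_1(t+1),\ \sigma_1^2(t+1)\right)
\tag{4}
$$
$$
\theta_1(t+1) = \frac{\sigma_1^2(t)\, x_1(t) + \sigma_0^2\, \theta_1(t)}{\sigma_1^2(t) + \sigma_0^2},
\qquad
\sigma_1^2(t+1) = \frac{\sigma_1^2(t)\, \sigma_0^2}{\sigma_1^2(t) + \sigma_0^2}
\tag{5}
$$
We write $\theta_i(t)$ and $\sigma_i^2(t)$ as shorthand for $\theta_i \mid \Psi(t)$ and $\sigma_i^2 \mid \Psi(t)$.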
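As a worked illustration of the update for the selected ad, the sketch below evaluates equation (5) directly. The function name `update_selected` and the numbers are hypothetical; variances are passed as variances, not standard deviations.

```python
def update_selected(theta1, var1, x1, var0):
    """Posterior mean and variance of mu_1 after observing payoff x1 (equation 5).

    theta1, var1: current belief mean and variance for the selected ad
    x1:           observed payoff
    var0:         observation noise variance sigma_0^2
    """
    denom = var1 + var0
    new_theta = (var1 * x1 + var0 * theta1) / denom
    new_var = var1 * var0 / denom
    return new_theta, new_var


# With equal prior and noise variances, the new mean is the midpoint of the
# prior mean and the observation, and the variance is halved.
new_theta, new_var = update_selected(theta1=1.0, var1=4.0, x1=3.0, var0=4.0)
```

The new mean is a precision-weighted average of the prior mean and the observation, so a noisier observation (larger `var0`) moves the belief less.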
10. Belief update - the correlated ad
We also update the belief of non-selected ads,
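$$
p\left(x_2 \mid x_1(t), \Psi(t)\right) = \int p\left(x_2 \mid \mu_2, x_1(t), \Psi(t)\right)\, p\left(\mu_2 \mid x_1(t), \Psi(t)\right) d\mu_2
\tag{6}
$$
with linear Gaussian property,
$$
\mu_1 \mid \mu_2 \sim \mathcal{N}\left(\theta_1 \mid \mu_2,\ \sigma_1^2 \mid \mu_2\right)
\tag{7}
$$
$$
\theta_1 \mid \mu_2 = \theta_1 + \frac{\sigma_{1,2}}{\sigma_2^2}\left(\mu_2 - \theta_2\right),
\qquad
\sigma_1^2 \mid \mu_2 = \sigma_1^2 - \frac{\sigma_{1,2}^2}{\sigma_2^2}
\tag{8}
$$
we obtain the new belief on a correlated ad,
$$
\mu_2 \mid x_1(t) \sim \mathcal{N}\left(\theta_2(t+1),\ \sigma_2^2(t+1)\right)
\tag{9}
$$
$$
\theta_2(t+1) = \theta_2(t) + \sigma_{1,2}\, \frac{x_1(t) - \theta_1(t)}{\sigma_1^2(t) + \sigma_0^2},
\qquad
\sigma_2^2(t+1) = \sigma_2^2(t) - \frac{\sigma_{1,2}^2}{\sigma_1^2(t) + \sigma_0^2}
\tag{10}
$$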
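The update for a non-selected but correlated ad (equation 10) can be sketched in the same way. The function name `update_correlated` and the example values are hypothetical; `cov12` stands for the covariance between the two ads' mean payoffs.

```python
def update_correlated(theta2, var2, cov12, theta1, var1, x1, var0):
    """Belief on ad 2 after observing payoff x1 of the selected ad 1 (equation 10).

    theta2, var2: current belief mean and variance for the non-selected ad
    cov12:        covariance sigma_{1,2} between the two ads' mean payoffs
    theta1, var1: current belief mean and variance for the selected ad
    x1:           observed payoff of the selected ad
    var0:         observation noise variance sigma_0^2
    """
    gain = cov12 / (var1 + var0)
    new_theta2 = theta2 + gain * (x1 - theta1)
    new_var2 = var2 - cov12 * gain
    return new_theta2, new_var2


# A better-than-expected payoff on ad 1 raises the belief on a positively
# correlated ad 2 and reduces its uncertainty.
new_theta2, new_var2 = update_correlated(
    theta2=0.5, var2=4.0, cov12=2.0, theta1=1.0, var1=4.0, x1=3.0, var0=4.0
)
```

When `cov12` is zero the belief on the second ad is unchanged, which is the case the dependency-free baselines implicitly assume.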
11. Belief update - expected payoff
We also obtain the expected payoff of the selected ad,
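$$
X_1 \mid x_1(t), \Psi(t) \sim \mathcal{N}\left(\theta_1(t+1),\ \sigma_0^2 + \sigma_1^2(t+1)\right)
\tag{11}
$$
and the expected payoff of the correlated ad,
$$
X_2 \mid x_1(t), \Psi(t) \sim \mathcal{N}\left(\theta_2(t+1),\ \sigma_0^2 + \sigma_2^2(t+1)\right)
\tag{12}
$$
The final objective function is,
$$
\pi^* = \arg\max_{\pi} \sum_{t=1}^{T} \theta_{s(t)}(t) \quad \text{subject to}
\tag{13}
$$
$$
\theta_{s(t+1)}(t+1) = \theta_{s(t+1)}(t) + \sigma_{s(t),s(t+1)}\, \frac{x_{s(t)}(t) - \theta_{s(t)}(t)}{\sigma_{s(t)}^2(t) + \sigma_0^2}
\tag{14}
$$
$$
\sigma_{s(t+1)}^2(t+1) = \sigma_{s(t+1)}^2(t) - \frac{\sigma_{s(t),s(t+1)}^2}{\sigma_{s(t)}^2(t) + \sigma_0^2}
\tag{15}
$$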
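The constraints above update one mean and one variance at a time. A minimal sketch that applies them to every ad at once is given below. Note the assumption: the slides only state the mean and variance updates, while this sketch also updates the off-diagonal covariances using the standard Gaussian conditioning rule, so that the belief can be reused at the next step. The function name `update_belief` and the example numbers are hypothetical.

```python
def update_belief(theta, cov, selected, x, var0):
    """Update the belief on all ads after showing ad `selected` and observing payoff x.

    theta: list of belief means, one per ad
    cov:   belief covariance matrix as a list of lists
    var0:  observation noise variance sigma_0^2
    The diagonal of the result matches equations (14)-(15); the off-diagonal
    update is the standard Gaussian conditioning step (an assumption here).
    """
    n = len(theta)
    denom = cov[selected][selected] + var0
    residual = x - theta[selected]
    new_theta = [theta[j] + cov[selected][j] * residual / denom for j in range(n)]
    new_cov = [
        [cov[j][k] - cov[j][selected] * cov[selected][k] / denom for k in range(n)]
        for j in range(n)
    ]
    return new_theta, new_cov


# Two ads with covariance 2.0; ad 0 is shown and pays 3.0.
new_theta, new_cov = update_belief(
    [1.0, 0.5], [[4.0, 2.0], [2.0, 4.0]], selected=0, x=3.0, var0=4.0
)
```

For the selected ad the formula reduces to the single-ad update, since its covariance with itself is its own variance.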
12. POMDP formulation and solution
Figure : The POMDP model for the revenue optimisation problem.
(θ(t), Σ(t)) is belief at some stage; x(t) is observation and reward;
s(t) is action; (θ, Σ) is the hidden state. There is no state transition.
(Diagram labels: belief state, observation & reward, action, hidden state.)
13. Value iteration and MAB approximation
The value function could be expressed as,
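$$s(t) = \arg\max_{s(t) \in N} V_{s(t)}\big(\Psi(t)\big) = \arg\max_{i \in N} \Big[\underbrace{\mathbb{E}(x_i)}_{\text{the expected immediate reward}} + \underbrace{\xi\big(\Psi(t), i\big)}_{\text{the expected future reward}}\Big] \qquad (16)$$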
The exact solution uses value iteration [2]:
$$V^*(\theta, \Sigma, T) = \max_{s(1) \in N} \mathbb{E}\Big[X_{s(t)}(1) + V^*\big(\theta \mid X_{s(t)}(1),\ \Sigma \mid X_{s(t)}(1),\ T-1\big)\Big] \qquad (17)$$
The approximation is based on the multi-armed bandit [3]:
$$\xi_{\text{UCB1-NORMAL}} = \sqrt{16 \cdot \frac{q_i - t_i\, \theta_i^2(t)}{t_i - 1} \cdot \frac{\ln(t-1)}{t_i}} \qquad (18)$$
[2] R. E. Bellman (1957). “Dynamic Programming”.
[3] P. Auer et al. (2002). “Finite-time analysis of the multi-armed bandit problem”.
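The exploration term in Equation (18) can be computed as below. In Auer et al.'s UCB1-NORMAL, q_i is the sum of squared payoffs observed for arm i and t_i is the number of times it was played. The function name and the clamp at zero are assumptions added for this sketch.

```python
import math


def ucb1_normal_bonus(q_i, t_i, theta_i, t):
    """Exploration term of Eq. (18) for ad i at round t."""
    if t_i < 2 or t < 2:
        raise ValueError("need t_i >= 2 and t >= 2 for the term to be defined")
    # Sample-variance style estimate; clamped at zero to guard against rounding.
    spread = max((q_i - t_i * theta_i ** 2) / (t_i - 1), 0.0)
    return math.sqrt(16.0 * spread * math.log(t - 1) / t_i)


# Hypothetical history: ad i paid 1, 2, 3 and 4, and we are at round 11.
payoffs = [1.0, 2.0, 3.0, 4.0]
bonus = ucb1_normal_bonus(
    q_i=sum(p * p for p in payoffs),
    t_i=len(payoffs),
    theta_i=sum(payoffs) / len(payoffs),
    t=11,
)
print(round(bonus, 3))
```

The term grows with the spread of the observed payoffs and shrinks as an ad is selected more often.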
14. Value iteration with Monte Carlo sampling [4]
We use sampling to reduce the computational complexity,
function VALUEFUNC(θ, Σ, t)
    array V ← 0                                   // Expected reward vector.
    loop i ← 1 to N
        V[i] ← θ_i(t)                             // Expected immediate reward.
        if t < T then
            for all s in SAMPLE(θ, Σ) do
                [θ′, Σ′] ← UPDATEBELIEF(θ, Σ, s, i)   // New belief after selecting i and observing s; Equations (13)-(15).
                V[i] ← V[i] + (1/M) · VALUEFUNC(θ′, Σ′, t + 1)₀
            end for
        end if
    end loop
    return [MAX(V), MAXINDEX(V)]
end function
[4] S. Thrun (2000). “Monte Carlo POMDPs”.
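The pseudocode above can be turned into a short runnable sketch. Several details are assumptions made here rather than taken from the slides: M is read as the number of samples (`n_samples`), each sample is a payoff of ad i drawn from its predictive distribution N(θ_i, σ_i² + σ_0²), and the belief update uses the standard Gaussian conditioning formula for the full covariance matrix.

```python
import random


def update_belief(theta, sigma, i, x, noise_var):
    """Gaussian belief after selecting ad i and observing payoff x (Eqs. 14-15)."""
    n = len(theta)
    denom = sigma[i][i] + noise_var
    new_theta = [theta[j] + sigma[i][j] * (x - theta[i]) / denom for j in range(n)]
    new_sigma = [
        [sigma[j][k] - sigma[j][i] * sigma[i][k] / denom for k in range(n)]
        for j in range(n)
    ]
    return new_theta, new_sigma


def value_func(theta, sigma, t, horizon, noise_var, n_samples, rng):
    """Return (best expected cumulative payoff, index of the best ad) from stage t."""
    values = []
    for i in range(len(theta)):
        value = theta[i]  # expected immediate reward
        if t < horizon:
            std = (sigma[i][i] + noise_var) ** 0.5
            future = 0.0
            for _ in range(n_samples):
                x = rng.gauss(theta[i], std)  # sampled payoff of ad i
                new_theta, new_sigma = update_belief(theta, sigma, i, x, noise_var)
                future += value_func(
                    new_theta, new_sigma, t + 1, horizon, noise_var, n_samples, rng
                )[0]
            value += future / n_samples  # Monte Carlo estimate of the future reward
        values.append(value)
    best = max(range(len(values)), key=values.__getitem__)
    return values[best], best


# Hypothetical two-ad problem with a three-step horizon.
best_value, best_ad = value_func(
    [10.0, 8.0], [[4.0, 3.0], [3.0, 9.0]],
    t=1, horizon=3, noise_var=1.0, n_samples=3, rng=random.Random(0),
)
print(best_value, best_ad)
```

The number of recursive calls grows as (N · M) raised to the remaining horizon, so this sketch is only practical for a few ads and a short horizon.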
15. Multi-armed bandit based approximation (cont.)
The UCB1-NORMAL-COR algorithm:
function PLAN(θ, Σ, Ψ(t))
    array V ← 0
    loop i ← 1 to N
        if t_i < 8 log t then                     // t_i is the number of times ad i has been selected.
            return i
        end if
    end loop
    [θ′, Σ′] ← UPDATEBELIEF(θ, Σ, Ψ(t))           // New belief of all ads with all available information; Equations (13)-(15).
    loop i ← 1 to N
        V[i] ← θ′_i + sqrt(16 · (q_i − t_i θ′_i²)/(t_i − 1) · ln(t − 1)/t_i)   // Expected reward.
    end loop
    return [MAX(V), MAXINDEX(V)]
end function
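A minimal Python sketch of the selection step in this algorithm follows. It assumes the belief means passed in have already been updated with all available observations, and it returns only the chosen ad index. The argument names, the `t_i < 2` guard and the clamp at zero are assumptions added so the sketch runs safely.

```python
import math


def plan(theta, counts, sq_sums, t):
    """Pick an ad index for round t.

    theta   -- belief means after the correlated update
    counts  -- t_i, number of times each ad has been selected
    sq_sums -- q_i, sum of squared payoffs observed for each ad
    """
    if t <= sum(counts):
        raise ValueError("round t must exceed the number of selections so far")
    # Forced exploration: play any ad selected fewer than 8 log t times.
    # The extra "t_i < 2" guard is an assumption that keeps t_i - 1 positive.
    for i, t_i in enumerate(counts):
        if t_i < 2 or t_i < 8 * math.log(t):
            return i
    scores = []
    for mean, t_i, q_i in zip(theta, counts, sq_sums):
        # Clamp at zero: with a correlated mean this estimate can go negative.
        spread = max((q_i - t_i * mean ** 2) / (t_i - 1), 0.0)
        scores.append(mean + math.sqrt(16.0 * spread * math.log(t - 1) / t_i))
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical state: both ads played 40 times; ad 1 has the lower mean
# but a much larger payoff spread, so its exploration term dominates.
choice = plan(theta=[5.0, 4.0], counts=[40, 40], sq_sums=[1039.0, 1615.0], t=81)
print(choice)
```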
16. Experiment datasets
[Figure: diagram of advertisers, an ad network/exchange and publishers, labelled with the Google AdWords Traffic Estimator service.]
• publishers gain 68% of advertisers’ spending (2003);
• data was collected from 12/2011 to 05/2012;
• 512 different keywords, 310 with non-zero mean payoff, 8
categories;
• 20% for training and 80% for testing;
• we consider each keyword to be an ad.
17. Competing algorithms
We compare the following algorithms,
• RANDOM policy, which selects candidates uniformly at random;
• MYOPIC policy, which selects based on the expected immediate reward;
• UCB1 policy, which assumes independence between arms and does not model the reward distribution;
• UCB1-NORMAL policy, which assumes independence between arms and Gaussian-distributed rewards;
• VI-COR policy, which solves value iteration using Monte Carlo sampling; and
• UCB1-NORMAL-COR policy, which considers the dependencies between candidates.
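The three simplest baselines in the list above can be sketched as follows. The UCB1 index used here is the one from Auer et al. (2002), whose analysis assumes rewards scaled to [0, 1]; how the payoffs were scaled in the experiments is not stated in the slides, and the function names are assumptions.

```python
import math
import random


def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def random_policy(n_ads, rng):
    """RANDOM: pick an ad uniformly at random."""
    return rng.randrange(n_ads)


def myopic_policy(theta):
    """MYOPIC: pick the ad with the highest expected immediate reward."""
    return argmax(theta)


def ucb1_policy(means, counts, t):
    """UCB1: play each ad once, then maximise mean + sqrt(2 ln t / t_i)."""
    for i, t_i in enumerate(counts):
        if t_i == 0:
            return i
    scores = [m + math.sqrt(2.0 * math.log(t) / t_i) for m, t_i in zip(means, counts)]
    return argmax(scores)


# Hypothetical state with three ads.
print(random_policy(3, random.Random(0)))
print(myopic_policy([1.0, 3.0, 2.0]))
print(ucb1_policy([0.5, 0.4, 0.3], [10, 10, 0], t=21))
```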
18. Results
Dataset      MYOPIC  RANDOM  UCB1   UCB1-N  VI-COR  UCB1-N-COR
Education    21.9    23.0    30.9   30.9    41.2*   27.6
Finance-1    38.5    27.8    40.9   26.4    44.5    27.4
Finance-2    22.1    16.5    30.6   22.8    38.0*   22.9
Information  14.1    12.9    27.8   15.9    29.4    15.9
P&O          41.6    30.4    50.5   31.4    72.9*   63.3
Shopping-1   17.4    10.6    42.3   16.1    40.2    16.4
Shopping-2   29.9    14.5    34.3   75.3    52.9    79.2*
Shopping-3    9.7     4.3    21.9   18.3    27.3    19.4
P&S          24.7    26.0    47.2   57.1    67.9*   59.9
Medical      30.5    19.6    52.7   32.2    58.0*   33.5
Table: Cumulative payoffs, averaged over 8 chunks and then normalized with respect to the GOLDEN policy for easier comparison. A * marks the highest cumulative payoff in a row when its difference from the second best is significant under the Wilcoxon signed-rank test. UCB1-N stands for UCB1-NORMAL. P&O is “People & organisations” and P&S is “Products & services”.
19. Results (cont.)
[Figure: plot of cumulative payoff for the VI-COR, UCB1-NORMAL-COR, UCB1-NORMAL, UCB1, GOLDEN, MYOPIC and RANDOM policies.]
Figure: Cumulative payoff on the “People & organisations” category, 5 candidates.
20. Results (cont.)
[Figure: normalized cumulative payoff (0 to 1) of MYOPIC, VI-COR, UCB1-NORMAL and UCB1-NORMAL-COR on the Edu, F-1, F-2, Info, P&O, S-1, S-2, S-3, P&S and Med datasets.]
Figure: Comparison of accumulated payoffs on the 10 datasets. VI-COR always performed better than MYOPIC and UCB1-NORMAL-COR always performed better than UCB1-NORMAL across all datasets.
21. Results (cont.)
[Figure: daily payoff (0 to 5000) over days 0 to 150 for the candidates “best phones” and “term insurance”.]
Figure: Special case: the daily payoff of two candidates with a sudden change.
22. Results (cont.)
[Figure: cumulative payoff (×10^4) of GOLDEN, MYOPIC, VI-COR and UCB1-NORMAL-COR against the noise factor $\sigma_0^2$, from 10^-2 to 10^4.]
Figure: The impact of the noise factor $\sigma_0^2$ for the situation in the previous figure.
$$\theta_{s(t+1)}(t+1) = \theta_{s(t+1)}(t) + \sigma_{s(t),s(t+1)}\, \frac{x_{s(t)}(t) - \theta_{s(t)}(t)}{\sigma_{s(t)}^2(t) + \sigma_0^2}$$
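The role of the noise factor in the update above can be illustrated numerically. The sketch computes the weight that multiplies the surprise x − θ for a correlated ad. The covariance and variance values are hypothetical, and the function name is an assumption.

```python
def correlated_gain(cov, var_selected, noise_var):
    """Weight applied to the surprise x - theta when updating a correlated ad."""
    return cov / (var_selected + noise_var)


# Hypothetical belief: covariance 3 between the two ads, variance 4 for the selected ad.
gains = {
    noise_var: correlated_gain(3.0, 4.0, noise_var)
    for noise_var in (1e-2, 1.0, 1e2, 1e4)
}
for noise_var, gain in gains.items():
    print(f"noise={noise_var:g}  gain={gain:.4f}")
```

As the noise factor grows, the weight shrinks towards zero, so each observation moves the belief less and a policy reacts more slowly to a sudden change in payoff.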
23. Future work
• correlated update: if ad a1 on webpage w1 was shown to user u1 and we observed its performance, what is our belief about the performance of ad a2 on webpage w2 when shown to user u2, given known correlations?
• multiple ads with diversification (another exploration and exploitation dilemma);
• a better solution for our continuous POMDP problem.