Personalized List Recommendation based on Multi-armed Bandit Algorithms
Weiwen LIU
Computer Science & Engineering
Chinese University of Hong Kong
wwliu@cse.cuhk.edu.hk
wwliu, Term Presentation, Term 1
Content
o Background
• Existing Methods
• Multi-armed Bandits
• Dependency Click Model
o Algorithm
o Results
o Experiments
o Conclusion and Future Work
Background
o For users:
• How to discover interesting items (music, news, apps) among a large number of items.
o For companies:
• How to create economic opportunities.
• How to provide better personalized services.
Existing methods
o Content-based Method/Collaborative Filtering
• Pros: performs well when the user has enough click or download records.
• Cons: cold-start problem
o Context-based Method/Regression
• Pros: efficient and easy to implement
• Cons: lack of diversity
Exploration vs Exploitation?
Multi-armed bandits
o Rewards $\boldsymbol{x}_{i,1}, \boldsymbol{x}_{i,2}, \dots$ of machine $i$ are i.i.d. $\{0,1\}$-valued random variables
o An allocation policy prescribes which machine $\boldsymbol{I}_t$ to play at time $t$ based on the realization of $\boldsymbol{x}_{\boldsymbol{I}_1,1}, \dots, \boldsymbol{x}_{\boldsymbol{I}_{t-1},t-1}$
o The target is to play as often as possible the machine with the largest reward expectation
$$\mu^* = \max_{i=1,\dots,K} \mathbb{E}[x_i]$$
Bandit Solutions
o Stochastic Bandits:
• Select items repeatedly and separately, one at a time
• Limitations: ignores the underlying relations; high computational cost
o Combinatorial Cascade Bandits:
• Select a set or sequence of arms
• Limitations: can only deal with the single-click setting
Click Models
o Cascade Click Model:
• Stop when the first click occurs
• Can only model a single click
o Dependency Click Model:
• Introduce a set of termination parameters
• Can handle settings with multiple clicks
Dependency Click Model
o Allows the user to continue checking more items after a click.
o An extension of the Cascade Model (CM)
• Reduces to CM if the termination weights $\bar{v}(k) = 1$
[Flowchart of one DCM user session. Nodes: Start; Examine next item $a_k$; Attracted by the item? (labelled $\bar{w}(a_k)$); Would like to terminate? (labelled $\bar{v}(k)$); Reach the end of the list? Outcomes: Satisfied, Not satisfied.]
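The user behaviour described by the Dependency Click Model can be sketched as a short simulation. This is a minimal illustration rather than code from the presentation: the function name `simulate_dcm` and its arguments are hypothetical, and the attraction and termination probabilities are simply passed in as lists indexed by position.

```python
import random

def simulate_dcm(attraction, termination, rng):
    """Simulate one user pass over a ranked list under the dependency click model.

    attraction[k]  : probability that the item at position k attracts the user
    termination[k] : probability that the user stops after a click at position k
    Returns (clicks, satisfied): the 0/1 click vector and whether the user
    clicked and then terminated.
    """
    clicks = [0] * len(attraction)
    for k, (w, v) in enumerate(zip(attraction, termination)):
        if rng.random() < w:        # attracted: the user clicks the item
            clicks[k] = 1
            if rng.random() < v:    # satisfied: the user stops examining
                return clicks, True
    # The end of the list was reached without a terminating click.
    return clicks, False

rng = random.Random(0)
print(simulate_dcm([0.6, 0.3, 0.5], [0.7, 0.4, 0.2], rng))
```

Setting every termination probability to 1 makes the user stop at the first click, which is the Cascade Model special case.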
Problem Formulation
o Given the ground item set $E = \{1, \dots, L\}$, a contextual vector $\boldsymbol{x}_{i,t} \in \mathbb{R}^d$ is known to the agent at time $t$.
o Attraction weight $\boldsymbol{w}_t(a) \in \{0,1\}^E$
• is a $w_t(a)$-biased Bernoulli random variable
• denotes whether the user is attracted by $a$ or not
• the attraction weights $\{\boldsymbol{w}_t(a)\}_{t=1}^{n}$ are i.i.d.
o Termination weight $\boldsymbol{v}_t(k) \in \{0,1\}^K$
• is a $\bar{v}(k)$-biased Bernoulli random variable
• denotes whether the user wants to terminate examining the list
• only depends on the position $k$
• the termination weights $\{\boldsymbol{v}_t(k)\}_{t=1}^{n}$ are i.i.d.
Recommended list $\boldsymbol{A}_t = (\boldsymbol{a}_1^t, \dots, \boldsymbol{a}_K^t)$
Feedback 010 ⋯ 100
Objective
o The reward function is defined as
$$f(A, v, w) = 1 - \prod_{k=1}^{K} \bigl(1 - v(k)\, w(a_k)\bigr)$$
indicating that $f(\boldsymbol{A}_t, \boldsymbol{v}_t, \boldsymbol{w}_t) = 1$ if the user clicks on an item, feels satisfied and terminates the examination.
o The pseudo-regret is defined as
$$\mathcal{R}(n) = \mathbb{E}\left[\sum_{t=1}^{n} \bigl(f(A_t^*, v_t, w_t) - f(A_t, v_t, w_t)\bigr)\right]$$
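The reward function can be evaluated directly from its product form. The sketch below is illustrative only; `dcm_reward` is a hypothetical name, and the two lists hold $w(a_k)$ and $v(k)$ for positions $k = 1, \dots, K$.

```python
def dcm_reward(attraction, termination):
    """f(A, v, w) = 1 - prod_k (1 - v(k) * w(a_k)) for one ranked list."""
    miss = 1.0
    for w, v in zip(attraction, termination):
        miss *= 1.0 - v * w   # position k does not produce a terminating click
    return 1.0 - miss

# Binary realizations: a click at position 1 without termination gives no reward.
print(dcm_reward([1, 0], [0, 1]))            # 0.0
# A click at position 2 followed by termination gives reward 1.
print(dcm_reward([0, 1], [1, 1]))            # 1.0
# With probabilities instead of realizations the same formula gives a value in [0, 1].
print(dcm_reward([0.5, 0.5], [1.0, 1.0]))    # 0.75
```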
Partial Knowledge
o The click sequence is the only feedback for the agent
• The termination position is unobserved
• The reward is not revealed
[Figure: the click sequence 010011000 over positions 1 to 9, shown twice, once with reward=1 and once with reward=0]
Use feedback before the last click to update the model
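A minimal sketch of how the usable part of a click sequence might be extracted. It rests on two assumptions that the slides do not spell out: "before the last click" is taken to include the clicked position itself, and a list with no click at all is treated as fully examined (under the DCM a user only terminates after a click). The helper name `observed_feedback` is hypothetical.

```python
def observed_feedback(clicks):
    """Return the (position, click) pairs the agent can safely learn from.

    Positions are 1-based. Everything after the last click is dropped because
    the agent cannot tell whether the user examined those items or had already left.
    """
    if 1 not in clicks:
        # No click: the user never terminated, so every position was examined.
        return list(enumerate(clicks, start=1))
    last = len(clicks) - clicks[::-1].index(1)   # 1-based position of the last click
    return list(enumerate(clicks[:last], start=1))

# The click sequence 010011000 from the slide: positions 7 to 9 are ambiguous.
print(observed_feedback([0, 1, 0, 0, 1, 1, 0, 0, 0]))
```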
Proposed Model: attraction weight
o Assume the expected attraction weight $w_t(a)$ follows
$$w_t(a) = \mathbb{E}[\boldsymbol{w}_t(a) \mid \mathcal{H}_t] = \mu(\theta_*^\top x_{t,a})$$
o Use the generalized linear model as a flexible extension
• Admits a wider range of distributions, e.g. Gaussian, binomial, Poisson…
Attracted or Not?
Proposed Model: termination weight
o Due to the limited feedback, we assume the order of the expected termination weights is known
• For simplicity of explanation, assume
$$\bar{v}(1) \ge \dots \ge \bar{v}(K)$$
o The expected reward is maximized by recommending the more attractive items at the higher positions.
Terminate or Not?
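The claim that attractive items belong at the top positions can be checked by brute force on a toy instance. The numbers below are made up for illustration, and the expected reward is computed as $1 - \prod_k (1 - \bar{v}(k)\,\bar{w}(a_k))$, which assumes that attraction and termination are independent.

```python
from itertools import permutations

def expected_reward(attraction, termination):
    """1 - prod_k (1 - v_bar(k) * w_bar(a_k)), assuming independent clicks and terminations."""
    miss = 1.0
    for w, v in zip(attraction, termination):
        miss *= 1.0 - v * w
    return 1.0 - miss

w_bar = {"a": 0.2, "b": 0.9, "c": 0.5, "d": 0.7}   # hypothetical expected attraction weights
v_bar = [0.8, 0.5, 0.3]                             # hypothetical, non-increasing in position

# Greedy list: the K most attractive items, most attractive first.
greedy = tuple(sorted(w_bar, key=w_bar.get, reverse=True)[:len(v_bar)])

# Exhaustive search over every ordered list of K distinct items.
best = max(permutations(w_bar, len(v_bar)),
           key=lambda A: expected_reward([w_bar[a] for a in A], v_bar))

print(greedy, best)
```

On this instance the greedy list and the exhaustive optimum coincide. The pairwise reason is that swapping two positions changes the product by $-(\bar{v}(1) - \bar{v}(2))(\bar{w}(a_1) - \bar{w}(a_2))$, so matching larger termination weights with larger attraction weights never hurts.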
Proposed Model: parameter estimation
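o The model parameter $\theta$ can be estimated using MLE:
$$\sum_{s=1}^{t} \sum_{k=1}^{C_t} \bigl(w_s(a_k^s) - \mu(\theta^\top x_{s,a_k^s})\bigr)\, x_{s,a_k^s} = 0.$$
o Upper Confidence Bound (UCB):
$$U_t(a) = \min\bigl\{\mu(\tilde{\theta}_{t-1}^\top x_{t,a}) + \rho(t-1)\, \|x_{t,a}\|_{V_{t-1}^{-1}},\ 1\bigr\},$$
where $\beta_t^a(\delta) = \rho(t)\, \|x_{t,a}\|_{V_t^{-1}}$.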
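Lemma: For any $t \ge 1$ and $a \in E$, denote
$$\beta_t^a(\delta) = \frac{2 k_\mu}{c_\mu}\, \|x_{t,a}\|_{V_t^{-1}} \sqrt{\log \frac{\bigl(1 + \frac{Kt}{\lambda d}\bigr)^d}{\delta^2}}.$$
For all $0 \le \delta \le 1$, with probability at least $1 - \delta$, it holds that
$$\bigl|\mu(\theta_*^\top x_{t,a}) - \mu(\tilde{\theta}_t^\top x_{t,a})\bigr| \le \beta_t^a(\delta), \quad \forall t \ge 1.$$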
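The estimation and UCB steps can be sketched in dependency-free Python. Several choices here are assumptions made for illustration and are not taken from the slides: the link $\mu$ is the logistic function, the MLE root is approximated by plain gradient ascent, $V$ is taken to be the usual regularized design matrix $\lambda I + \sum x x^\top$, and the exploration factor `rho` is passed in as a constant. All function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def solve(V, b):
    """Solve V y = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [bi] for row, bi in zip(V, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_theta(xs, ys, steps=500, lr=0.5):
    """Approximate the root of sum_s (y_s - mu(theta^T x_s)) x_s = 0 by gradient ascent."""
    d = len(xs[0])
    theta = [0.0] * d
    for _ in range(steps):
        grad = [0.0] * d
        for x, y in zip(xs, ys):
            err = y - sigmoid(sum(t * xi for t, xi in zip(theta, x)))
            for j in range(d):
                grad[j] += err * x[j]
        theta = [t + lr * g / len(xs) for t, g in zip(theta, grad)]
    return theta

def ucb(theta, x, V, rho):
    """U(a) = min{ mu(theta^T x) + rho * ||x||_{V^{-1}}, 1 }."""
    width = math.sqrt(sum(xi * yi for xi, yi in zip(x, solve(V, x))))
    mean = sigmoid(sum(t * xi for t, xi in zip(theta, x)))
    return min(mean + rho * width, 1.0)

# One feature, three clicks out of four observations: the MLE satisfies mu(theta) = 0.75.
print(fit_theta([[1.0]] * 4, [1, 1, 1, 0]))
```

Capping the index at 1 reflects that the attraction weight is a probability, so a larger optimistic estimate would carry no extra information.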
Proposed Model: UCB
o Analyze the mean and a measure of uncertainty (variance) for each item
o Make decisions based on mean + variance
[Figure: items A, B and C plotted on an axis with ticks 0, 0.2, 0.4]
Proposed Model: UCB
o The value of $\rho(t)$ decreases w.r.t. $t$
o The uncertainty of $A$ reduces after several time steps
o Automatically balances exploration and exploitation
[Figure: items A, B and C plotted on an axis with ticks 0, 0.2, 0.4]
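One way to see why the uncertainty of a repeatedly shown item shrinks is a small worked case. Suppose, as a simplifying assumption not made on the slides, that the same feature vector $x \neq 0$ has been observed $n \ge 1$ times and nothing else, so that $V_n = \lambda I + n\, x x^\top$ with $\lambda > 0$. The Sherman-Morrison formula then gives the norm term of the confidence width in closed form:

```latex
\|x\|_{V_n^{-1}}^2
  = x^\top \bigl(\lambda I + n\, x x^\top\bigr)^{-1} x
  = \frac{\|x\|_2^2}{\lambda + n \|x\|_2^2}
\quad\Longrightarrow\quad
\|x\|_{V_n^{-1}} \le \frac{1}{\sqrt{n}} .
```

Under this assumption the exploration bonus of an item falls at rate $1/\sqrt{n}$ in the number of times it has been observed, so rarely shown items keep a larger bonus and get explored.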
Proposed Model: Algorithm
[Algorithm figure with three annotated steps: Recommend based on UCB; Estimate $\theta$; Update statistics]
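The three annotated steps can be put together in a toy loop. This is a sketch under loudly stated assumptions, not the algorithm from the presentation: it fixes $d = 2$ so the matrix inverse can be written by hand, uses six items with fixed random contexts, a logistic link, a constant exploration factor `rho`, a made-up true parameter `theta_star` and made-up termination weights `v_bar`, and it replaces the exact MLE with a few gradient-ascent steps per round.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def run(n=100, v_bar=(0.8, 0.4), rho=0.5, lam=1.0, seed=0):
    rng = random.Random(seed)
    K = len(v_bar)
    theta_star = [1.0, -1.0]                     # hypothetical true parameter
    items = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(6)]
    theta = [0.0, 0.0]
    V = [[lam, 0.0], [0.0, lam]]                 # regularized design matrix
    data, lists = [], []
    for _ in range(n):
        # Recommend based on UCB: the K items with the largest index, best first.
        det = V[0][0] * V[1][1] - V[0][1] * V[1][0]
        Vinv = [[V[1][1] / det, -V[0][1] / det], [-V[1][0] / det, V[0][0] / det]]

        def ucb(x):
            width = math.sqrt(max(dot(x, [dot(row, x) for row in Vinv]), 0.0))
            return min(sigmoid(dot(theta, x)) + rho * width, 1.0)

        A = sorted(range(len(items)), key=lambda a: ucb(items[a]), reverse=True)[:K]
        lists.append(A)
        # Simulated DCM user: click with probability mu(theta_star^T x), then maybe stop.
        clicks = [0] * K
        for k, a in enumerate(A):
            if rng.random() < sigmoid(dot(theta_star, items[a])):
                clicks[k] = 1
                if rng.random() < v_bar[k]:
                    break
        # Update statistics using positions up to the last click (all of them if no click).
        last = K if 1 not in clicks else K - clicks[::-1].index(1)
        for k in range(last):
            x = items[A[k]]
            data.append((x, clicks[k]))
            for i in range(2):
                for j in range(2):
                    V[i][j] += x[i] * x[j]
        # Estimate theta: a few gradient-ascent steps on the logistic log-likelihood.
        for _ in range(10):
            g = [0.0, 0.0]
            for x, y in data:
                err = y - sigmoid(dot(theta, x))
                g = [gi + err * xi for gi, xi in zip(g, x)]
            theta = [ti + 0.5 * gi / len(data) for ti, gi in zip(theta, g)]
    return theta, V, lists

theta_hat, V_final, recommended = run()
print(theta_hat)
```

Sorting by the UCB index in descending order places the most promising item at the first position, in line with the assumed ordering $\bar{v}(1) \ge \dots \ge \bar{v}(K)$.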
Theoretical Results
o The regret upper bound is $O(d\sqrt{n}\log n)$, which depends linearly on the dimension $d$ of the feature space but not on the number $L$ of base arms.
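Theorem: If the reward function is given as $f(A, v, w) = 1 - \prod_{k=1}^{K} (1 - v(k)\, w(a_k))$, then the cumulative regret $\mathcal{R}(n)$ of the proposed algorithm has the following bound:
$$\mathcal{R}(n) \le \frac{4 K \Delta_v k_\mu}{c_\mu p^*} \sqrt{d n (K+1) \log \frac{\bigl(1 + \frac{Kn}{\lambda d}\bigr)^d}{\delta^2}\, \log\Bigl(1 + \frac{Kn}{\lambda d}\Bigr)},$$
where $k_\mu$ is the Lipschitz constant and $c_\mu = \inf \mu'$.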
Experimental Results
o Synthetic data
• L=200, K=4 and d=10
• $\mu(x) = \frac{1}{1 + \exp(-x)}$
o GL-CDCM outperforms KL-DCM by 80.27% and Lin-CDCM by 49.04%.
Experimental Results
o Real-world data
• MovieLens 20M dataset
• L=200, K=5, d=100
o GL-CDCM reaches 5.69 times the value of KL-DCM and 1.45 times that of Lin-CDCM
Conclusion
o Conclusion
• Formulate the DCM bandits problem
• Incorporate contextual information
• Make a weaker assumption on the expected attraction weight function
• Prove an upper regret bound
o Future work
• Prove a tighter bound
• Consider other practical click models
• Verify the effectiveness using more real-world datasets
Thank you