Multi-armed bandits
• At each time t, pick an arm i
• Get an independent payoff f_t with mean μ_i
• Classic model for the exploration-exploitation tradeoff
• Extensively studied (Robbins ’52, Gittins ’79)
• Typically assume each arm is tried multiple times
• Goal: minimize regret
[Figure: K arms with mean payoffs μ_1, μ_2, μ_3, …, μ_K]
$R_T = T\,\mu_{\mathrm{opt}} - \sum_{t=1}^{T} \mathbb{E}[f_t]$
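To make the regret definition concrete, here is a minimal Python sketch. The function name `cumulative_regret` and the three arm means are illustrative assumptions, not taken from the slides. It computes T times the best mean payoff minus the summed mean payoffs of the arms actually pulled.

```python
def cumulative_regret(means, choices):
    """Return T * mu_opt minus the sum of mean payoffs of the chosen arms.

    means:   list of mean payoffs, one per arm (hypothetical values below)
    choices: list of arm indices pulled at times t = 1..T
    """
    mu_opt = max(means)
    return len(choices) * mu_opt - sum(means[i] for i in choices)


# Hypothetical 3-armed bandit: compare a poor policy with the optimal one.
means = [0.2, 0.5, 0.9]
always_first = [0] * 10   # never explores, regret grows linearly in T
always_best = [2] * 10    # pulls the best arm every round, regret stays at zero

print(cumulative_regret(means, always_first))
print(cumulative_regret(means, always_best))
```

A policy has "no regret" in the sense used here when this quantity grows sublinearly in T.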
Infinite-armed bandits
[Figure: arms with payoffs p1, p2, p3, …, pk, …, p∞]
In many applications, number of arms is huge
(sponsored search, sensor selection)
Cannot try each arm even once
Assumptions on payoff function f essential
Optimizing Noisy, Unknown Functions
• Given: Set of possible inputs D;
black-box access to unknown function f
• Want: Adaptive choice of inputs $x_1, x_2, \dots$
from D maximizing the total payoff $\sum_{t=1}^{T} f(x_t)$
• Many applications: robotic control [Lizotte
et al. ’07], sponsored search [Pande &
Olston, ’07], clinical trials, …
• Sampling is expensive
• Algorithms evaluated using regret
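Goal: minimize the regret $R_T = \sum_{t=1}^{T} \left( f(x^*) - f(x_t) \right)$, where $x^* = \arg\max_{x \in D} f(x)$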
Running example: Noisy Search
• How to find the hottest point in a building?
• Many noisy sensors available but sampling is expensive
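• D: set of sensors; $f(x_i)$: temperature at the sensor $x_i$ chosen at step i
• Observe a noisy reading $y_i = f(x_i) + \epsilon_i$
• Goal: Find $x^* = \arg\max_{x \in D} f(x)$ with a minimal number of queries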
Relating to us: Active learning for PMF
A bandit setting for movie recommendation
Task: recommend movies for a new user
M-armed Bandit
Movie item as arm of bandit
For a new user i
At each round t, pick a movie j
Observe a rating Xij
Goal: maximize cumulative reward
sum of the ratings of all recommended movies
Model: PMF (probabilistic matrix factorization)
X = UV + E, where
U: N×K matrix, V: K×M matrix, E: N×M noise matrix with zero-mean, normally distributed entries
Assume the movie features V are fully observed; the user feature vector U_i is unknown at first
X_i(j) = U_i V_j + ε (regard the i-th row of X as a function X_i)
X_i(·): random linear function
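The linear model above admits a closed-form Gaussian posterior over the unknown user vector U_i, which a bandit strategy can use to pick the next movie. The sketch below assumes an isotropic Gaussian prior on U_i and a UCB-style selection rule; the names `posterior_user_features` and `pick_movie`, and the parameters `prior_var`, `noise_var` and `beta`, are hypothetical and not from the slides.

```python
import numpy as np


def posterior_user_features(V_seen, ratings, noise_var=1.0, prior_var=1.0):
    """Gaussian posterior over U_i given ratings x_j = U_i . V_j + eps.

    V_seen:  (K, n) array, columns are feature vectors of the rated movies
    ratings: (n,) array of observed ratings
    Assumes prior U_i ~ N(0, prior_var * I) and noise eps ~ N(0, noise_var).
    """
    K = V_seen.shape[0]
    precision = np.eye(K) / prior_var + V_seen @ V_seen.T / noise_var
    cov = np.linalg.inv(precision)
    mean = cov @ (V_seen @ ratings) / noise_var
    return mean, cov


def pick_movie(V, mean, cov, beta=2.0):
    """Pick the movie with the largest predicted rating plus scaled uncertainty."""
    predicted = mean @ V                                   # (M,) predicted ratings
    std = np.sqrt(np.einsum("km,kl,lm->m", V, cov, V))     # (M,) predictive std
    return int(np.argmax(predicted + np.sqrt(beta) * std))
```

With no ratings yet, the posterior equals the prior and the rule favors movies with large feature norm, i.e. those whose rating is most uncertain.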
Key insight: Exploit correlation
• Sampling f(x) at one point x yields information about f(x’)
for points x’ near x
• In this paper:
Model correlation using a Gaussian process (GP) prior for f
[Figure: Temperature is spatially correlated]
Gaussian Processes to model payoff f
• Gaussian process (GP) = normal distribution over functions
• Finite marginals are multivariate Gaussians
• Closed form formulae for Bayesian posterior update exist
• Parameterized by covariance function K(x,x’) = Cov(f(x),f(x’))
[Figure: Normal distribution (1-D Gaussian); multivariate normal (n-D Gaussian); Gaussian process (∞-D Gaussian)]
Thinking about GPs
• Kernel function K(x, x’) specifies covariance
• Encodes smoothness assumptions
[Figure: f(x) versus x, with the marginal density P(f(x))]
Example of GPs
• Squared exponential kernel
$K(x,x') = \exp\left(-(x-x')^2 / h^2\right)$
[Figure: Samples from P(f) for bandwidth h = 0.1 and bandwidth h = 0.3; kernel value as a function of distance |x-x'|]
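Sample paths like those in the figure can be drawn by evaluating the squared-exponential kernel on a grid and sampling from the resulting multivariate Gaussian. This is a minimal sketch, assuming a zero-mean prior; the grid, the seed, the small `jitter` term added for numerical stability, and the function names are illustrative choices, not part of the slides.

```python
import numpy as np


def se_kernel(a, b, h):
    """Squared-exponential kernel K(x, x') = exp(-(x - x')^2 / h^2) for 1-D inputs."""
    diff = a[:, None] - b[None, :]
    return np.exp(-(diff ** 2) / h ** 2)


def sample_gp_prior(x, h, n_samples=3, jitter=1e-6, seed=0):
    """Draw sample functions from a zero-mean GP prior evaluated on the grid x."""
    rng = np.random.default_rng(seed)
    K = se_kernel(x, x, h) + jitter * np.eye(len(x))   # jitter keeps K positive definite
    L = np.linalg.cholesky(K)
    z = rng.standard_normal((len(x), n_samples))
    return (L @ z).T                                   # shape (n_samples, len(x))


x = np.linspace(0.0, 1.0, 50)
rough = sample_gp_prior(x, h=0.1)    # small bandwidth: wiggly functions
smooth = sample_gp_prior(x, h=0.3)   # large bandwidth: smoother functions
```

A smaller bandwidth makes nearby function values less correlated, which is why the h = 0.1 samples vary faster than the h = 0.3 samples.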
Gaussian process optimization
[e.g., Jones et al ’98]
[Figure: f(x) versus x]
Goal: Adaptively pick inputs $x_1, x_2, \dots$ such that $\frac{1}{T}\sum_{t=1}^{T} f(x_t) \to \max_{x \in D} f(x)$
Key question: how should we pick samples?
So far, only heuristics:
Expected Improvement [Močkus et al. ’78]
Most Probable Improvement [Močkus ’89]
Used successfully in machine learning [Ginsbourger et al. ’08, Jones ’01, Lizotte et al. ’07]
No theoretical guarantees on their regret!
Simple algorithm for GP optimization
• In each round t do:
• Pick $x_t = \arg\max_{x \in D} \mu_{t-1}(x)$
• Observe $y_t = f(x_t) + \epsilon_t$
• Use Bayes’ rule to get the posterior mean $\mu_t(x)$
Can get stuck in local maxima!
[Figure: f(x) versus x]
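The Bayesian update in each round has a closed form. The sketch below assumes a zero-mean GP prior, the squared-exponential kernel from the earlier slide, 1-D inputs and Gaussian observation noise; the function names and the default bandwidth and noise variance are assumptions made for illustration.

```python
import numpy as np


def se_kernel(a, b, h=0.3):
    """Squared-exponential kernel K(x, x') = exp(-(x - x')^2 / h^2) for 1-D inputs."""
    diff = a[:, None] - b[None, :]
    return np.exp(-(diff ** 2) / h ** 2)


def gp_posterior(x_obs, y_obs, x_query, noise_var=0.01, h=0.3):
    """Posterior mean and variance of f at x_query given noisy observations."""
    K = se_kernel(x_obs, x_obs, h) + noise_var * np.eye(len(x_obs))
    K_s = se_kernel(x_obs, x_query, h)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))   # K^{-1} y
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = 1.0 - np.sum(v ** 2, axis=0)        # prior variance K(x, x) is 1
    return mean, np.maximum(var, 0.0)         # clip tiny negative round-off


def pick_greedy(mean):
    """The simple rule: query the point with the highest posterior mean."""
    return int(np.argmax(mean))
```

Because `pick_greedy` ignores the posterior variance, it never revisits regions that look poor after a few samples, which is one way to see why it can get stuck in a local maximum.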
Uncertainty sampling
Pick: $x_t = \arg\max_{x \in D} \sigma_{t-1}^2(x)$
That’s equivalent to (greedily) maximizing information gain
Popular objective in Bayesian experimental design
(where the goal is pure exploration of f)
But…wastes samples by exploring f everywhere!
[Figure: f(x) versus x]
Avoiding unnecessary samples
Key insight: Never need to sample where Upper Confidence Bound (UCB) < best lower bound!
[Figure: f(x) versus x, with the best lower bound marked]
Upper Confidence Bound (UCB) Algorithm
Naturally trades off explore and exploit; no samples wasted
Regret bounds: classic [Auer ’02] & linear f [Dani et al. ’07]
But none in the GP optimization setting! (popular heuristic)
[Figure: f(x) versus x]
Pick input that maximizes Upper Confidence Bound (UCB): $x_t = \arg\max_{x \in D} \mu_{t-1}(x) + \beta_t^{1/2}\,\sigma_{t-1}(x)$
How should we choose $\beta_t$? Need theory!
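The UCB rule can be sketched as a short loop over a finite candidate set D: at each round, query the point whose posterior mean plus a scaled posterior standard deviation is largest, then update the GP. This is a sketch under stated assumptions, not the authors' code: it uses a zero-mean prior with a squared-exponential kernel, and the confidence schedule β_t = 2 log(|D| t² π² / (6δ)) is one logarithmically growing choice used here for illustration. The function names and defaults are hypothetical.

```python
import numpy as np


def se_kernel(a, b, h=0.3):
    """Squared-exponential kernel K(x, x') = exp(-(x - x')^2 / h^2) for 1-D inputs."""
    diff = a[:, None] - b[None, :]
    return np.exp(-(diff ** 2) / h ** 2)


def gp_posterior(x_obs, y_obs, x_query, noise_var, h):
    """Closed-form posterior mean and variance under a zero-mean GP prior."""
    K = se_kernel(x_obs, x_obs, h) + noise_var * np.eye(len(x_obs))
    K_s = se_kernel(x_obs, x_query, h)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    v = np.linalg.solve(L, K_s)
    return K_s.T @ alpha, np.maximum(1.0 - np.sum(v ** 2, axis=0), 0.0)


def gp_ucb(f, D, T=30, noise_std=0.1, h=0.3, delta=0.1, seed=0):
    """Run T rounds of upper-confidence-bound sampling on the finite set D."""
    rng = np.random.default_rng(seed)
    xs, ys = [], []
    mean, var = np.zeros(len(D)), np.ones(len(D))      # prior: mean 0, variance 1
    for t in range(1, T + 1):
        # Assumed schedule: grows like log t, as the analysis requires.
        beta_t = 2.0 * np.log(len(D) * t ** 2 * np.pi ** 2 / (6.0 * delta))
        idx = int(np.argmax(mean + np.sqrt(beta_t) * np.sqrt(var)))
        xs.append(D[idx])
        ys.append(f(D[idx]) + noise_std * rng.normal())  # noisy observation
        mean, var = gp_posterior(np.array(xs), np.array(ys), D, noise_std ** 2, h)
    return np.array(xs), np.array(ys)


# Hypothetical objective with its maximum at x = 0.7.
D = np.linspace(0.0, 1.0, 21)
xs, ys = gp_ucb(lambda x: -(x - 0.7) ** 2, D, T=30)
```

A larger β_t widens the confidence band and pushes the rule toward exploration; a smaller one makes it behave more like the greedy posterior-mean rule.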
How well does UCB work?
• Intuitively, performance should depend on how
“learnable” the function is
[Figure: “Easy” function (bandwidth h = 0.3) versus “Hard” function (bandwidth h = 0.1)]
The quicker confidence bands collapse, the easier to learn
Key idea: Rate of collapse ⇔ growth of information gain
Learnability and information gain
• We show that regret bounds depend on how quickly we can
gain information
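• Mathematically: regret is bounded in terms of the maximal information gain $\gamma_T = \max_{A \subseteq D,\ |A| \le T} I(f; y_A)$
• Establishes a novel connection between GP optimization
and Bayesian experimental design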
Performance of optimistic sampling
Theorem
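If we choose $\beta_t = \Theta(\log t)$, then with high probability,
$R_T = \mathcal{O}^*\left(\sqrt{T\,\gamma_T}\right)$ ($\mathcal{O}^*$ hides logarithmic factors)
Hereby $\gamma_T = \max_{A \subseteq D,\ |A| \le T} I(f; y_A)$ is the maximal information gain due to sampling!
The slower $\gamma_T$ grows, the easier f is to learn
Key question: How quickly does $\gamma_T$ grow?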
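For a GP with Gaussian noise of variance σ², the information gain of a set A of sampled points has the closed form I(f; y_A) = ½ log det(I + σ⁻² K_A), so the growth of γ_T can be explored numerically. The sketch below greedily adds the most informative point from a finite grid, which approximates the maximal information gain rather than computing it exactly; the grid, bandwidths, noise variance and function names are assumptions for illustration.

```python
import numpy as np


def se_kernel(a, b, h):
    """Squared-exponential kernel K(x, x') = exp(-(x - x')^2 / h^2) for 1-D inputs."""
    diff = a[:, None] - b[None, :]
    return np.exp(-(diff ** 2) / h ** 2)


def information_gain(K_A, noise_var):
    """I(f; y_A) = 0.5 * log det(I + K_A / noise_var) for a GP with Gaussian noise."""
    _, logdet = np.linalg.slogdet(np.eye(len(K_A)) + K_A / noise_var)
    return 0.5 * logdet


def greedy_information_gain(D, T, h, noise_var=0.01):
    """Greedily pick T distinct points of D; return indices and gain after each pick."""
    chosen, gains = [], []
    for _ in range(T):
        def gain_with(i):
            pts = D[chosen + [i]]
            return information_gain(se_kernel(pts, pts, h), noise_var)

        candidates = [i for i in range(len(D)) if i not in chosen]
        best = max(candidates, key=gain_with)
        gains.append(gain_with(best))     # evaluate before adding the point
        chosen.append(best)
    return chosen, gains


D = np.linspace(0.0, 1.0, 30)
_, gains_smooth = greedy_information_gain(D, 10, h=0.3)   # "easy" function class
_, gains_rough = greedy_information_gain(D, 10, h=0.1)    # "hard" function class
```

Comparing `gains_smooth` with `gains_rough` shows the information gain levelling off sooner for the larger bandwidth, which is the sense in which smoother functions are easier to learn.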
Learnability and information gain
• Information gain exhibits diminishing returns (submodularity)
[Krause & Guestrin ’05]
• Our bounds depend on “rate” of diminishment
[Figure: two cases, "Little diminishing returns" versus "Returns diminish fast"]
Dealing with high dimensions
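Theorem: For various popular kernels (input dimension d), we have:
• Linear: $\gamma_T = \mathcal{O}(d \log T)$;
• Squared-exponential: $\gamma_T = \mathcal{O}\left((\log T)^{d+1}\right)$;
• Matérn with $\nu > 1$: $\gamma_T = \mathcal{O}\left(T^{d(d+1)/(2\nu + d(d+1))} \log T\right)$;
Smoothness of f helps battle curse of dimensionality!
Our bounds rely on submodularity of the information gain $I(f; y_A)$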
What if f is not from a GP?
• In practice, f may not be Gaussian
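Theorem: Let f lie in the RKHS of kernel K with $\|f\|_K^2 \le B$,
and let the noise be bounded almost surely by $\sigma$.
Choose $\beta_t = 2B + 300\,\gamma_t \log^3(t/\delta)$. Then with high probability,
$R_T = \mathcal{O}^*\left(\sqrt{T}\,(B\sqrt{\gamma_T} + \gamma_T)\right)$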
• Frees us from knowing the “true prior”
• Intuitively, the bound depends on the “complexity” of
the function through its RKHS norm
Experiments: UCB vs. heuristics
• Temperature data
• 46 sensors deployed at Intel Research, Berkeley
• Collected data for 5 days (1 sample/minute)
• Want to adaptively find highest temperature as quickly as
possible
• Traffic data
• Speed data from 357 sensors deployed along highway I-880
South
• Collected during 6am-11am, for one month
• Want to find most congested (lowest speed) area as quickly as
possible
Comparison: UCB vs. heuristics
GP-UCB compares favorably with existing heuristics
Assumptions on f
Linear? [Dani et al. ’07]: fast convergence, but strong assumption
Lipschitz-continuous (bounded slope) [Kleinberg ’08]: very flexible, but
Conclusions
• First theoretical guarantees and convergence rates
for GP optimization
• Both true prior and agnostic case covered
• Performance depends on “learnability”, captured by
maximal information gain
• Connects GP Bandit Optimization & Experimental Design!
• Performance on real data comparable to other heuristics

GAUSSIAN PRESENTATION.ppt

Editor's Notes

  1. Explanation of k-armed bandit!
  2. Repeat what f is – give an example!
  3. Floorplan looks funny (pixelated)
  4. Floorplan looks funny (pixelated)
  5. Add cartoon plot for \gamma_T; need axes, etc.