Real Time Learning
Ted Dunning
A talk about real-time learning, especially using
1.
Real-time Learning
©MapR Technologies - Confidential
2.
Contact:
– tdunning@maprtech.com
– @ted_dunning
Slides and such (available late tonight):
– http://slideshare.net/tdunning
Hash tags: #mapr #hivedata
3.
We have a product to sell … from a web-site
4.
What tag-line?
What picture?
Bogus Dog Food is the Best!
Now available in handy 1 ton bags!
Buy 5!
What call to action?
5.
The Challenge
Design decisions affect probability of success
– Cheesy web-sites don’t even sell cheese
The best designers do better when allowed to fail
– Exploration juices creativity
But failing is expensive
– If only because we could have succeeded
– But also because offending or disappointing customers is bad
6.
More Challenges
Too many designs
– 5 pictures
– 10 tag-lines
– 4 calls to action
– 3 back-ground colors
=> 5 x 10 x 4 x 3 = 600 designs
It gets worse quickly
– What about changes on the back-end?
– Search engine variants?
– Checkout process variants?
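The arithmetic on this slide can be checked by enumerating the combinations. This is a minimal Python sketch; only the counts come from the slide, and the option names are placeholders invented for illustration.

```python
from itertools import product

# Counts are from the slide; the option names are placeholders.
pictures = [f"picture_{i}" for i in range(5)]
tag_lines = [f"tag_line_{i}" for i in range(10)]
calls_to_action = [f"call_to_action_{i}" for i in range(4)]
background_colors = [f"color_{i}" for i in range(3)]

# Every design is one choice from each list.
designs = list(product(pictures, tag_lines, calls_to_action, background_colors))
print(len(designs))  # 600
```

Each extra design dimension multiplies the count again, which is why testing every design with equal traffic stops being practical so quickly.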
7.
Example – AB testing in real-time
I have 15 versions of my landing page
Each visitor is assigned to a version
– Which version?
A conversion or sale or whatever can happen
– How long to wait?
Some versions of the landing page are horrible
– Don’t want to give them traffic
8.
A Quick Diversion
You see a coin
– What is the probability of heads?
– Could it be larger or smaller than that?
I flip the coin and while it is in the air ask again
I catch the coin and ask again
I look at the coin (and you don’t) and ask again
Why does the answer change?
– And did it ever have a single value?
9.
A Philosophical Conclusion
Probability as expressed by humans is subjective and depends on information and experience
10.
I Dunno
11.
5 heads out of 10 throws
12.
2 heads out of 12 throws
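One common way to read the last three slides is as a belief about the probability of heads that sharpens as throws are observed. The sketch below assumes a uniform Beta(1, 1) prior for the "I Dunno" state, so that h heads in n throws gives a Beta(h + 1, n - h + 1) posterior. The slides do not state this model; it is an assumption, and the function names are illustrative.

```python
from math import sqrt

def beta_posterior(heads, throws, prior_a=1.0, prior_b=1.0):
    """Beta posterior parameters for P(heads), assuming a Beta prior."""
    tails = throws - heads
    return prior_a + heads, prior_b + tails

def beta_mean(a, b):
    return a / (a + b)

def beta_sd(a, b):
    return sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))

summaries = {}
# No data yet, then the two results quoted on the slides.
for heads, throws in [(0, 0), (5, 10), (2, 12)]:
    a, b = beta_posterior(heads, throws)
    summaries[(heads, throws)] = (beta_mean(a, b), beta_sd(a, b))
    print(f"{heads} heads / {throws} throws: Beta({a:g}, {b:g}), "
          f"mean {beta_mean(a, b):.3f}, sd {beta_sd(a, b):.3f}")
```

Under this assumption the belief starts wide and centred on 0.5, narrows around 0.5 after 5 heads in 10 throws, and shifts toward a low value after 2 heads in 12 throws.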
13.
So now you understand Bayesian probability
14.
Another Quick Diversion
Let’s play a shell game
This is a special shell game
It costs you nothing to play
The pea has constant probability of being under each shell (trust me)
How do you find the best shell?
How do you find it while maximizing the number of wins?
15.
Pause for short con-game
16.
Interim Thoughts
Can you identify winners or losers without trying them out?
Can you ever completely eliminate a shell with a bad streak?
Should you keep trying apparent losers?
17.
Pause for second con-game
18.
So now you understand multi-armed bandits
19.
Conclusions
Can you identify winners or losers without trying them out? No
Can you ever completely eliminate a shell with a bad streak? No
Should you keep trying apparent losers? Yes, but at a decreasing rate
20.
Is there an optimum strategy?
21.
Bayesian Bandit
Compute distributions based on data so far
Sample p1, p2 and p3 from these distributions
Pick shell i where i = argmax_i p_i
Lemma 1: The probability of picking shell i will match the probability it is the best shell
Lemma 2: This is as good as it gets
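The selection rule can be sketched in a few lines of Python. This assumes win/loss outcomes with a uniform prior, so each shell's distribution is a Beta; the counts are hypothetical, not data from the talk. Repeating the selection many times on fixed counts shows Lemma 1 at work: the pick frequencies estimate the probability that each shell is the best one.

```python
import random

def select(counts, rng):
    """Sample one plausible win rate per shell and pick the largest.

    counts[i] = (wins, losses) seen so far for shell i.
    """
    samples = [rng.betavariate(wins + 1, losses + 1) for wins, losses in counts]
    return max(range(len(samples)), key=samples.__getitem__)

rng = random.Random(0)
counts = [(2, 8), (5, 5), (7, 3)]  # hypothetical results for three shells

trials = 20000
picks = [0] * len(counts)
for _ in range(trials):
    picks[select(counts, rng)] += 1

# Fraction of selections that went to each shell.
pick_rates = [p / trials for p in picks]
print(pick_rates)
```

The shell with the best record gets most of the picks, but the apparent losers still get some, in proportion to how plausible it remains that they are actually the best.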
22.
And it works!
[Plot: regret (0 to 0.12) against n (0 to 1100) for ε-greedy with ε = 0.05 and for the Bayesian Bandit with Gamma-Normal]
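A comparison of this kind can be simulated roughly as follows. This sketch is not the experiment behind the plot: the plot uses a Gamma-Normal model, while the code below swaps in simpler win/loss arms with Beta distributions. The true win rates, the number of runs and the function names are all assumptions made for illustration.

```python
import random

def epsilon_greedy_select(counts, rng, epsilon=0.05):
    # With probability epsilon explore uniformly, otherwise exploit.
    if rng.random() < epsilon:
        return rng.randrange(len(counts))
    # Exploit the highest smoothed win rate, (wins + 1) / (trials + 2).
    rates = [(w + 1) / (w + l + 2) for w, l in counts]
    return max(range(len(rates)), key=rates.__getitem__)

def bayesian_bandit_select(counts, rng):
    # Sample a win rate from each arm's Beta distribution, pick the largest.
    samples = [rng.betavariate(w + 1, l + 1) for w, l in counts]
    return max(range(len(samples)), key=samples.__getitem__)

def average_regret(select, true_rates, steps, rng):
    """Mean per-step gap between the best arm's rate and the chosen arm's."""
    counts = [[0, 0] for _ in true_rates]  # [wins, losses] per arm
    best = max(true_rates)
    regret = 0.0
    for _ in range(steps):
        i = select(counts, rng)
        regret += best - true_rates[i]
        won = rng.random() < true_rates[i]
        counts[i][0 if won else 1] += 1
    return regret / steps

true_rates = [0.10, 0.15, 0.30]  # hypothetical arms
rng = random.Random(1)
runs, steps = 100, 1000
results = {}
for name, select in [("epsilon-greedy", epsilon_greedy_select),
                     ("bayesian bandit", bayesian_bandit_select)]:
    total = sum(average_regret(select, true_rates, steps, rng) for _ in range(runs))
    results[name] = total / runs
    print(name, round(results[name], 4))
```

The numbers depend on the assumed arms and the seed, so they should not be read as a reproduction of the curve on the slide.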
23.
Video Demo
24.
The Code
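Select an alternative
    # k is an n x 2 matrix of counts; judging by the rbeta arguments,
    # column 2 presumably counts wins and column 1 losses
    select = function(k) {
      n = dim(k)[1]
      p0 = rep(0, length.out=n)
      for (i in 1:n) {
        # sample a plausible win rate for alternative i
        p0[i] = rbeta(1, k[i,2]+1, k[i,1]+1)
      }
      return (which.max(p0))
    }
Select and learn
    # test(i) is not shown on the slide; it must return the column of k
    # to increment. The name "learn" is a label added here.
    learn = function(k, steps) {
      for (z in 1:steps) {
        i = select(k)
        j = test(i)
        k[i,j] = k[i,j]+1
      }
      return (k)
    }
But we already know how to count!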
25.
The Basic Idea
We can encode a distribution by sampling
Sampling allows unification of exploration and exploitation
Can be extended to more general response models
26.
The Original Problem
x2
x1
Bogus Dog Food is the Best!
Now available in handy 1 ton bags!
Buy 5!
x3
27.
Response Function
$p(\mathrm{win}) = w\left(\sum_i \theta_i x_i\right)$
[Plot: y (0 to 1) against x (-6 to 6)]
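As a sketch, the response function can be written out in Python. The slide does not name the function w; since the plotted curve runs from 0 to 1 over x from -6 to 6, the code assumes the logistic function. The parameter and feature values are hypothetical.

```python
import math

def logistic(t):
    """Logistic function, written to avoid overflow for large |t|."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def p_win(theta, x):
    """Probability of a win for feature vector x under parameters theta."""
    return logistic(sum(t_i * x_i for t_i, x_i in zip(theta, x)))

theta = [0.5, -1.0, 2.0]   # hypothetical learned weights
x = [1.0, 0.0, 1.0]        # hypothetical encoding of one design (x1, x2, x3)
print(p_win(theta, x))
```

With all weights at zero the model is indifferent and returns 0.5 for every design.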
28.
Generalized Banditry
Suppose we have an infinite number of bandits
– suppose they are each labeled by two real numbers x and y in [0,1]
– also that expected payoff is a parameterized function of x and y
  $E[z] = f(x, y \mid \theta)$
– now assume a distribution for θ that we can learn online
Selection works by sampling θ, then computing f
Learning works by propagating updates back to θ
– If f is linear, this is very easy
Don’t just have to have two labels, could have labels and context
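One way to make this concrete is to keep the distribution for θ as a weighted set of samples. The Python sketch below is an illustration under stated assumptions, not the method from the talk: f is taken to be a logistic function of θ0 + θ1·x + θ2·y, the continuum of bandits is cut down to a 5 x 5 grid of (x, y) labels, and the class name, prior and "true" parameters are all invented.

```python
import math
import random

def f(x, y, theta):
    """Assumed payoff model: logistic of a linear function of the labels."""
    t = theta[0] + theta[1] * x + theta[2] * y
    return 1.0 / (1.0 + math.exp(-t))

class ParticleBandit:
    """Distribution over theta encoded as weighted samples (particles)."""

    def __init__(self, n_particles, rng):
        self.rng = rng
        # Particles drawn from an assumed Normal(0, 2) prior per coordinate.
        self.particles = [[rng.gauss(0.0, 2.0) for _ in range(3)]
                          for _ in range(n_particles)]
        self.log_weights = [0.0] * n_particles

    def select(self, arms):
        # Selection: sample theta by weight, then pick the arm maximizing f.
        top = max(self.log_weights)
        weights = [math.exp(lw - top) for lw in self.log_weights]
        theta = self.rng.choices(self.particles, weights=weights, k=1)[0]
        return max(arms, key=lambda arm: f(arm[0], arm[1], theta))

    def learn(self, arm, z):
        # Learning: reweight each particle by the likelihood of outcome z.
        for k, theta in enumerate(self.particles):
            p = min(max(f(arm[0], arm[1], theta), 1e-12), 1.0 - 1e-12)
            self.log_weights[k] += math.log(p if z else 1.0 - p)

rng = random.Random(2)
true_theta = (-2.0, 3.0, -1.0)  # hypothetical, hidden from the learner
grid = [i / 4 for i in range(5)]
arms = [(x, y) for x in grid for y in grid]

bandit = ParticleBandit(300, rng)
chosen = []
for _ in range(1000):
    arm = bandit.select(arms)
    z = rng.random() < f(arm[0], arm[1], true_theta)
    bandit.learn(arm, z)
    chosen.append(arm)

recent = chosen[-200:]
print(max(set(recent), key=recent.count))  # most frequent recent choice
```

A fixed particle set degenerates as the weights concentrate on a few samples, so a practical version would resample or use a closed-form update, which is what makes the linear case mentioned on the slide attractive.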
29.
Context Variables
x2
x1
Bogus Dog Food is the Best!
Now available in handy 1 ton bags!
Buy 5!
x3
user.geo
env.time
env.day_of_week
env.weekend
30.
Caveats
Original Bayesian Bandit only requires real-time
Generalized Bandit may require access to long history for learning
– Pseudo online learning may be easier than true online
Bandit variables can include content, time of day, day of week
Context variables can include user id, user features
Bandit × context variables provide the real power
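The last point can be illustrated by building interaction features, so that a model can learn, for example, that one picture works better on weekends. This is a minimal Python sketch; the function name and the feature values are hypothetical, and only the `user.geo` and `env.weekend` names echo the context variables shown earlier.

```python
def cross_features(bandit_vars, context_vars):
    """Return the plain features plus every bandit x context product."""
    features = {"bias": 1.0}
    features.update(bandit_vars)
    features.update(context_vars)
    for b_name, b_value in bandit_vars.items():
        for c_name, c_value in context_vars.items():
            features[f"{b_name} x {c_name}"] = b_value * c_value
    return features

# Hypothetical one-hot style encodings of one design and one visit.
bandit_vars = {"picture=puppy": 1.0, "tag_line=best": 1.0}
context_vars = {"user.geo=US": 1.0, "env.weekend": 0.0}

features = cross_features(bandit_vars, context_vars)
for name, value in features.items():
    print(name, value)
```

Without the crossed terms a linear model can only learn that a design is good or bad on average; with them it can learn for whom and when.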
31.
You can do this yourself!
32.
Thank You
33.
Contact:
– tdunning@maprtech.com
– @ted_dunning
Slides and such (available late tonight):
– http://slideshare.net/tdunning
Hash tags: #mapr #hivedata