SlideShare a Scribd company logo
1 of 33
Real-time Learning

©MapR Technologies - Confidential       1
     Contact:
       –   tdunning@maprtech.com
       –   @ted_dunning


     Slides and such (available late tonight):
       –   http://slideshare.net/tdunning


     Hash tags: #mapr #hivedata




©MapR Technologies - Confidential           2
We have a product
                             to sell …
                                    from a web-site


©MapR Technologies - Confidential        3
What tag-
                               What                                 line?
                              picture?
                                                  Bogus Dog Food is the Best!
                                                  Now available in handy 1 ton
                                                  bags!



                                         Buy 5!




                                                    What call to
                                                     action?




©MapR Technologies - Confidential                           4
The Challenge

     Design decisions affect probability of success
       –   Cheesy web-sites don’t even sell cheese



     The best designers do better when allowed to fail
       –   Exploration juices creativity




     But failing is expensive
       –   If only because we could have succeeded
       –   But also because offending or disappointing customers is bad


©MapR Technologies - Confidential            5
More Challenges

     Too many designs
       – 5 pictures
       – 10 tag-lines
       – 4 calls to action
       – 3 back-ground colors
       => 5 x 10 x 4 x 3 = 600 designs


     It gets worse quickly
       –   What about changes on the back-end?
       –   Search engine variants?
       –   Checkout process variants?



©MapR Technologies - Confidential         6
Example – AB testing in real-time

     I have 15 versions of my landing page
     Each visitor is assigned to a version
       –   Which version?
     A conversion or sale or whatever can happen
       –   How long to wait?
     Some versions of the landing page are horrible
       –   Don’t want to give them traffic




©MapR Technologies - Confidential            7
A Quick Diversion

     You see a coin
       –   What is the probability of heads?
       –   Could it be larger or smaller than that?
     I flip the coin and while it is in the air ask again
     I catch the coin and ask again
     I look at the coin (and you don’t) and ask again
     Why does the answer change?
       –   And did it ever have a single value?




©MapR Technologies - Confidential                 8
A Philosophical Conclusion

     Probability as expressed by humans is subjective and depends on
      information and experience




©MapR Technologies - Confidential    9
I Dunno




©MapR Technologies - Confidential   10
5 heads out of 10 throws




©MapR Technologies - Confidential   11
2 heads out of 12 throws




©MapR Technologies - Confidential   12
So now you understand
                   Bayesian probability



©MapR Technologies - Confidential   13
Another Quick Diversion

     Let’s play a shell game
     This is a special shell game
     It costs you nothing to play
     The pea has constant probability of being under each shell
           (trust me)
     How do you find the best shell?
     How do you find it while maximizing the number of wins?




©MapR Technologies - Confidential       14
Pause for short
                                    con-game



©MapR Technologies - Confidential          15
Interim Thoughts

     Can you identify winners or losers without trying them out?


     Can you ever completely eliminate a shell with a bad streak?


     Should you keep trying apparent losers?




©MapR Technologies - Confidential    16
Pause for second
                                    con-game



©MapR Technologies - Confidential          17
So now you understand
                   multi-armed bandits



©MapR Technologies - Confidential   18
Conclusions

     Can you identify winners or losers without trying them out?
       No


     Can you ever completely eliminate a shell with a bad streak?
       No


     Should you keep trying apparent losers?
       Yes, but at a decreasing rate




©MapR Technologies - Confidential      19
Is there an optimum
                   strategy?



©MapR Technologies - Confidential   20
Bayesian Bandit

     Compute distributions based on data so far
     Sample p1, p2 and p2 from these distributions
     Pick shell i where i = argmaxi pi


     Lemma 1: The probability of picking shell i will match the
      probability it is the best shell


     Lemma 2: This is as good as it gets




©MapR Technologies - Confidential         21
And it works!

                                    0.12


                                    0.11


                                     0.1


                                    0.09


                                    0.08


                                    0.07
                           regret




                                    0.06
                                                                 ε- greedy, ε = 0.05
                                    0.05


                                    0.04                                               Bayesian Bandit with Gam m a- Norm al
                                    0.03


                                    0.02


                                    0.01


                                      0
                                           0   100   200   300       400    500        600    700    800    900    1000   1100

                                                                                   n




©MapR Technologies - Confidential                                                 22
Video Demo




©MapR Technologies - Confidential       23
The Code

     Select an alternative
                   n = dim(k)[1]
                   p0 = rep(0, length.out=n)
                   for (i in 1:n) {
                     p0[i] = rbeta(1, k[i,2]+1, k[i,1]+1)
                   }
                   return (which(p0 == max(p0)))


     Select and learn
                  for (z in 1:steps) {
                     i = select(k)
                     j = test(i)
                     k[i,j] = k[i,j]+1
                   }
                   return (k)




     But we already know how to count!


©MapR Technologies - Confidential                           24
The Basic Idea

     We can encode a distribution by sampling
     Sampling allows unification of exploration and exploitation


     Can be extended to more general response models




©MapR Technologies - Confidential     25
The Original Problem

                                                                      x2
                                    x1

                                                  Bogus Dog Food is the Best!
                                                  Now available in handy 1 ton
                                                  bags!



                                         Buy 5!




                                                        x3




©MapR Technologies - Confidential                            26
Response Function


                                                        æ       ö
                                             p(win) = w çåqi xi ÷
                                                        è i     ø
                                                        1




                                                       0.5
                                    y




                                                        0
                                        -6   -4   -2         0   2   4   6
                                                             x




©MapR Technologies - Confidential                      27
Generalized Banditry

     Suppose we have an infinite number of bandits
       –   suppose they are each labeled by two real numbers x and y in [0,1]
       –   also that expected payoff is a parameterized function of x and y

                                    E [ z ] = f (x, y | q )
       –   now assume a distribution for θ that we can learn online
     Selection works by sampling θ, then computing f
     Learning works by propagating updates back to θ
       –   If f is linear, this is very easy
     Don’t just have to have two labels, could have labels and context



©MapR Technologies - Confidential                    28
Context Variables

                                                                         x2
                                    x1

                                                     Bogus Dog Food is the Best!
                                                     Now available in handy 1 ton
                                                     bags!



                                            Buy 5!




                                                           x3


               user.geo                  env.time          env.day_of_week          env.weekend


©MapR Technologies - Confidential                               29
Caveats

     Original Bayesian Bandit only requires real-time


     Generalized Bandit may require access to long history for learning
       –   Pseudo online learning may be easier than true online


     Bandit variables can include content, time of day, day of week


     Context variables can include user id, user features


     Bandit × context variables provide the real power


©MapR Technologies - Confidential            30
You can do this
                                       yourself!



©MapR Technologies - Confidential         31
Thank You




©MapR Technologies - Confidential   32
     Contact:
       –   tdunning@maprtech.com
       –   @ted_dunning


     Slides and such (available late tonight):
       –   http://slideshare.net/tdunning


     Hash tags: #mapr #hivedata




©MapR Technologies - Confidential           33

More Related Content

Similar to Real Time Learning

Graphlab dunning-clustering
Graphlab dunning-clusteringGraphlab dunning-clustering
Graphlab dunning-clusteringTed Dunning
 
Real-time and Long-time Together
Real-time and Long-time TogetherReal-time and Long-time Together
Real-time and Long-time TogetherMapR Technologies
 
Chicago finance-big-data
Chicago finance-big-dataChicago finance-big-data
Chicago finance-big-dataTed Dunning
 
Boston hug-2012-07
Boston hug-2012-07Boston hug-2012-07
Boston hug-2012-07Ted Dunning
 
Big data, why now?
Big data, why now?Big data, why now?
Big data, why now?Ted Dunning
 
CMU Lecture on Hadoop Performance
CMU Lecture on Hadoop PerformanceCMU Lecture on Hadoop Performance
CMU Lecture on Hadoop PerformanceMapR Technologies
 
New Directions for Mahout
New Directions for MahoutNew Directions for Mahout
New Directions for MahoutTed Dunning
 
London data science
London data scienceLondon data science
London data scienceTed Dunning
 
How to find what you didn't know to look for, oractical anomaly detection
How to find what you didn't know to look for, oractical anomaly detectionHow to find what you didn't know to look for, oractical anomaly detection
How to find what you didn't know to look for, oractical anomaly detectionDataWorks Summit
 
Super-Fast Clustering Report in MapR
Super-Fast Clustering Report in MapRSuper-Fast Clustering Report in MapR
Super-Fast Clustering Report in MapRData Science London
 
Microservices: The danger of overhype and importance of checklists
Microservices: The danger of overhype and importance of checklistsMicroservices: The danger of overhype and importance of checklists
Microservices: The danger of overhype and importance of checklistsReactivesummit
 
Machine Learning - What, Where and How
Machine Learning - What, Where and HowMachine Learning - What, Where and How
Machine Learning - What, Where and Hownarinderk
 

Similar to Real Time Learning (13)

Strata New York 2012
Strata New York 2012Strata New York 2012
Strata New York 2012
 
Graphlab dunning-clustering
Graphlab dunning-clusteringGraphlab dunning-clustering
Graphlab dunning-clustering
 
Real-time and Long-time Together
Real-time and Long-time TogetherReal-time and Long-time Together
Real-time and Long-time Together
 
Chicago finance-big-data
Chicago finance-big-dataChicago finance-big-data
Chicago finance-big-data
 
Boston hug-2012-07
Boston hug-2012-07Boston hug-2012-07
Boston hug-2012-07
 
Big data, why now?
Big data, why now?Big data, why now?
Big data, why now?
 
CMU Lecture on Hadoop Performance
CMU Lecture on Hadoop PerformanceCMU Lecture on Hadoop Performance
CMU Lecture on Hadoop Performance
 
New Directions for Mahout
New Directions for MahoutNew Directions for Mahout
New Directions for Mahout
 
London data science
London data scienceLondon data science
London data science
 
How to find what you didn't know to look for, oractical anomaly detection
How to find what you didn't know to look for, oractical anomaly detectionHow to find what you didn't know to look for, oractical anomaly detection
How to find what you didn't know to look for, oractical anomaly detection
 
Super-Fast Clustering Report in MapR
Super-Fast Clustering Report in MapRSuper-Fast Clustering Report in MapR
Super-Fast Clustering Report in MapR
 
Microservices: The danger of overhype and importance of checklists
Microservices: The danger of overhype and importance of checklistsMicroservices: The danger of overhype and importance of checklists
Microservices: The danger of overhype and importance of checklists
 
Machine Learning - What, Where and How
Machine Learning - What, Where and HowMachine Learning - What, Where and How
Machine Learning - What, Where and How
 

More from Ted Dunning

Dunning - SIGMOD - Data Economy.pptx
Dunning - SIGMOD - Data Economy.pptxDunning - SIGMOD - Data Economy.pptx
Dunning - SIGMOD - Data Economy.pptxTed Dunning
 
How to Get Going with Kubernetes
How to Get Going with KubernetesHow to Get Going with Kubernetes
How to Get Going with KubernetesTed Dunning
 
Progress for big data in Kubernetes
Progress for big data in KubernetesProgress for big data in Kubernetes
Progress for big data in KubernetesTed Dunning
 
Anomaly Detection: How to find what you didn’t know to look for
Anomaly Detection: How to find what you didn’t know to look forAnomaly Detection: How to find what you didn’t know to look for
Anomaly Detection: How to find what you didn’t know to look forTed Dunning
 
Streaming Architecture including Rendezvous for Machine Learning
Streaming Architecture including Rendezvous for Machine LearningStreaming Architecture including Rendezvous for Machine Learning
Streaming Architecture including Rendezvous for Machine LearningTed Dunning
 
Machine Learning Logistics
Machine Learning LogisticsMachine Learning Logistics
Machine Learning LogisticsTed Dunning
 
Tensor Abuse - how to reuse machine learning frameworks
Tensor Abuse - how to reuse machine learning frameworksTensor Abuse - how to reuse machine learning frameworks
Tensor Abuse - how to reuse machine learning frameworksTed Dunning
 
Machine Learning logistics
Machine Learning logisticsMachine Learning logistics
Machine Learning logisticsTed Dunning
 
Finding Changes in Real Data
Finding Changes in Real DataFinding Changes in Real Data
Finding Changes in Real DataTed Dunning
 
Where is Data Going? - RMDC Keynote
Where is Data Going? - RMDC KeynoteWhere is Data Going? - RMDC Keynote
Where is Data Going? - RMDC KeynoteTed Dunning
 
Real time-hadoop
Real time-hadoopReal time-hadoop
Real time-hadoopTed Dunning
 
Cheap learning-dunning-9-18-2015
Cheap learning-dunning-9-18-2015Cheap learning-dunning-9-18-2015
Cheap learning-dunning-9-18-2015Ted Dunning
 
Sharing Sensitive Data Securely
Sharing Sensitive Data SecurelySharing Sensitive Data Securely
Sharing Sensitive Data SecurelyTed Dunning
 
Real-time Puppies and Ponies - Evolving Indicator Recommendations in Real-time
Real-time Puppies and Ponies - Evolving Indicator Recommendations in Real-timeReal-time Puppies and Ponies - Evolving Indicator Recommendations in Real-time
Real-time Puppies and Ponies - Evolving Indicator Recommendations in Real-timeTed Dunning
 
How the Internet of Things is Turning the Internet Upside Down
How the Internet of Things is Turning the Internet Upside DownHow the Internet of Things is Turning the Internet Upside Down
How the Internet of Things is Turning the Internet Upside DownTed Dunning
 
Apache Kylin - OLAP Cubes for SQL on Hadoop
Apache Kylin - OLAP Cubes for SQL on HadoopApache Kylin - OLAP Cubes for SQL on Hadoop
Apache Kylin - OLAP Cubes for SQL on HadoopTed Dunning
 
Dunning time-series-2015
Dunning time-series-2015Dunning time-series-2015
Dunning time-series-2015Ted Dunning
 
Doing-the-impossible
Doing-the-impossibleDoing-the-impossible
Doing-the-impossibleTed Dunning
 
Anomaly Detection - New York Machine Learning
Anomaly Detection - New York Machine LearningAnomaly Detection - New York Machine Learning
Anomaly Detection - New York Machine LearningTed Dunning
 

More from Ted Dunning (20)

Dunning - SIGMOD - Data Economy.pptx
Dunning - SIGMOD - Data Economy.pptxDunning - SIGMOD - Data Economy.pptx
Dunning - SIGMOD - Data Economy.pptx
 
How to Get Going with Kubernetes
How to Get Going with KubernetesHow to Get Going with Kubernetes
How to Get Going with Kubernetes
 
Progress for big data in Kubernetes
Progress for big data in KubernetesProgress for big data in Kubernetes
Progress for big data in Kubernetes
 
Anomaly Detection: How to find what you didn’t know to look for
Anomaly Detection: How to find what you didn’t know to look forAnomaly Detection: How to find what you didn’t know to look for
Anomaly Detection: How to find what you didn’t know to look for
 
Streaming Architecture including Rendezvous for Machine Learning
Streaming Architecture including Rendezvous for Machine LearningStreaming Architecture including Rendezvous for Machine Learning
Streaming Architecture including Rendezvous for Machine Learning
 
Machine Learning Logistics
Machine Learning LogisticsMachine Learning Logistics
Machine Learning Logistics
 
Tensor Abuse - how to reuse machine learning frameworks
Tensor Abuse - how to reuse machine learning frameworksTensor Abuse - how to reuse machine learning frameworks
Tensor Abuse - how to reuse machine learning frameworks
 
Machine Learning logistics
Machine Learning logisticsMachine Learning logistics
Machine Learning logistics
 
T digest-update
T digest-updateT digest-update
T digest-update
 
Finding Changes in Real Data
Finding Changes in Real DataFinding Changes in Real Data
Finding Changes in Real Data
 
Where is Data Going? - RMDC Keynote
Where is Data Going? - RMDC KeynoteWhere is Data Going? - RMDC Keynote
Where is Data Going? - RMDC Keynote
 
Real time-hadoop
Real time-hadoopReal time-hadoop
Real time-hadoop
 
Cheap learning-dunning-9-18-2015
Cheap learning-dunning-9-18-2015Cheap learning-dunning-9-18-2015
Cheap learning-dunning-9-18-2015
 
Sharing Sensitive Data Securely
Sharing Sensitive Data SecurelySharing Sensitive Data Securely
Sharing Sensitive Data Securely
 
Real-time Puppies and Ponies - Evolving Indicator Recommendations in Real-time
Real-time Puppies and Ponies - Evolving Indicator Recommendations in Real-timeReal-time Puppies and Ponies - Evolving Indicator Recommendations in Real-time
Real-time Puppies and Ponies - Evolving Indicator Recommendations in Real-time
 
How the Internet of Things is Turning the Internet Upside Down
How the Internet of Things is Turning the Internet Upside DownHow the Internet of Things is Turning the Internet Upside Down
How the Internet of Things is Turning the Internet Upside Down
 
Apache Kylin - OLAP Cubes for SQL on Hadoop
Apache Kylin - OLAP Cubes for SQL on HadoopApache Kylin - OLAP Cubes for SQL on Hadoop
Apache Kylin - OLAP Cubes for SQL on Hadoop
 
Dunning time-series-2015
Dunning time-series-2015Dunning time-series-2015
Dunning time-series-2015
 
Doing-the-impossible
Doing-the-impossibleDoing-the-impossible
Doing-the-impossible
 
Anomaly Detection - New York Machine Learning
Anomaly Detection - New York Machine LearningAnomaly Detection - New York Machine Learning
Anomaly Detection - New York Machine Learning
 

Real Time Learning

  • 2. Contact: – tdunning@maprtech.com – @ted_dunning  Slides and such (available late tonight): – http://slideshare.net/tdunning  Hash tags: #mapr #hivedata ©MapR Technologies - Confidential 2
  • 3. We have a product to sell … from a web-site ©MapR Technologies - Confidential 3
  • 4. What tag- What line? picture? Bogus Dog Food is the Best! Now available in handy 1 ton bags! Buy 5! What call to action? ©MapR Technologies - Confidential 4
  • 5. The Challenge  Design decisions affect probability of success – Cheesy web-sites don’t even sell cheese  The best designers do better when allowed to fail – Exploration juices creativity  But failing is expensive – If only because we could have succeeded – But also because offending or disappointing customers is bad ©MapR Technologies - Confidential 5
  • 6. More Challenges  Too many designs – 5 pictures – 10 tag-lines – 4 calls to action – 3 back-ground colors => 5 x 10 x 4 x 3 = 600 designs  It gets worse quickly – What about changes on the back-end? – Search engine variants? – Checkout process variants? ©MapR Technologies - Confidential 6
  • 7. Example – AB testing in real-time  I have 15 versions of my landing page  Each visitor is assigned to a version – Which version?  A conversion or sale or whatever can happen – How long to wait?  Some versions of the landing page are horrible – Don’t want to give them traffic ©MapR Technologies - Confidential 7
  • 8. A Quick Diversion  You see a coin – What is the probability of heads? – Could it be larger or smaller than that?  I flip the coin and while it is in the air ask again  I catch the coin and ask again  I look at the coin (and you don’t) and ask again  Why does the answer change? – And did it ever have a single value? ©MapR Technologies - Confidential 8
  • 9. A Philosophical Conclusion  Probability as expressed by humans is subjective and depends on information and experience ©MapR Technologies - Confidential 9
  • 10. I Dunno ©MapR Technologies - Confidential 10
  • 11. 5 heads out of 10 throws ©MapR Technologies - Confidential 11
  • 12. 2 heads out of 12 throws ©MapR Technologies - Confidential 12
  • 13. So now you understand Bayesian probability ©MapR Technologies - Confidential 13
  • 14. Another Quick Diversion  Let’s play a shell game  This is a special shell game  It costs you nothing to play  The pea has constant probability of being under each shell (trust me)  How do you find the best shell?  How do you find it while maximizing the number of wins? ©MapR Technologies - Confidential 14
  • 15. Pause for short con-game ©MapR Technologies - Confidential 15
  • 16. Interim Thoughts  Can you identify winners or losers without trying them out?  Can you ever completely eliminate a shell with a bad streak?  Should you keep trying apparent losers? ©MapR Technologies - Confidential 16
  • 17. Pause for second con-game ©MapR Technologies - Confidential 17
  • 18. So now you understand multi-armed bandits ©MapR Technologies - Confidential 18
  • 19. Conclusions  Can you identify winners or losers without trying them out? No  Can you ever completely eliminate a shell with a bad streak? No  Should you keep trying apparent losers? Yes, but at a decreasing rate ©MapR Technologies - Confidential 19
  • 20. Is there an optimum strategy? ©MapR Technologies - Confidential 20
  • 21. Bayesian Bandit  Compute distributions based on data so far  Sample p1, p2 and p2 from these distributions  Pick shell i where i = argmaxi pi  Lemma 1: The probability of picking shell i will match the probability it is the best shell  Lemma 2: This is as good as it gets ©MapR Technologies - Confidential 21
  • 22. And it works! 0.12 0.11 0.1 0.09 0.08 0.07 regret 0.06 ε- greedy, ε = 0.05 0.05 0.04 Bayesian Bandit with Gam m a- Norm al 0.03 0.02 0.01 0 0 100 200 300 400 500 600 700 800 900 1000 1100 n ©MapR Technologies - Confidential 22
  • 23. Video Demo ©MapR Technologies - Confidential 23
  • 24. The Code  Select an alternative n = dim(k)[1] p0 = rep(0, length.out=n) for (i in 1:n) { p0[i] = rbeta(1, k[i,2]+1, k[i,1]+1) } return (which(p0 == max(p0)))  Select and learn for (z in 1:steps) { i = select(k) j = test(i) k[i,j] = k[i,j]+1 } return (k)  But we already know how to count! ©MapR Technologies - Confidential 24
  • 25. The Basic Idea  We can encode a distribution by sampling  Sampling allows unification of exploration and exploitation  Can be extended to more general response models ©MapR Technologies - Confidential 25
  • 26. The Original Problem x2 x1 Bogus Dog Food is the Best! Now available in handy 1 ton bags! Buy 5! x3 ©MapR Technologies - Confidential 26
  • 27. Response Function æ ö p(win) = w çåqi xi ÷ è i ø 1 0.5 y 0 -6 -4 -2 0 2 4 6 x ©MapR Technologies - Confidential 27
  • 28. Generalized Banditry  Suppose we have an infinite number of bandits – suppose they are each labeled by two real numbers x and y in [0,1] – also that expected payoff is a parameterized function of x and y E [ z ] = f (x, y | q ) – now assume a distribution for θ that we can learn online  Selection works by sampling θ, then computing f  Learning works by propagating updates back to θ – If f is linear, this is very easy  Don’t just have to have two labels, could have labels and context ©MapR Technologies - Confidential 28
  • 29. Context Variables x2 x1 Bogus Dog Food is the Best! Now available in handy 1 ton bags! Buy 5! x3 user.geo env.time env.day_of_week env.weekend ©MapR Technologies - Confidential 29
  • 30. Caveats  Original Bayesian Bandit only requires real-time  Generalized Bandit may require access to long history for learning – Pseudo online learning may be easier than true online  Bandit variables can include content, time of day, day of week  Context variables can include user id, user features  Bandit × context variables provide the real power ©MapR Technologies - Confidential 30
  • 31. You can do this yourself! ©MapR Technologies - Confidential 31
  • 32. Thank You ©MapR Technologies - Confidential 32
  • 33. Contact: – tdunning@maprtech.com – @ted_dunning  Slides and such (available late tonight): – http://slideshare.net/tdunning  Hash tags: #mapr #hivedata ©MapR Technologies - Confidential 33