Strata new-york-2012

This set of slides describes several on-line learning algorithms which, taken together, can provide significant benefit to real-time applications.

Published in: Technology

  1. Online Learning: Bayesian bandits and more ©MapR Technologies - Confidential
  2. whoami – Ted Dunning
      • Ted Dunning
        – tdunning@maprtech.com
        – tdunning@apache.org
        – @ted_dunning
      • We’re hiring at MapR
      • For slides and other info
        – http://www.slideshare.net/tdunning
  3. Online, Scalable, Incremental
  4. Scalability and Learning
      • What does scalable mean?
      • What are inherent characteristics of scalable learning?
      • What are the logical implications?
  5. Scalable ≈ On-line (if you squint just right)
  6. unit of work ≈ unit of time
  7. [Diagram labels: Infinite Data Stream, Learning, State]
  8. Pick One
  9. (no text)
  10. (no text)
  11. Now pick again
  12. A Quick Diversion
      • You see a coin
        – What is the probability of heads?
        – Could it be larger or smaller than that?
      • I flip the coin and while it is in the air ask again
      • I catch the coin and ask again
      • I look at the coin (and you don’t) and ask again
      • Why does the answer change?
        – And did it ever have a single value?
  13. Which One to Play?
      • One may be better than the other
      • The better coin pays off at some rate
      • Playing the other will pay off at a lesser rate
        – Playing the lesser coin has “opportunity cost”
      • But how do we know which is which?
        – Explore versus Exploit!
  14. A First Conclusion
      • Probability as expressed by humans is subjective and depends on information and experience
  15. A Second Conclusion
      • A single number is a bad way to express uncertain knowledge
      • A distribution of values might be better
  16. I Dunno
  17. 5 and 5
  18. 2 and 10
  19. The Cynic Among Us
  20. Demo
  21. An Example
  22. An Example
  23. The Cluster Proximity Features
      • Every point can be described by the nearest cluster
        – 4.3 bits per point in this case
        – Significant error that can be decreased (to a point) by increasing the number of clusters
      • Or by the proximity to the 2 nearest clusters (2 × 4.3 bits + 1 sign bit + 2 proximities)
        – Error is negligible
        – Unwinds the data into a simple representation
  24. Diagonalized Cluster Proximity
  25. Lots of Clusters Are Fine
  26. Surrogate Method
      • Start with sloppy clustering into κ = k log n clusters
      • Use these clusters as a weighted surrogate for the data
      • Cluster surrogate data using ball k-means
      • Results are provably high quality for highly clusterable data
      • Sloppy clustering can be done on-line
      • Surrogate can be kept in memory
      • Ball k-means pass can be done at any time
  27. Algorithm Costs
      • O(k d log n) per point for Lloyd’s algorithm … not so good for k = 2000, n = 10^8
      • Surrogate methods … O(d log κ) = O(d (log k + log log n)) per point
      • This is a big deal:
        – k d log n = 2000 × 10 × 26 ≈ 500,000
        – log k + log log n = 11 + 5 = 16
        – 30,000 times faster makes the grade as a bona fide big deal
  28. 30,000 times faster sounds good
  29. 30,000 times faster sounds good, but that isn’t the big news
  30. 30,000 times faster sounds good, but that isn’t the big news: these algorithms do on-line clustering
  31. Parallel Speedup? [Plot: time per point (μs) versus number of threads; series labelled Non-threaded, Threaded version, and Perfect Scaling]
  32. What about deployment?
  33. [Diagram labels: Infinite Data Stream, Learning, State]
  34. [Diagram labels: Data Split, Mapper, State]
  35. [Diagram labels: Data Split, Mapper (×3), State] Need shared memory!
  36. whoami – Ted Dunning
      • We’re hiring at MapR
      • Ted Dunning
        – tdunning@maprtech.com
        – tdunning@apache.org
        – @ted_dunning
      • For slides and other info
        – http://www.slideshare.net/tdunning
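
The two-coin question on the "Which One to Play?" slide, together with the suggestion that a distribution expresses uncertain knowledge better than a single number, can be illustrated with a small Bayesian bandit. What follows is only a sketch under stated assumptions: the slides do not name a distribution or a sampling rule, so the Beta posterior, the uniform prior, the payoff rates, and the function name `play_bayesian_bandit` are all choices made here for illustration.

```python
import random


def play_bayesian_bandit(true_rates, n_plays, seed=0):
    """Play coins by sampling from a Beta posterior per coin (assumed model).

    true_rates are hypothetical payoff probabilities, unknown to the player.
    Returns the wins and losses observed for each coin.
    """
    rng = random.Random(seed)
    wins = [0] * len(true_rates)
    losses = [0] * len(true_rates)
    for _ in range(n_plays):
        # Draw one plausible payoff rate per coin from Beta(wins + 1, losses + 1).
        # A coin with few plays has a wide distribution, so it still gets explored.
        draws = [rng.betavariate(w + 1, l + 1) for w, l in zip(wins, losses)]
        coin = max(range(len(draws)), key=draws.__getitem__)
        # Play the coin whose draw was largest and record the outcome.
        if rng.random() < true_rates[coin]:
            wins[coin] += 1
        else:
            losses[coin] += 1
    return wins, losses


wins, losses = play_bayesian_bandit([0.3, 0.6], n_plays=2000)
plays = [w + l for w, l in zip(wins, losses)]
print("plays per coin:", plays)
```

Exploration and exploitation are not separate phases in this sketch. The lesser coin is played only as long as its distribution still overlaps the better one, which limits the opportunity cost mentioned on the slide.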
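
The steps on the "Surrogate Method" slide can be sketched in a few lines. This is a rough illustration, not the algorithm from the talk: the one-pass step here uses a fixed distance threshold instead of targeting κ = k log n clusters, and the final pass is plain weighted Lloyd iteration rather than ball k-means. The function names, the threshold, and the synthetic data are all hypothetical.

```python
import math
import random


def sloppy_cluster(points, threshold):
    """One on-line pass: fold each point into its nearest centroid or start a new one."""
    centroids = []  # each entry is [weight, coordinates]
    for p in points:
        best, best_d = None, float("inf")
        for c in centroids:
            d = math.dist(p, c[1])
            if d < best_d:
                best, best_d = c, d
        if best is None or best_d > threshold:
            centroids.append([1, list(p)])
        else:
            best[0] += 1
            # Incremental mean update keeps the centroid at the mean of its points.
            best[1] = [m + (x - m) / best[0] for m, x in zip(best[1], p)]
    return centroids


def farthest_first(surrogate, k):
    """Deterministic seeding: heaviest centroid first, then the farthest remaining."""
    centers = [list(max(surrogate, key=lambda c: c[0])[1])]
    while len(centers) < k:
        nxt = max(surrogate, key=lambda c: min(math.dist(c[1], m) for m in centers))
        centers.append(list(nxt[1]))
    return centers


def weighted_lloyd(surrogate, k, iterations=10):
    """Cluster the weighted surrogate (stand-in for the ball k-means pass)."""
    centers = farthest_first(surrogate, k)
    dim = len(centers[0])
    for _ in range(iterations):
        sums = [[0.0] * dim for _ in range(k)]
        weights = [0.0] * k
        for w, c in surrogate:
            j = min(range(k), key=lambda i: math.dist(c, centers[i]))
            weights[j] += w
            sums[j] = [s + w * x for s, x in zip(sums[j], c)]
        for j in range(k):
            if weights[j] > 0:
                centers[j] = [s / weights[j] for s in sums[j]]
    return centers


# Synthetic stream: two well-separated blobs, arriving in shuffled order.
rng = random.Random(42)
points = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
points += [(rng.gauss(10, 1), rng.gauss(10, 1)) for _ in range(200)]
rng.shuffle(points)

surrogate = sloppy_cluster(points, threshold=2.0)
centers = sorted(weighted_lloyd(surrogate, k=2))
print(len(surrogate), "surrogate centroids;", "final centers:", centers)
```

Only the surrogate is kept in memory, so the final clustering pass can run at any time without revisiting the original stream.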
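
The arithmetic on the "Algorithm Costs" slide can be checked directly. The sketch below assumes base-2 logarithms and d = 10, which is what the slide's figures (26, 11, and 5) imply. It follows the slide in comparing k d log n against log k + log log n, so the second quantity is written without the factor d that appears in O(d log κ).

```python
import math

# Values taken from the slide; d = 10 is inferred from "2000 x 10 x 26".
k, d, n = 2000, 10, 10**8

log_n = math.log2(n)                 # about 26.6
lloyd_cost = k * d * log_n           # k d log n
kappa = k * log_n                    # kappa = k log n surrogate clusters
surrogate_log = math.log2(kappa)     # log kappa = log k + log log n
ratio = lloyd_cost / surrogate_log   # the slide's "times faster" comparison

print(f"k d log n           = {lloyd_cost:,.0f}")
print(f"log k + log log n   = {surrogate_log:.1f}")
print(f"ratio               = {ratio:,.0f}")
```

Carrying the factor d through the surrogate cost as well would divide the ratio by d.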