Developing in R - the contextual Multi-Armed Bandit edition

Attached are the slides of my presentation on how to create R packages, illustrated with lessons learned while developing "contextual", a package that makes it easy to simulate and analyze contextual multi-armed bandit algorithms.

Code: https://github.com/Nth-iteration-labs/contextual

Published in: Technology

  1. R PACKAGE DEVELOPMENT
  2. WHAT IS THE R LANGUAGE? • Dominant in statistics research. • Interpreted language: no need to compile before running. • At its core an imperative language. Also supports functional programming and object-oriented programming.
  3. R HELLO WORLD
  4. SO… WHICH PARADIGM TO USE IN R? • Imperative: exploring data • Functional programming: data analysis • Object-oriented programming: building tools
  5. ALSO, R IS REAL GOOD WITH VECTORS (and matrices) • Vectorized operations can be 10 to 100 times faster than element-by-element loops
  6. CLASS SYSTEMS • S3: minimal • S4: very verbose • R5 (reference classes): slow • C++: fast, but not platform independent and needs boilerplate • New: R6 (default at Microsoft)
  7. MY PREFERENCE: R6 • New kid on the block • Lightweight and fast • Public and private methods • Active bindings • Mature inheritance
  8. ALSO, R6 JUST MAKES ME FEEL RIGHT AT HOME…
  9. R DEVELOPMENT IN 3D • Semantic dev skills • Syntactic dev skills • Domain knowledge
  10. Semantic: What is a Multi-Armed Bandit? • Origin: a gambler in a casino wants to maximize winnings by playing slot machines. • Balance exploration vs. exploitation (also: “learning” vs. “earning”). • Objective: given a set of K distinct arms, each with an unknown reward distribution, maximize the sum of rewards. • Example: 3 slot machines (arms), each explored with 2 pulls. What now?
  11. Translation to a health-related problem • 1. A patient arrives at the physician with symptoms and a medical history. • 2. The physician prescribes treatment A or treatment B. • 3. The patient’s health responds (e.g., improves or worsens). • 4. Depending on the result, the physician updates their opinion on the best treatment option for this kind of patient. • Goal: prescribe treatments that yield good health outcomes.
  12. What’s the challenge for the physician here? • Fundamental dilemma: • Exploit what has been learned • Explore to find which behaviors lead to high rewards • Need to use context and arm history effectively • Different actions are preferred under different contexts • Might not see the same context twice
  13. Solution: use a smart rule, a policy • Policy: a rule mapping context to action • Allows choosing different good actions in different contexts • E.g.: • If (sex = male) choose action 1 • Else if (age > 45) choose action 2 • Else choose action 3 • Policy $\pi$: context $x \mapsto$ action $a$; adding history lets the policy adapt
  14. Let’s formalize, to make policies easier to compare and apply • Use an adaptive policy $\Pi$ with distribution parameters $\theta$ to make a choice. • For $t = 1, 2, \ldots, T$: • 1. Observe context $x_t$ • 2. Choose action $a_t \in \{1, 2, \ldots, K\}$ using the current $\theta$ of $\Pi$ • 3. Collect reward $r_t(a_t)$ • 4. Using reward $r_t(a_t)$, adapt $\theta$ as $\Pi$ prescribes • Goal: find a $\Pi$ that chooses actions with a high cumulative reward $\sum_{t=1}^{T} r_t(a_t)$
  15. get_something do_procedure get_value
  16. For example …
  17. FIRST SKETCH, THEN CODE
  18. CONTEXTUAL: UML DIAGRAMS
  19. CONTEXTUAL: UML DIAGRAMS
  20. CLEAN CODE • Keep It Simple, Stupid • You Aren’t Gonna Need It • Don’t Repeat Yourself!
  21. ADDING LI BANDIT: EASY! PSEUDO CODE: YAY!
  22. ADDING LI BANDIT: EASY! • Choices were offered fully at random in a real-life setting. • People made a choice with known context. • The bandit uses this logged data. • It checks whether the policy makes the same choice as the person originally made. • If so, this row, including its context, can be used to test the policy.
  23. RSTUDIO: REAL USEFUL
  24. RSTUDIO: REAL USEFUL
  25. RSTUDIO: REAL USEFUL
  26. RSTUDIO: REAL USEFUL
  27. RSTUDIO: REAL USEFUL
  28. RSTUDIO: REAL USEFUL
  29. RSTUDIO: REAL USEFUL
  30. PROFVIS PROFILING
  31. PRE-ALLOCATE DATA STRUCTURES
  32. VERSION CONTROL / GITHUB: your safety net (and makes collaboration easy)
  33. ALSO HELPS AUTOMATE DEVELOPMENT-RELATED PROCESSES
  34. ZENODO: research data repository • On releases: automatic DOI generation
  35. CONTINUOUS INTEGRATION: DOES YOUR CODEBASE STILL WORK? (on each commit)
  36. CODE COVERAGE, CODECOV.IO INTEGRATION: DO TESTS COVER ALL OF YOUR CODE?
  37. PARALLEL PROCESSING ON AWS
  38. THE ART OF PARALLEL PROCESSING: balancing overhead and network with more processing power • k3 * d3 * 5 policies * 300 * 10000: 58 cores: 132 seconds; 120 cores: 390 seconds (58 cores faster) • k3 * d3 * 5 policies * 3000 * 10000: 58 cores: 930 seconds; 120 cores: 691 seconds (120 cores faster)
  39. WHAT IS NEXT? • More documentation, cleaner result printouts • More paper writing (!) • Implement famous papers and check whether they show the same results • Refactor again, focusing less on optimization and more on readability, particularly SyntheticBandit (although…)
  40. LITERATURE • R beginners: R in Action - Robert Kabacoff • More advanced: R Packages - Hadley Wickham; Advanced R - Hadley Wickham • Clean code: Code Complete - Steve McConnell; Clean Code - Robert C. Martin • Also interesting: The R Inferno - Patrick Burns
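
Slide 5 says R is good with vectors and matrices and quotes a speed-up of 10 to 100 times. The sketch below compares an explicit loop with a vectorized expression for the same sum of squares. The function names `sum_sq_loop` and `sum_sq_vec` are made up for this illustration, and the measured speed-up will depend on the machine and R version, so no timing is promised here.

```r
# Sum of squares of 1..n, written two ways (illustrative names).
n <- 1e5
x <- as.numeric(seq_len(n))

# Loop version: one interpreted iteration per element.
sum_sq_loop <- function(x) {
  total <- 0
  for (value in x) {
    total <- total + value^2
  }
  total
}

# Vectorized version: squaring and summing run in compiled code.
sum_sq_vec <- function(x) sum(x^2)

loop_time <- system.time(result_loop <- sum_sq_loop(x))[["elapsed"]]
vec_time  <- system.time(result_vec  <- sum_sq_vec(x))[["elapsed"]]

# Both versions compute the same number (up to floating-point tolerance).
same_result <- isTRUE(all.equal(result_loop, result_vec))
cat(sprintf("loop: %.3fs, vectorized: %.3fs, same result: %s\n",
            loop_time, vec_time, same_result))
```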
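
Slide 7 lists the R6 features that motivated the choice: public and private methods, active bindings, and inheritance. A minimal sketch of those features follows, assuming the R6 package is installed. The classes `Arm` and `NamedArm` are hypothetical and are not taken from the "contextual" package.

```r
library(R6)

# Hypothetical slot-machine arm with private state.
Arm <- R6Class("Arm",
  public = list(
    initialize = function(prob) {
      private$prob <- prob
    },
    # Pull the arm once and record the Bernoulli reward.
    pull = function() {
      reward <- rbinom(1, 1, private$prob)
      private$pulls <- private$pulls + 1
      private$total <- private$total + reward
      invisible(reward)
    }
  ),
  private = list(
    prob = NULL,   # true reward probability, hidden from callers
    pulls = 0,
    total = 0
  ),
  active = list(
    # Active binding: looks like a field, computed on access, read-only.
    mean_reward = function(value) {
      if (!missing(value)) stop("mean_reward is read-only")
      if (private$pulls == 0) 0 else private$total / private$pulls
    }
  )
)

# Inheritance: a subclass that adds a public name field.
NamedArm <- R6Class("NamedArm",
  inherit = Arm,
  public = list(
    name = NULL,
    initialize = function(name, prob) {
      super$initialize(prob)
      self$name <- name
    }
  )
)

arm <- NamedArm$new("A", prob = 1)  # prob = 1 so every pull pays out
arm$pull()
arm$pull()
print(arm$mean_reward)
```

Private members such as `prob` are not reachable as `arm$prob` from outside the object, which is the encapsulation the slide alludes to.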
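
The example policy on slide 13 (sex, then age, then a default) can be written directly as a function from context to action. This is only a sketch of that fixed rule. The function name `choose_action` and the list-based context are assumptions made for illustration.

```r
# A fixed (non-adaptive) policy: maps a context to one of three actions.
# The context is assumed to be a list with fields `sex` and `age`.
choose_action <- function(context) {
  if (context$sex == "male") {
    1L
  } else if (context$age > 45) {
    2L
  } else {
    3L
  }
}

action_male    <- choose_action(list(sex = "male",   age = 30))
action_older   <- choose_action(list(sex = "female", age = 50))
action_default <- choose_action(list(sex = "female", age = 30))
print(c(action_male, action_older, action_default))
```

A rule like this never changes. The adaptive policies discussed on slide 14 additionally use the history of rewards.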
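
The four-step loop on slide 14 (observe context, choose action, collect reward, adapt θ) can be sketched with a simple epsilon-greedy policy. Epsilon-greedy is swapped in here as an illustrative policy and is not named on the slide. The reward probabilities in `true_prob`, the two-valued context, and all variable names are assumptions for this example.

```r
set.seed(42)

K <- 3            # number of arms
horizon <- 2000   # T in the slide's notation
epsilon <- 0.1    # exploration rate (assumed value)

# Hypothetical true reward probabilities: rows are contexts, columns are arms.
true_prob <- matrix(c(0.8, 0.2, 0.2,
                      0.2, 0.2, 0.8), nrow = 2, byrow = TRUE)

# theta: per-context, per-arm pull counts and running mean rewards.
theta <- list(n = matrix(0, 2, K), mean = matrix(0, 2, K))
rewards <- numeric(horizon)  # pre-allocated

for (t in seq_len(horizon)) {
  # 1. Observe context x_t.
  x_t <- sample.int(2, 1)
  # 2. Choose action a_t using the current theta.
  if (runif(1) < epsilon) {
    a_t <- sample.int(K, 1)               # explore
  } else {
    a_t <- which.max(theta$mean[x_t, ])   # exploit
  }
  # 3. Collect reward r_t(a_t).
  r_t <- rbinom(1, 1, true_prob[x_t, a_t])
  # 4. Adapt theta with an incremental mean update.
  theta$n[x_t, a_t] <- theta$n[x_t, a_t] + 1
  theta$mean[x_t, a_t] <- theta$mean[x_t, a_t] +
    (r_t - theta$mean[x_t, a_t]) / theta$n[x_t, a_t]
  rewards[t] <- r_t
}

cumulative_reward <- sum(rewards)  # the quantity the policy tries to maximize
cat("cumulative reward:", cumulative_reward, "\n")
```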
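
Slide 22 describes evaluating a policy on logged data: rows in which the policy picks the same arm as the logged choice are used, and all other rows are skipped. Below is a minimal sketch of that replay idea on synthetic data. It does not reproduce the implementation in the "contextual" package. The data frame `logged`, the function `replay_evaluate`, and the epsilon-greedy policy inside it are all hypothetical.

```r
set.seed(1)

n_rows <- 5000
K <- 3

# Synthetic log: arms were offered uniformly at random (assumption).
true_prob <- matrix(c(0.8, 0.2, 0.2,
                      0.2, 0.2, 0.8), nrow = 2, byrow = TRUE)
logged <- data.frame(
  context = sample.int(2, n_rows, replace = TRUE),
  arm     = sample.int(K, n_rows, replace = TRUE)
)
logged$reward <- rbinom(n_rows, 1,
                        true_prob[cbind(logged$context, logged$arm)])

# Replay a simple epsilon-greedy policy over the log.
replay_evaluate <- function(logged, K, epsilon = 0.1) {
  n_contexts <- max(logged$context)
  counts <- matrix(0, n_contexts, K)
  means  <- matrix(0, n_contexts, K)
  used_rewards <- numeric(nrow(logged))  # pre-allocated, trimmed at the end
  n_used <- 0
  for (i in seq_len(nrow(logged))) {
    x <- logged$context[i]
    a <- if (runif(1) < epsilon) sample.int(K, 1) else which.max(means[x, ])
    # Skip the row when the policy's choice differs from the logged one.
    if (a != logged$arm[i]) next
    r <- logged$reward[i]
    counts[x, a] <- counts[x, a] + 1
    means[x, a] <- means[x, a] + (r - means[x, a]) / counts[x, a]
    n_used <- n_used + 1
    used_rewards[n_used] <- r
  }
  list(
    n_used = n_used,
    mean_reward = if (n_used > 0) mean(used_rewards[seq_len(n_used)]) else NA_real_
  )
}

result <- replay_evaluate(logged, K)
cat("rows used:", result$n_used, "of", n_rows, "\n")
```

With uniformly random logging over K arms, roughly one row in K is expected to match, so offline evaluation needs considerably more logged rows than the number of policy steps it effectively simulates.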
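
Slide 31 recommends pre-allocating data structures. The sketch below contrasts growing a vector inside a loop with filling a vector that was allocated up front. The function names `grow` and `prealloc` are illustrative. Both produce the same values, but the growing version copies the vector on every iteration.

```r
n <- 1e4

# Anti-pattern: the vector is re-created and copied on each iteration.
grow <- function(n) {
  out <- c()
  for (i in seq_len(n)) {
    out <- c(out, i^2)
  }
  out
}

# Pre-allocated: memory is reserved once, then filled in place.
prealloc <- function(n) {
  out <- numeric(n)
  for (i in seq_len(n)) {
    out[i] <- i^2
  }
  out
}

grown  <- grow(n)
filled <- prealloc(n)
cat("identical results:", identical(grown, filled), "\n")
```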
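
Slide 38 reports that more cores only paid off for the larger job, because cluster overhead and network traffic have to be balanced against the extra processing power. As a small local sketch of the mechanics, the example below runs independent simulations sequentially and on a two-worker cluster using R's bundled `parallel` package. The function `simulate_once` and the worker count are assumptions, and nothing here reflects the AWS setup from the talk.

```r
library(parallel)

# Hypothetical unit of work: one seeded simulation returning a reward total.
simulate_once <- function(seed, horizon = 1000) {
  set.seed(seed)
  sum(rbinom(horizon, 1, 0.5))
}

seeds <- 1:8

# Sequential baseline.
sequential <- vapply(seeds, simulate_once, integer(1))

# Same work spread over two local worker processes.
cl <- makeCluster(2)
parallel_result <- tryCatch(
  unlist(parLapply(cl, seeds, simulate_once)),
  finally = stopCluster(cl)  # always release the workers
)

cat("same results:", identical(sequential, parallel_result), "\n")
```

For tasks as small as this one, starting the cluster and shipping data to the workers can cost more than the computation itself, which matches the pattern in the slide's timings.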
