
Building an A/B Testing Analytics System with R and Shiny

Given at RStudio::conf(2019).


  1. Building an A/B Testing Analytics System with R and Shiny | Emily Robinson @robinson_es
  2. About Me ➔ Data Scientist at DataCamp ➔ R user ~7 years ➔ Enjoy talking about: ◆ Building and finding data science community ◆ Diversity in STEM ◆ R
  3. Learn | datacamp.com/courses
  4. What is A/B Testing?
  5. Life B.D. (Before DataCamp) ➔ Worked on 60+ experiments with the search team ➔ 8+ year history of experimentation ➔ 500+ experiments per year
  6. Life B.D. (Before DataCamp) ➔ 5 data engineers working on the experimentation platform ➔ Over a thousand metrics computed for each experiment ➔ Fancy UI (from “How Etsy Handles Peeking in A/B Testing” by Callie McRee and Kelly Shen)
  7. First weeks at DataCamp ➔ No system for planning, analyzing, or presenting experiment results ➔ And no data engineers to build it
  8. 4 Lessons
  9. 1. Build tools to save yourself time
  10. Who here has had a “first this then that” question? ➔ Who tried X and then did Y? ➔ What percent of people who did X then did Y? ➔ What was the last thing people did before doing Y? ➔ What are all the things people did after doing X?
  11. Questions I might answer about an A/B test ➔ What percent of people in the treatment vs. control registered? ➔ Which ad clicks had a course start within 2 days?
  12. Lengthy, repetitive code ➔ Lots of copying and pasting ➔ Hard to switch between types of funnels (see the sketch below)
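The kind of hand-rolled funnel query the slide alludes to might look like this in dplyr. The tables and column names are invented for illustration, not DataCamp's actual schema:

    library(dplyr)

    # Toy event data, made up for illustration
    visits <- tibble::tribble(
      ~user_id, ~page,      ~timestamp,
      1,        "homepage", as.Date("2019-01-01"),
      1,        "homepage", as.Date("2019-01-05"),
      2,        "pricing",  as.Date("2019-01-02")
    )
    course_starts <- tibble::tribble(
      ~user_id, ~course,   ~start_time,
      1,        "intro-r", as.Date("2019-01-03")
    )

    # "Which courses did people start after their first homepage visit?"
    # By hand this is a join plus grouped filters, copied and tweaked
    # for every new funnel question.
    visits %>%
      filter(page == "homepage") %>%
      group_by(user_id) %>%
      summarize(visit_time = min(timestamp)) %>%
      inner_join(course_starts, by = "user_id") %>%
      filter(start_time >= visit_time)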
  13. And when you’re doing repetitive tasks ... write a package
  14. Unfortunately ... me and writing packages
  15. Fortunately ... I had David Robinson. (Sorry, this David Robinson.)
  16. The funneljoin package: github.com/datacamp/funneljoin
  17. Structure 1. Table 1 2. Table 2 3. User column name(s) 4. Time column name(s) 5. Type of after join 6. Type of join
  18. Example: first-any ➔ What are all the courses people started after visiting the homepage for the first time?
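A sketch of that question in funneljoin, following the structure on the previous slide. This is based on the package README's documented interface (after_inner_join() with by_user, by_time, and type arguments); the toy tables are made up:

    library(dplyr)
    library(funneljoin)  # github.com/datacamp/funneljoin

    # Made-up event tables sharing user and time column names
    homepage_visits <- tibble::tribble(
      ~user_id, ~timestamp,
      1,        as.Date("2019-01-01"),
      2,        as.Date("2019-01-02")
    )
    course_starts <- tibble::tribble(
      ~user_id, ~course,   ~timestamp,
      1,        "intro-r", as.Date("2019-01-03"),
      1,        "ggplot2", as.Date("2019-01-07")
    )

    # type = "first-any": keep each user's first homepage visit, then
    # every course they started after it
    homepage_visits %>%
      after_inner_join(course_starts,
                       by_user = "user_id",
                       by_time = "timestamp",
                       type = "first-any")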
  19. Example: first-firstafter ➔ What percent of people saw the pricing page and then subscribed?
  20. Example: max-gap argument ➔ What percent of people saw the pricing page and then subscribed within four days?
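Putting those two slides together in code: first-firstafter keeps each user's first pricing-page view and their first subscription after it, and max_gap bounds the allowed delay. Here pricing_views and subscriptions are hypothetical tables shaped like the toy data above, and I'm assuming max_gap accepts a difftime:

    # What percent of people saw the pricing page and then subscribed
    # within four days? (hypothetical tables)
    joined <- pricing_views %>%
      after_inner_join(subscriptions,
                       by_user = "user_id",
                       by_time = "timestamp",
                       type = "first-firstafter",
                       max_gap = as.difftime(4, units = "days"))

    # Conversion rate: subscribers within the window / all pricing viewers
    nrow(joined) / dplyr::n_distinct(pricing_views$user_id)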
  21. funneljoin: github.com/datacamp/funneljoin ➔ Many funnel types: lastbefore-firstafter, any-any, first-any, … ➔ Supports all types of dplyr joins: inner, left, right, full, semi, and anti ➔ Works on remote tables ➔ Bug fixes, pull requests, and feature requests welcome ➔ Try it yourself!
  22. 2. Everything that can go wrong, will go wrong
  23. Things that have happened ... ➔ People are put in both control and treatment ➔ People in the experiment have no page views ➔ People have multiple experiment starts in the same group ➔ There aren’t the same number of people in control and treatment ➔ Experiment starts didn’t have cookies (so we couldn’t track users)
  24. You need to check your assumptions
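A couple of the checks from the previous slide become one-liners once assignments live in a table. The names here (assignments, user_id, group) are assumptions for illustration:

    library(dplyr)

    # Check 1: was anyone assigned to both control and treatment?
    assignments %>%
      distinct(user_id, group) %>%
      count(user_id) %>%
      filter(n > 1)

    # Check 2: is the control/treatment split roughly even? A large
    # imbalance suggests a bug in bucketing or logging.
    assignments %>%
      distinct(user_id, group) %>%
      count(group)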
  25. Initial solution
  26. As a famous data scientist once said ... When you’ve run the same process three times, make a dashboard
  27. 3. Build tools that empower others
  28. Health Metrics Dashboard (* these are fake numbers)
  29. By-metric view (* these are fake numbers)
  30. By-metric view (* these are fake numbers)
  31. Individual experiments view (* these are fake numbers)
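The slides show screenshots, but a minimal Shiny skeleton for a by-metric view might look like the sketch below. The results table is randomly generated stand-in data, echoing the slides' fake numbers:

    library(shiny)
    library(dplyr)
    library(ggplot2)

    # Stand-in results table: one row per experiment x metric x group
    results <- expand.grid(
      experiment = c("exp_1", "exp_2", "exp_3"),
      metric     = c("registrations", "course_starts"),
      group      = c("control", "treatment")
    )
    results$value <- runif(nrow(results))

    ui <- fluidPage(
      titlePanel("Experiment health (fake numbers)"),
      selectInput("metric", "Metric", unique(results$metric)),
      plotOutput("by_metric")
    )

    server <- function(input, output) {
      output$by_metric <- renderPlot({
        results %>%
          filter(metric == input$metric) %>%
          ggplot(aes(experiment, value, fill = group)) +
          geom_col(position = "dodge")
      })
    }

    shinyApp(ui, server)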
  32. Leveling up ... ➔ Common request: what % increase can we detect in a 2-week test? ➔ Can I make a tool so people can answer this themselves, without code? ➔ Delivering information -> discovering information
  33. Impact calculator
  34. Impact calculator
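Behind a calculator like that sits a power calculation. A minimal version with base R's power.prop.test, where the traffic and baseline conversion numbers are invented:

    # "What % increase can we detect in a 2-week test?"
    visitors_per_week <- 10000
    baseline_rate <- 0.10                        # control conversion rate

    n_per_group <- (visitors_per_week * 2) / 2   # two weeks, split 50/50

    # Leave p2 unspecified so power.prop.test solves for the smallest
    # detectable treatment rate at 80% power
    pwr <- power.prop.test(
      n = n_per_group,
      p1 = baseline_rate,
      power = 0.80,
      sig.level = 0.05
    )

    # Express the detectable effect as a relative lift over baseline
    (pwr$p2 - pwr$p1) / pwr$p1 * 100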
  35. 4. Make it easy to do the right thing
  36. Best practice 1: Have one key metric per experiment ➔ Clarifies decision-making ➔ Can have additional “guardrail” metrics that you don’t want to negatively impact
  37. Airtable Field
  38. Best practice 2: Run your experiment for the length you planned ➔ Otherwise, you may quadruple your false positive rate!
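That multiplier comes from "peeking." A quick simulation of A/A tests (no true difference) checked daily shows how early stopping inflates the false positive rate; the sample sizes here are arbitrary:

    set.seed(42)
    n_sims <- 1000; days <- 14; n_per_day <- 100

    stopped_early <- replicate(n_sims, {
      a <- rnorm(days * n_per_day)   # control: no real effect
      b <- rnorm(days * n_per_day)   # treatment: identical distribution
      p_by_day <- sapply(seq_len(days), function(d) {
        idx <- seq_len(d * n_per_day)
        t.test(a[idx], b[idx])$p.value
      })
      any(p_by_day < 0.05)  # declare a "win" at the first significant peek
    })

    mean(stopped_early)  # far above the nominal 5% false positive rate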
  39. Show start and end dates in the dashboard
  40. Conclusion
  41. Recap 1. Build tools to save yourself time 2. Everything that can go wrong will go wrong 3. Build tools that empower others 4. Make it easy to do the right thing
  42. Many thanks to ... ➔ The growth and data science teams at DataCamp ➔ Anthony Baker & David Robinson, co-authors of funneljoin ➔ The Analytics & Data Engineering team at Etsy
  43. Thank you! hookedondata.org @robinson_es github.com/datacamp/funneljoin
