The anatomy of an A/B Test - JSConf Colombia Workshop

Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Loading in …3
×

Check these out next

1 of 72 Ad

The anatomy of an A/B Test - JSConf Colombia Workshop

Download to read offline

“In God we trust, all others must bring data”. Intuition, experience and well-known patterns may give us good indications of successful ideas and features, but nothing gets closer to the truth than data analysis and A/B testing. In this workshop, we’ll show how we do experimentation at Booking: what we test, how to get data through templates and JavaScript, and how we analyse the resulting metrics. We’ll live-code examples, see all the potential caveats of dealing with user tracking on the client side, and show existing tools you can use to test your own ideas.


1. A/B testing workshop “In God we trust, all others must bring data” JSConf Colombia Workshop 2015
2. @shiota github.com/eshiota slideshare.net/eshiota eshiota.com
3. A/B
4. A/B tests measure how a new idea (version B/variant/test) performs against an existing implementation (version A/base/control).
5. Buy now versus Buy now
6. coin flip: Buy now 50%, Buy now 50%
7. When the user sees or is affected by the idea, they are tracked and become part of the test.
8. Buy now Buy now track(my_experiment)
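
A rough Ruby sketch of the coin flip and tracking above: in practice the variant is usually derived deterministically from a stable user identifier, so the same visitor always lands in the same variant. The helper names below are illustrative, not the workshop's actual code.

    require "digest"

    # Deterministically bucket a user into "a" or "b" for a given experiment:
    # hashing experiment name + user id keeps the assignment stable per visitor.
    def assign_variant(experiment, user_id)
      Digest::MD5.hexdigest("#{experiment}:#{user_id}").to_i(16).even? ? "a" : "b"
    end

    # Record that the user became part of the test, and return their variant.
    def track(experiment, user_id)
      variant = assign_variant(experiment, user_id)
      puts "user=#{user_id} experiment=#{experiment} variant=#{variant}"
      variant
    end

    track(:my_experiment, 42) # => always the same variant for user 42
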
9. Data about the website is generated as users browse through pages and do their tasks.
10. product added to cart, number of products added, purchase finished, average price per purchase, number of products seen, user has logged in, used guest checkout, customer service calls, …
11. When there’s enough information to make a decision, you can either stop the test (keeping version A) or choose version B, directing all traffic to it.
12. Buy now versus Buy now. Duration: 14 days. Visitors: 45.140 (22.570 per variant). Number of purchases: 339 (1.5%) vs. 407 (1.8%), 20% up. Average price: 144.500 COP vs. 147.390 COP, 2% up.
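
Whether a difference like the 1.5% vs. 1.8% conversion on slide 12 is more than noise can be checked with a standard two-proportion z-test. The sketch below is a generic illustration using the slide's numbers, not necessarily the exact analysis used at Booking.

    # Two-proportion z-test over the purchase numbers from slide 12.
    conversions_a, visitors_a = 339, 22_570
    conversions_b, visitors_b = 407, 22_570

    p_a = conversions_a.fdiv(visitors_a)
    p_b = conversions_b.fdiv(visitors_b)
    p_pooled = (conversions_a + conversions_b).fdiv(visitors_a + visitors_b)

    se = Math.sqrt(p_pooled * (1 - p_pooled) * (1.0 / visitors_a + 1.0 / visitors_b))
    z  = (p_b - p_a) / se

    puts format("z = %.2f", z)                  # ≈ 2.5 for these numbers
    puts "significant at the 95% level" if z.abs > 1.96
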
13. coin flip: Buy now 50%, Buy now 50%
14. B: Buy now 100%
15. “But my design is obviously more beautiful and intuitive than what we have now! Why should I run an A/B test?” — the majority of designers
16. Quiz time! (prizes included)
17. A: Raise your left hand B: Raise your right hand Neutral: Don’t raise your hands Which performed better?
18. Reduced bounce rate by 1.7%
19. A: Raise your left hand B: Raise your right hand Neutral: Don’t raise your hands Which performed better?
20. Which performed better? Increased CTR by 203%
21. A: Raise your left hand B: Raise your right hand Neutral: Don’t raise your hands Which performed better?
22. Which performed better? 43.4% more purchases
23. A: Raise your left hand B: Raise your right hand Neutral: Don’t raise your hands Which performed better?
24. Which performed better? Both were statistically equivalent
25. Intuition vs. Historical Analysis vs. Experimentation
26. We have a 2/3 chance of being wrong when trusting our intuition.
27. People behave differently each season/month/day of the week.
28. Different cultures lead to different patterns of usage.
29. Data analysis alone provides correlation but not causation.
30. Running your A/B test (in 5 simple steps)
31. Step 1: Hypothesis
32. Analyse all possible inputs to come up with a hypothesis to work on.
33. • Usability research • Benchmarking • Surveys • Data mining • Previous experiments
34. Hypothesis: “If users from South American countries relate more to the website, they will book more.”
35. Step 2: Idea
36. Idea: “If we add the country’s flag next to the website’s logo, users will relate more to the brand.”
37. Step 3: Setup
38. • Who will participate? • What is the primary metric? • Any secondary impacts? • How will it be implemented?
39. • Users from Argentina, Bolivia, Brazil, Chile, Colombia, Ecuador, Guyana, Paraguay, Peru, Suriname, Uruguay and Venezuela, on all platforms • Conversion (net bookings) uplift is expected • We expect more returning customers
40. <h1 class="main-header__logo logo">
      <% if user.is_from_south_america && track_experiment(:header_flag_for_south_america) == "b" %>
        <span class="main-header__logo__country-flag">
          <%= user.country %>
        </span>
      <% end %>
      <%= image_tag "logo.png" %>
    </h1>
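
For context, a helper like track_experiment in the template above would roughly combine the assignment and tracking steps sketched after slide 8: pick a stable variant for the current visitor, record their participation once, and return the variant name so the template can branch on it. This is a hypothetical sketch, not Booking's actual helper.

    # Hypothetical Rails view helper behind the template on slide 40.
    def track_experiment(experiment)
      variant = assign_variant(experiment, current_user.id) # from the earlier sketch

      tracked = (session[:tracked_experiments] ||= [])
      unless tracked.include?(experiment)
        tracked << experiment
        Rails.logger.info("experiment=#{experiment} variant=#{variant} user=#{current_user.id}")
      end

      variant
    end
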
41. Step 4: Monitoring
42. Keep checking the metrics to see if anything’s terribly wrong.
43. Avoid checking too often; let your test get enough users and enough runtime.
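
How many users is "enough" can be estimated up front with a rule-of-thumb power calculation (roughly 80% power at 95% confidence for a proportion metric). The sketch below is a common approximation, shown for illustration only.

    # Rough sample size per variant: about 16 * p * (1 - p) / delta^2,
    # where p is the baseline rate and delta the absolute change to detect.
    def visitors_needed_per_variant(baseline_rate, minimum_detectable_change)
      (16 * baseline_rate * (1 - baseline_rate) / minimum_detectable_change**2).ceil
    end

    # e.g. detecting an absolute +0.3% on a 1.5% conversion rate:
    visitors_needed_per_variant(0.015, 0.003) # => 26267, roughly 26k visitors per variant
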
44. Step 5: Data, decisions, and next steps
45. When you reach the expected runtime, number of visitors, or effect, look at the data and make a decision.
46. product added to cart, number of products added, purchase finished, average price per purchase, number of products seen, user has logged in, used guest checkout, customer service calls, …
47. Optimizely dashboard
48. • How were the primary and secondary metrics impacted? • What were the results isolated by each country? • What were the results isolated by each language? • Did any particular platform (desktop, mobile devices, tablets) perform better? • Was the impact on returning customers any higher than on first-time visitors?
49. Based on the gathered data, plan the next steps.
50. • Should we add copy to the flag? • Should we add a tooltip to the flag? • Should we increase/decrease the flag size? • Should we restrict it just to desktop users? • Should we try this for a single country, or other countries?
51. What can you test?
52. (almost) Everything.
53. You can test a small design change.
54. versus
55. You can test large design changes.
56. versus
57. You can test different copy.
58. Submit versus Book now
59. You can test technical improvements and measure page load time, repaints/reflows, and conversion impact.
60. jQuery 1.11.3 versus jQuery 2.1.3
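
One way to set up a test like this is to branch the script include on the experiment variant in the layout, then compare front-end timing metrics (page load, repaints/reflows) per variant. A sketch in the same ERB style as slide 40; the experiment name and asset names are illustrative.

    <% if track_experiment(:jquery_2_upgrade) == "b" %>
      <%= javascript_include_tag "jquery-2.1.3.min" %>
    <% else %>
      <%= javascript_include_tag "jquery-1.11.3.min" %>
    <% end %>
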
61. You can even test back-end optimisations and measure page load time, rendering time, CPU and memory usage, etc.
62. # Variant "b" runs the optimised query; the base keeps the current behaviour.
    if track_experiment(:my_optimized_query) == "b"
      @users = my_optimized_query
    else
      @users = do_the_normal_thing
    end
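
To compare the two variants on the server-side metrics mentioned on slide 61, the timing can be recorded alongside the variant. A sketch building on the snippet above, assuming the same hypothetical helpers.

    require "benchmark"

    variant = track_experiment(:my_optimized_query)

    # Benchmark.realtime returns the elapsed wall-clock seconds for the block.
    elapsed = Benchmark.realtime do
      @users = variant == "b" ? my_optimized_query : do_the_normal_thing
    end

    Rails.logger.info(
      "experiment=my_optimized_query variant=#{variant} query_ms=#{(elapsed * 1000).round(1)}"
    )
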
63. Live coding (I hope that works.)
64. Find the code at: https://github.com/eshiota/ab_workshop
    Additional links:
    https://www.optimizely.com/
    https://github.com/splitrb/split/
    http://whichtestwon.com
    http://unbounce.com/
    http://blog.booking.com/hamburger-menu.html
    http://blog.booking.com/concept-dne-execution.html
    Gracias!
