Keynote from Annemarie Klaassen of Online Dialogue at SuperWeek Jamaica, May 10th 2016


I have eight years of web analytics experience and I'm always looking to find real insights in user data.

Every A/B-test I have ever done has been analyzed using Excel.

I have built my own test-evaluation tools based on my statistical knowledge from university.

You will get to see some screenshots of these Excel tools I've built in this presentation. One of these tools has now been released as a web tool as well. I'll come back to this later.

So this conference with fellow nerds on a tropical island is just perfect!

First we look at the data and determine the pages with the highest test power: which pages have enough visitors and conversions to be able to test on?

Then we look at the paths visitors take on the website to make a booking or place an order. So, what are the main online customer journeys? And where are the biggest leaks in this process?

These data findings are then sent to the psychologist. He or she combines the data with scientific research to come up with hypotheses to test.

These hypotheses are then briefed to the designer, who will come up with test variations.

These variations are then tested in several A/B-tests (since you cannot prove a hypothesis based on one experiment).

The learnings of these A/B-tests are then combined into overall learnings, which can then be shared with the rest of the organization.

And in the long run to really learn from user behavior.

What is it that triggers the visitors on that particular website?

And how can we use those insights to come up with even better optimization efforts on other parts of the site?

If not everyone in your organisation believes in A/B-testing you will have a hard time proving its worth.

So you improve your conversion research (see Peep's talk).

- Test bolder changes. Bolder changes normally mean you are more likely to change visitor behaviour.

- And/or run the test on more traffic, to be able to detect smaller uplifts.

When you use a t-test (which we all have been using) you state a null hypothesis. You may recall this from your statistics classes. You calculate the p-value and decide to reject the null hypothesis or not. So you try to reject the hypothesis that the conversion rates are the same.
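The null-hypothesis logic above can be sketched in a few lines of Python — a minimal two-proportion z-test using only the standard library. The visitor and conversion counts are made up for illustration:

```python
import math

def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the null hypothesis that both
    variations have the same conversion rate (H0: p_A == p_B)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)          # pooled rate under H0
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # two-sided p-value from the standard normal CDF
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# hypothetical example: 1,000 vs 1,096 conversions on 50,000 visitors each
p = two_proportion_p_value(1000, 50000, 1096, 50000)
print(round(p, 3))  # reject H0 at the usual 5% level only if p < 0.05
```

Note that the p-value answers only the indirect question from the quiz: how unlikely this data is *given* that the null hypothesis is true.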

You measured an uplift of 9.58%, and the graph indicates you have a winner.

And the focus is on finding those real winners. You want to take as little risk as possible.

This stems from the fact that t-tests have been used in a lot of medical research as well. Of course you don't want to bring a medicine to market if you're not 100% sure it won't make people worse or kill them. But businesses aren't run this way. You need to take some risk to grow your business.

However, there seems to be a positive movement (the measured uplift is 5%), but it isn't big enough to register as a significant winner. You probably only need a few more conversions.

The most common approach to analysing A/B-tests is the t-test (which is based on frequentist statistics).

But, over the last couple of years more and more software packages (like VWO) are switching to Bayesian statistics.

Second, it shows you the probability that B is better than A. Probability is very easy to explain. Everyone understands this.

You see the number of users and conversions per variation, the conversion rates, the measured uplift and the chance that B is better than A.

Easy right?

In the graph you see the chances of the expected uplift after implementation. These lie in a range. The more traffic and conversions the more certain you are of the actual uplift.
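That "probability that B is better than A" can be reproduced with a simple Monte-Carlo sketch: draw conversion rates from each variation's posterior and count how often B beats A. This assumes uniform Beta(1,1) priors, and the counts are hypothetical:

```python
import random

def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=100_000, seed=42):
    """Monte-Carlo estimate of P(rate_B > rate_A), using Beta(1,1)
    priors, i.e. Beta(conversions + 1, non-conversions + 1) posteriors."""
    random.seed(seed)
    wins = 0
    for _ in range(draws):
        ra = random.betavariate(conv_a + 1, n_a - conv_a + 1)
        rb = random.betavariate(conv_b + 1, n_b - conv_b + 1)
        if rb > ra:
            wins += 1
    return wins / draws

# hypothetical example: 2.0% vs 2.1% conversion on 50,000 visitors each
print(prob_b_beats_a(1000, 50000, 1050, 50000))  # somewhere around 0.87
```

More traffic and conversions narrow both posteriors, which is exactly why the range of expected uplift tightens as the test runs longer.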

And he / she will understand this result.

A test result won't tell you winner / no winner, but gives a probability between 0% and 100% that the variation performs better than the original.

Does the chance of an uplift outweigh the chance of losing money if you implement the variation?

But in order to earn back the costs of implementation you want to know the chance it will earn you at least x revenue.

Say for example that a test implementation costs $15,000 and your margin is 20%, so you need at least $75,000 in extra revenue.

So what is the chance that implementing the variation will earn you that amount within 6 months? In this example there is still an 82% chance. So you will probably implement it.

We calculated the expected revenue over a 6-month period (a ballpark for how long an A/B-test result affects the business; the environment changes, so some effects last longer, some shorter).
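The "chance of at least x revenue" question is one extra step in the same posterior simulation: convert each sampled rate difference into 6-month revenue and count how often it clears the threshold. All numbers below (conversion counts, traffic, order value, threshold) are hypothetical:

```python
import random

def prob_extra_revenue_at_least(conv_a, n_a, conv_b, n_b,
                                visitors_6m, order_value, threshold,
                                draws=100_000, seed=7):
    """P(extra 6-month revenue from implementing B >= threshold),
    estimated from Beta(1,1)-prior posteriors for both variations."""
    random.seed(seed)
    hits = 0
    for _ in range(draws):
        ra = random.betavariate(conv_a + 1, n_a - conv_a + 1)
        rb = random.betavariate(conv_b + 1, n_b - conv_b + 1)
        extra = (rb - ra) * visitors_6m * order_value
        if extra >= threshold:
            hits += 1
    return hits / draws

# hypothetical numbers: a $15,000 build cost at a 20% margin
# means you need at least $75,000 in extra revenue
print(prob_extra_revenue_at_least(1000, 50000, 1050, 50000,
                                  visitors_6m=1_000_000,
                                  order_value=100, threshold=75_000))
```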

The first graph shows you the main test result, plus the chance of earning at least x revenue.

The average uplift of the green bars is 5.92%.

And you have an 89.1% chance of an uplift in conversion rate, with an average of 5.92%.

Multiply 10.9% by the drop in revenue, add 89.1% times the uplift in revenue, and you have the contribution of this test.
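That weighted sum is plain arithmetic; using the revenue figures from the deck (the deck's own $317,936 differs slightly because its underlying percentages are rounded):

```python
# Probability-weighted contribution of implementing B
p_risk, p_uplift = 0.109, 0.891
revenue_drop, revenue_uplift = -115_220, 370_700    # 6-month revenue effects

contribution = p_risk * revenue_drop + p_uplift * revenue_uplift
margin = 0.20 * contribution                        # 20% margin
roi = margin / 15_000                               # $15,000 implementation cost

print(round(contribution))  # -> 317735 (deck: 317,936, rounded inputs)
print(round(roi * 100))     # -> 424 (% ROI, matching the deck)
```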

The cost of implementation is $15,000.

In this example you have measured a change in conversion rate of 5%

If you take the $15,000 investment

You can set certain cut-off values as to when to implement. If it takes longer than 3 months to earn it back, then don't implement.
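The payback cut-off is a one-line check with the deck's figures:

```python
# Payback period: weeks of extra margin needed to recover the build cost
# (deck figures: $2,400 extra margin per week, $15,000 implementation cost)
extra_margin_per_week = 2_400
implementation_cost = 15_000

payback_weeks = implementation_cost / extra_margin_per_week
print(payback_weeks)         # -> 6.25 weeks, matching the deck
assert payback_weeks <= 13   # example cut-off: implement only if under ~3 months
```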

Anything between 85 and 95% is a very strong indication. So we would do follow-up tests on other parts of the website to see if it works there too. And the same as with a t-test: when the chance is higher than 95%, we see it as a real learning.

So even though you would implement the previous test, it doesn’t prove the stated hypothesis. It shows a strong indication, but to be sure the hypothesis is true you need follow-up tests to confirm this learning.
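The implement-versus-learn distinction can be captured with the deck's cut-off table (slide 62); the category strings here are mine:

```python
def classify_learning(prob_b_beats_a):
    """Map P(B > A) to the deck's learning categories."""
    if prob_b_beats_a < 0.70:
        return "no learning"
    if prob_b_beats_a < 0.85:
        return "indication - retest to confirm"
    if prob_b_beats_a < 0.95:
        return "strong indication - follow-up test to confirm"
    return "learning"

print(classify_learning(0.891))  # the 89.1% test: strong indication only
```

So a test can clear the bar for implementation (the business case is positive) while still falling short of the bar for a learning.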

When you use a Bayesian test evaluation, the number of tests that are implemented rises to 29! That is a whopping 58%.

But because the uplift is much higher as well, you end up with a higher contribution for the Bayesian approach in the end.

- 1. SUBTITLE BELOW
- 2. @AM_Klaassen A bit about me…
- 3. @AM_Klaassen
- 4. What I do…
- 5. @AM_Klaassen My lovely colleagues
- 6. @AM_Klaassen Conversion rate optimization Analytics Psychology
- 7. @AM_Klaassen
- 8. Lots of A/B-tests
- 9. @AM_Klaassen Adding direct value Learning user behavior
- 10. The challenge of a successful A/B-test program
- 11. You need at least 1 winner in 2 weeks
- 12. Low energy in test team
- 13. Low visibility in the organization
- 14. A/B-test program dies
- 15. We have 1 in 4 significant winners
- 16. Most only 1 in 8
- 17. So you need 4 tests per week
- 18. You need high traffic volumes
- 19. So, we need more winners!
- 20. 2 Solutions
- 21. 1. Improve your test program
- 22. @AM_Klaassen Improve your implementation rate
- 23. @AM_Klaassen Improve your implementation rate
- 24. @AM_Klaassen Improve your implementation rate
- 25. 2. Redefine your winners
- 26. Challenges of Frequentist statistics
- 27. @AM_Klaassen Challenge 1: Hard to Understand In a frequentist test you state a null hypothesis: H0 = Variation A and B have the same conversion rate
- 28. @AM_Klaassen Say for example you did an experiment and the p-value of that test was 0.01. Challenge 1: Hard to Understand http://onlinedialogue.com/abtest-visualization-excel/
- 29. @AM_Klaassen Which statement about the p-value (p=0.01) is true? a) You have absolutely disproved the null hypothesis: that is, there is no difference between the variations b) There is a 1% chance of observing a difference as large as you observed even if the two means are identical c) There’s a 99% chance that B is better than A Challenge 1: Hard to Understand
- 30. @AM_Klaassen Which statement about the p-value is true? a) You have absolutely disproved the null hypothesis: that is, there is no difference between the variations b) There is a 1% chance of observing a difference as large as you observed even if the two means are identical c) There’s a 99% chance that B is better than A Challenge 1: Hard to Understand
- 31. @AM_Klaassen Which statement about the p-value is true? a) You have absolutely disproved the null hypothesis: that is, there is no difference between the variations b) There is a 1% chance of observing a difference as large as you observed even if the two means are identical c) There’s a 99% chance that B is better than A Challenge 1: Hard to Understand
- 32. @AM_Klaassen So the p-value only tells you: How unlikely is it that you found this result, given that the null hypothesis is true (that there is no difference between the conversion rates) Challenge 1: Hard to Understand
- 33. @AM_Klaassen Confused HiPPO
- 34. @AM_Klaassen Challenge 2: Focus on finding proof
- 35. @AM_Klaassen Challenge 2: Focus on finding proof
- 36. @AM_Klaassen Challenge 2: Focus on finding proof
- 37. @AM_Klaassen What’s the alternative? Frequentist statistics Bayesian statistics
- 38. Advantages of Bayesian statistics
- 39. @AM_Klaassen 1. No statistical terminology involved 2. Answers the question directly: ‘what is the probability that variation B is better than A’ Advantage 1: Easy to understand
- 40. @AM_Klaassen Advantage 1: Easy to understand
- 41. @AM_Klaassen Remember…?
- 42. @AM_Klaassen Advantage 1: Easy to understand
- 43. Happy HiPPO
- 44. @AM_Klaassen A test result is the probability that B outperforms A: ranging from 0% - 100% Adv 2: Focus on risk assessment
- 45. @AM_Klaassen Adv 2: Focus on risk assessment 11% 89% Download PDF: ondi.me/change
- 46. Depends on the cost
- 47. @AM_Klaassen Take the cost into account
- 48. @AM_Klaassen Take the cost into account
- 49. @AM_Klaassen Ondi.me/bayes/
- 50. @AM_Klaassen Make a risk assessment IMPLEMENT B PROBABILITY Expected risk 10.9% Expected uplift 89.1% Contribution
- 51. @AM_Klaassen Make a risk assessment
- 52. @AM_Klaassen Make a risk assessment IMPLEMENT B PROBABILITY AVERAGE DROP/UPLIFT Expected risk 10.9% -1.85% Expected uplift 89.1% 5.92% Contribution
- 53. @AM_Klaassen Make a risk assessment IMPLEMENT B PROBABILITY AVERAGE DROP/UPLIFT * EFFECT ON REVENUE Expected risk 10.9% -1.85% - $ 115,220 Expected uplift 89.1% 5.92% $ 370,700 Contribution $ 317,936 * Based on 6 months and an average order value of € 100
- 54. @AM_Klaassen Calculate the ROI IMPLEMENT B BUSINESS CASE Contribution $ 317,936
- 55. @AM_Klaassen Calculate the ROI IMPLEMENT B BUSINESS CASE Contribution $ 317,936 Margin (20%) $ 63,587 Cost of implementation $ 15,000
- 56. @AM_Klaassen Calculate the ROI IMPLEMENT B BUSINESS CASE Contribution $ 317,936 Margin (20%) $ 63,587 Cost of implementation $ 15,000 ROI 424%
- 57. @AM_Klaassen Or the payback period IMPLEMENT B BUSINESS CASE Average CR change 5.00%
- 58. @AM_Klaassen Or the payback period IMPLEMENT B BUSINESS CASE Average CR change 5.00% Extra margin per week $ 2,400 Cost of implementation $ 15,000
- 59. @AM_Klaassen Or the payback period IMPLEMENT B BUSINESS CASE Average CR change 5.00% Extra margin per week $ 2,400 Cost of implementation $ 15,000 Payback period 6.25 weeks
- 60. We still need the scientist
- 61. @AM_Klaassen Adding direct value Learning user behavior
- 62. @AM_Klaassen The cut-off probability for implementation is not the same as the cut-off probability for a learning CHANCE LEARNING? < 70 % No learning 70 – 85 % Indication – need retest to confirm 85 – 95 % Strong indication – need follow-up test to confirm > 95 % Learning We still need the scientist!
- 63. Comparison
- 64. @AM_Klaassen Comparison both methods • 50 A/B-tests, • 50,000 visitors per variation, • conversion rate of 2%, • average order value of $100, • minimum contribution of $150,000 in 6 months time • (equivalent to $30,000 extra margin : ROI of 200%)
- 65. @AM_Klaassen Comparison both methods
- 66. @AM_Klaassen Comparison both methods
- 67. @AM_Klaassen Comparison both methods FREQUENTIST BAYESIAN Implementations 10 29
- 68. @AM_Klaassen Comparison both methods FREQUENTIST BAYESIAN Implementations 10 29 Expected uplift $ 4,682,600 $11,068,800 Expected risk $ 234,130 $ 2,984,800
- 69. @AM_Klaassen Comparison both methods FREQUENTIST BAYESIAN Implementations 10 29 Expected uplift $ 4,682,600 $11,068,800 Expected risk $ 234,130 $ 2,984,800 Risk % 5% 27% Contribution $4,448,470 $9,757,489 Margin (20%) $ 889,974 $1,951,498
- 70. @AM_Klaassen Maximize margin $
- 71. Higher implementation rate
- 72. Maximize revenue and margin
- 73. Happy HiPPO
- 74. Higher energy in test team
- 75. Higher visibility in the organization
- 76. Successful A/B-test program
- 77. THANK YOU! Download PDF: ondi.me/change Bayesian calculator: ondi.me/bayes Slide deck: ondi.me/annemarie @AM_Klaassen annemarie@onlinedialogue.com nl.linkedin.com/in/amklaassen
