
Machine learning principles for on-page conversion optimization and its use in Maxymizely


In our presentation, we explain the difference between simple content A/B testing (also known as a one-tailed test) and the on-page conversion rate optimization method based on the multi-armed bandit algorithm (also known as a two-tailed test). We also describe the possibilities and principles of the automatic on-page conversion optimization used in our service, Maxymizely.com.


  1. Machine learning principles for on-page conversion optimization used in Maxymizely
  2. One-tailed test vs. two-tailed test
     ● A one-tailed test (or simple A/B testing) lets you determine whether one variant is better or worse than another, but not both. The direction must be chosen before the test.
     ● A two-tailed test (or the multi-armed bandit algorithm) lets you determine whether two website variants differ from one another. The direction does not have to be specified before the test: automatic conversion optimization takes into account the possibility of both a positive and a negative effect.
  3. So, which is better to choose?
  4. The limitations of simple A/B testing
     Although A/B testing (also known as the 'one-armed bandit' approach) is a well-known way to mitigate the impact of personal bias, it has some limitations:
     ● While the test is running, users see the 'bad' version 50% of the time.
     ● The method involves a human factor: someone must decide when to stop the test and which version is best.
     ● The method requires many samples and many rounds.
  5. Two-tailed test (the multi-armed bandit principle)
     This is the very multi-armed bandit algorithm used for automatic conversion optimization (also known as machine learning for on-page conversion optimization) :)
  6. Bandit algorithms for on-page conversion rate optimization
  7. The machine learning principle of conversion optimization
     Plainly speaking, it is a two-tailed testing process that consists of many repeated testing rounds. Each round combines an exploration phase and an exploitation phase, and it is this combination that finds the best working balance. In practice, this means that the best-performing page is shown as many times as possible, while the worst-performing pages are hardly shown at all.
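To make the round structure concrete, here is a minimal Python sketch of such a process: a small share of traffic explores all variations equally, while the rest exploits the current best performer. The variation names, counters and helper functions are illustrative assumptions, not Maxymizely's actual implementation.

```python
import random

# Hypothetical per-variation statistics (names are illustrative only).
stats = {
    "variation_A": {"shows": 0, "conversions": 0},
    "variation_B": {"shows": 0, "conversions": 0},
}

def conversion_rate(s):
    # Observed conversion rate with a small prior to avoid division by zero.
    return (s["conversions"] + 1) / (s["shows"] + 2)

def pick_variation(explore_share=0.1):
    """One step of a testing round: explore with a small share of traffic,
    otherwise exploit the best-performing variation seen so far."""
    if random.random() < explore_share:
        return random.choice(list(stats))                            # exploration phase
    return max(stats, key=lambda v: conversion_rate(stats[v]))       # exploitation phase

def record_visit(variation, converted):
    # Update the counters after each visitor so later rounds can re-balance.
    stats[variation]["shows"] += 1
    stats[variation]["conversions"] += int(converted)
```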
  8. A short review of various multi-armed bandit algorithms
     ● Iterative strategies (dynamic programming, Gittins indices, UCB1). Pros: provably optimal strategies. Cons: limited horizon (a short sequence of steps), the computational complexity of the problem as it scales, very slow convergence.
     ● Linear reward-action. Pros: dynamic weight updates, fast convergence. Cons: sensitive to the initial approximation.
     ● Heuristic methods (epsilon-greedy, Softmax, Exp3, etc.). Pros: not sensitive to an increase in the scale of the problem. Cons: more often lead to sub-optimal solutions.
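As one illustration of the iterative strategies listed above, here is a short Python sketch of the UCB1 selection rule. The data structure and function name are assumptions made for this example; only the UCB1 index formula itself comes from the literature.

```python
import math

def ucb1_choice(arms):
    """Pick the arm with the highest UCB1 index:
    mean reward + sqrt(2 * ln(total pulls) / pulls of this arm).
    `arms` is a hypothetical list of dicts with 'pulls' and 'reward' counters."""
    total_pulls = sum(arm["pulls"] for arm in arms)
    # Every arm must be tried once before its index is defined.
    for i, arm in enumerate(arms):
        if arm["pulls"] == 0:
            return i
    def index(arm):
        mean = arm["reward"] / arm["pulls"]
        return mean + math.sqrt(2 * math.log(total_pulls) / arm["pulls"])
    return max(range(len(arms)), key=lambda i: index(arms[i]))
```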
  9. The multi-armed bandit principles used in Maxymizely
     To improve the reliability of results, at Maxymizely we use a combination of the epsilon-greedy and linear reward-action methods.
     1. The exploration phase: equal traffic distribution (10% of all traffic). During the first phase we apply the retraining principles taken from the epsilon-greedy algorithm.
     2. The exploitation phase: the most successful variations get most of the traffic. Our system detects changes in the conversion of each variation and adjusts its weight according to its probability of winning. We also take into account the speed of weight changes to compensate for errors. During the second phase, the algorithm uses the linear reward-action method.
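The sketch below illustrates the two-phase traffic split described on this slide: 10% of the traffic is divided equally among all variations, and the remaining 90% is divided in proportion to each variation's estimated probability of winning. The function and its inputs are placeholders, not Maxymizely's internal API.

```python
def traffic_weights(win_probabilities, explore_share=0.10):
    """Combine an equal exploration share with an exploitation share that is
    proportional to each variation's estimated probability of winning.
    `win_probabilities` is a hypothetical {variation: probability} mapping."""
    n = len(win_probabilities)
    total_prob = sum(win_probabilities.values())
    weights = {}
    for name, p in win_probabilities.items():
        explore = explore_share / n                      # equal split of the 10% exploration traffic
        # If no variation has any evidence yet, split the exploitation share equally too.
        exploit_fraction = p / total_prob if total_prob > 0 else 1.0 / n
        weights[name] = explore + (1.0 - explore_share) * exploit_fraction
    return weights

# Example: a variation judged certain to win ends up with ~92.5% of the traffic.
print(traffic_weights({"A": 1.0, "B": 0.0, "C": 0.0, "D": 0.0}))
```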
  10. The linear reward-action principles
     Linear reward-action is based on the principles of PID controllers and consists of two components: the derivative component (D) captures the speed of change of the probability of winning, and the proportional-integral component (PI) is the current probability of winning the test. To evaluate the probability of winning we use a Bayesian approach; a handy Bayesian calculator can help visualise the estimate of the probability that A beats B given your data. With a small number of attempts the estimate looks like a beta distribution, but on large samples it approaches a normal distribution. Our algorithm retrains up to 4 times a day, providing optimal solutions for maximizing the gained profit. If the algorithm decides that a variation has a 100% chance of winning, after a while that variation will receive 100% of the weight within the 90% exploitation phase in order to maximize your profit.
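A common way to estimate "the probability that A beats B" with a Bayesian approach is Monte Carlo sampling from Beta posteriors over each variation's conversion rate. The sketch below assumes hypothetical show/conversion counts and is offered only as an illustration of that idea, not as the calculator mentioned above.

```python
import random

def prob_a_beats_b(conv_a, shows_a, conv_b, shows_b, samples=100_000):
    """Estimate P(rate_A > rate_B) by sampling from Beta(conversions + 1,
    misses + 1) posteriors for each variation (uniform priors assumed)."""
    wins = 0
    for _ in range(samples):
        rate_a = random.betavariate(conv_a + 1, shows_a - conv_a + 1)
        rate_b = random.betavariate(conv_b + 1, shows_b - conv_b + 1)
        wins += rate_a > rate_b
    return wins / samples

# Example with made-up counts: 30/1000 conversions vs. 22/1000.
print(prob_a_beats_b(30, 1000, 22, 1000))
```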
  11. The limitation of Maxymizely's machine learning algorithm
     In general, our multi-armed bandit outperforms plain A/B testing in terms of how fast conversion increases. The only limitation you may face when launching this method is the amount of traffic you have: you must provide at least 500 unique visitors for every arm of the multi-armed bandit.
  12. A/B testing vs. automatic conversion optimization
     ● Minimize losses: A/B tests send equal amounts of visitors to pages regardless of how well they perform. Maxymizely's bandit algorithm, on the other hand, keeps learning to send visitors to the best-performing page. In this sense, the bandit algorithm has an incontestable advantage over ordinary A/B tests. Bandits also allow you to distinguish relatively similar performance between two versions of a page.
     ● Sample size: multi-armed bandits generally require fewer observations to reach a conclusion at the same level of confidence as an A/B test. The difference lies in the different approaches used to define the 'winner'; both statistical approaches are equally reliable.
     ● Easy to set up: certain conditions in setting up a bandit algorithm are difficult to implement, which is why the method was less popular before modern machine-learning practice took off. While A/B testing has been used for over a century, bandit algorithms saw little use mainly because their calculation time is much longer than that of an ordinary A/B test. Today, however, an iteration of a Bayesian bandit takes less than a second on our computers, which is why this optimization method is regaining popularity. Easy to set up and use!
  13. Thank you for your attention! For more information, visit our website: http://maxymizely.com/
