This document discusses improving digital experiences through digital experimentation. It provides an overview of digital experimentation and its benefits, such as increasing engagement, improving customer experience, and increasing subscriptions. It then discusses building a culture of experimentation and presents a maturity model with five levels to benchmark programs. The rest of the document focuses on operational and performance benchmarks to measure velocity, statistical significance, win rate, variation count, goal count, and test duration to identify areas for improvement and help organizations progress to higher maturity levels.
Digital experimentation is the reliable process of delivering
winning digital experiences without guesswork or risk.
Why Digital Experimentation?
Improve ROI on digital innovations and investments
Test everything and remove guesswork
Gain insight into what your customers really want
Move fast and scale up your development process
What impact can Digital Experimentation make?
Increase engagement with video content on iPlayer: 50% increase in 'next episode' views
Improve customer experience by creating a Digital Center of Excellence: seven-figure saving on call center costs
Increase Instant Ink subscription service sign-ups: 37% increase in enrollment
Examples of Level Behaviours

Category   | Level 1/2 | Level 5
Team       | Limited resources – 10 hours a week or less | 120 hours a week or more – Center of Excellence
Culture    | Driven by front-line staff and limited to one team/department; management may or may not be aware | Entire C-suite is actively involved in experimentation; value is publicly discussed
Strategy   | No roadmap or list of ideas; front-end experiments only | Roadmap that's linked to the customer's strategic roadmap; experimentation across all online touchpoints with customers – front and back end
Technology | No integrations, or analytics only | Analytics & custom API integrations, along with 1st- and 3rd-party data
Start tomorrow with…
Complete the Maturity Model evaluation at Optimizely.com/maturity-model
Establish your program benchmark measures
Leverage Optimizely's Program Management
Set roadmap variables that create unique measures
Review your progress regularly
Webinar 2: What next?
How to start moving your experimentation to the next level.
Tuesday 22nd January 2019, 12:00 GMT / 13:00 CET

Webinar 3: Been there, done that.
Learn how leading brands have taken their digital programs to new heights.
Friday 25th January 2019, 12:00 GMT / 13:00 CET
Welcome, everyone, to the first of three webinars in the Optimism series. Thank you for joining us today.
My name is Jil Maassen. I am the Senior Strategy Consultant here in our EMEA office at Optimizely. I advise our customers post-sales on their digital experimentation strategy and help them optimize their programs.
Before we get started, there are a few housekeeping topics I would like to mention.
The webinar will be recorded and shared with you afterwards. Also, you will receive the slides, so no need to take screenshots and pictures with your phone.
There will be a chance to ask questions at the end of the webinar. You can use your control panel to ask questions throughout, but please keep in mind that these will only be answered at the end.
But now, let’s get started!
Today, we’ll take you through the steps on your journey to implementing a high-performance program that unleashes the full power of experimentation. Understand where you are on your journey. Discover the benchmarks that measure your progress. And get hold of the insightful, actionable ideas that can take your program to the next level.
We will get started with a brief introduction to digital experimentation and the 5 levels on your journey
Then move on to discover the 3 operational and 3 performance metrics that benchmark your progress
Lastly, we will touch on how to start implementing your own high-performance program.
We here at Optimizely like portraying the optimization journey as a mountain. It helps us to visually represent our experimentation maturity curve. It goes from the very beginnings of building an experimentation program all the way to creating a true culture of experimentation at the summit of the mountain.
And this image is so important to us because it represents the journey that many of you are on. We know that this journey can be unclear or daunting at times. And even if your path is clear, there are many organizational challenges you face. Along the way, you’ll need to:
Build an experimentation team
Design an experimentation strategy for your team to implement
Adopt new technologies that allow for you to capitalize on that strategy
You have to create a culture of experimentation so your broader organization can fully harness the power of experimentation.
At the same time that you are on that journey organizationally, you are also learning, managing, and mastering key operational metrics for your program, including:
Velocity, or the number of experiments that you run each month, quarter, year
Learnings, or how often your experiments provide a conclusive result and a learning for you to act on
Complexity of your experiments, including lines of code or number of variations and the impact that has on your experiments
Efficiency, where you are improving your processes and governance to run an impactful program
And because experimentation is a new discipline for many companies, it can be difficult to benchmark yourself and to know how you are doing against other companies. There just isn't a lot of information or data out there for experimentation and optimization programs. But that's where we can help today.
Before we dive deep into the benchmarks, I want to make sure we are all on the same page and talk to you a little about what digital experimentation really is.
Digital experimentation allows you to gain the reliable insight you need to continually fine tune every aspect of the customer journey.
By replacing the guesswork with hard data, you consistently deploy compelling experiences, maintain a competitive advantage and maximize ROI from your digital investments
Experimentation is becoming the way successful companies do business: not only mega brands like Amazon, HP, VISA or Sky, but also the numerous startups for whom experimentation is an intrinsic part of their DNA.
Class-leading companies are using data to increase their understanding of customers and products, then feeding that knowledge back into development as part of a continuous process of non-stop improvement.
For them, just as it can be for you, digital experimentation is the only real way to understand what works more accurately, release it more quickly, and grow more successfully.
Improve ROI: You will only invest in innovations you know will work, rather than waste budget on those that don’t
Test everything, remove guesswork: not only test big ideas but also drive progress through continuous incremental changes. By making decisions based on hard facts, you can forecast the positive impact of new experiences, campaigns or features.
Gain insight into what your customers really want: not only know what customers do online, but also improve their digital experiences. Better online experiences might not only positively influence customer experience and loyalty, but could also have a positive effect on other business processes.
Move fast and scale up: digital experimentation allows for a faster development process from idea to roll-out. De-risking the deployment of new experiments and features gives you amazing speed in the development of new digital experiences – not only in your web experiences, but across your entire technology stack!
BBC
The BBC have a catch-up service for television programs and radio shows called iPlayer that allows people to get up to date on television and radio programs across devices.
They had a goal to increase the amount of time spent and engagement with video content on iPlayer.
Using Digital Experimentation, they tested autoplaying the next episode on iPlayer and were able to increase engagement by 50%
Sky.com
Sky began investing heavily in digital innovation for Sky.com in 2015, and their focus in this area has seen them build a Digital Center of Excellence.
Their focus on providing an excellent customer experience saw them run an experiment that sought to improve the experience with the call center.
As a result, they were able to reduce annual call center costs by a seven-figure sum.
HP
HP were looking to increase the number of sign-ups to their Instant Ink subscription service.
They tested different enrollment offers, including offering a free trial and positioning the service as a printer feature.
They were able to increase enrollment to their subscription service by 37% using Digital Experimentation.
In fact, HP too have built a Center of Excellence that has seen them run almost 500 experiment campaigns and has driven an incremental $21 million in revenue with Optimizely.
You can find more details on these and other stories on the results of Digital Experimentation on the customer and resources pages of our website
Now, let’s look at your program and where you are.
The first step to understanding if you are on the right track is to identify where you currently are.
To help you identify your position in your experimentation journey, we created an online maturity model evaluation - this is a brief online evaluation that measures your team across 4 categories that I spoke about before:
Team – what types of roles do you have working with your experimentation program and how much time do they dedicate to experimentation
Culture – how widely is experimentation embraced within your organization
Strategy – what types of experiments do you run and how complex are those experiments
Technology – what supporting technologies do you use with your experimentation program
And based on your responses, it places you along our maturity curve which, as you saw yesterday, has 5 levels:
Level 1, executional start – this is the first stage, when you are just getting started and harnessing initial learnings from experimentation
Level 2, foundational growth – taking those initial learnings and building a solid foundation with a team and a clear strategy
Level 3, cross-functional advancement – taking those learnings from one team and applying them to others across marketing and product
Level 4, operational excellence – building on your program so that it is operationally strong and finely tuned in each category
Level 5, culture of experimentation – which we've talked about before
You may be wondering, where do our customers fall on this curve?
Based on the 1000s of responses from current customers and other companies, this is the breakdown that we see on our experimentation maturity curve
As you'll see, a majority of our customers are at the start of their journey, in levels 1 and 2, and those that are higher up are the ones that have made significant organizational commitments to experimentation.
What I'd take away from this: if you are lower down, you are actually in a good spot. Lots of companies and organizations are at the start of this journey, and just by being on the curve, you are ahead of the literally thousands of companies that have yet to embrace experimentation.
And if you're higher up, that's awesome, but there is always room for growth. As you heard yesterday, Sky wants to be at level 6, and we are looking forward to working with them and other customers to blow out this end of the curve.
If you have already taken our online evaluation you know where you are, which is great. If not, I'd pay attention to the metrics that relate to where you think you are and also where you want to go.
So what do the four categories of the model look like at each level?
Each category breaks down into several sub-areas that we could spend a whole webinar on, so instead I'll take you through what level 1/2 looks like versus level 5. Remember, level 1/2 is where we've found most people are.
Team:
Companies who are early in their experimentation maturity have limited or part time resources. 10 hours a week or less
By comparison, at level 5 they are spending over 120 hours a week on experimentation and may even have created a Center of Excellence
Culture
At level 1 or 2, culture is driven by the staff who are on the front line and the management team may or may not have an awareness. It’s likely to be limited to one department
At level 5, their whole C-suite is aware and engaged in the value of experimentation, even speaking publicly about the impact it’s having
Strategy
In the early phases of experimentation, it's likely that you won't have a roadmap, or will at most have developed a list of ideas.
Furthermore, your focus will be on the front end, testing things like colour, text and images.
Whereas at level 5 you’ll have a fully developed roadmap that even relates to your customer’s strategic roadmap
As well as front end, you’ll also be experimenting server side and across channels on things like email, mobile and even wearable digital devices
Technology
At level 1 or 2 you’ll have no integrations or perhaps analytics at most
At level 5, on top of analytics, you'll have custom API integrations plus access to 1st and 3rd party data in your experiments to allow advanced targeting and segmentation.
In order to establish a benchmark, we have taken our customers and evaluated them along certain operational metrics.
These operational metrics were identified by us to be the best indicators to identify inefficiencies in our customer programs. But please keep in mind, these are metrics that we have identified to be good indicators. This does not mean that it is an exhaustive list nor does it mean that these are the only ones relevant. It is important for you to identify the best indicators for your organization and program.
No matter where you are on the curve, there are some operational benchmarks that you are going to want to pay close attention to, and that's what we'd like to share now.
We have split them into two categories: performance and complexity benchmarks. We will look at each of the six in detail.
Why are we saying velocity is a deciding factor for maturity?
With experimentation you are exploring the truth: you are trying to find the single best solution for your users. But you never know the outcome of experiments in advance, and I hate to break it to you, but most of them will fail. You will have losses and inconclusive results most of the time. Thus, velocity matters. The more you experiment, the higher the chance of actually determining winners and finding that single source of truth.
Secondly, experimentation is motivating. It motivates people to act. There is no agonizing about perfection in the design phase, or having managers tell you what to do. It is in your hands to test which is the right way to go: build, test, fail or succeed, and learn.
Speaking of learning: experimentation will force you to understand customer behavior; it gives you a feedback loop to learn what really works and what is really needed.
As we go up in maturity, the number of experiments per month goes up. We can see a clear trend here. Of course we are not at 1000 experiments per month with most of our customers yet, however, the trend is going in the right direction.
This might seem low; however, we have filtered the data: we excluded A/A tests, we excluded any tests with below 1,000 visitors, and we excluded any tests that have been running for months and are actually just CMS changes. So we made sure these tests are comparable and not just random entries into our tool.
Ensure you have named all the needed disciplines and responsibilities: it requires developers to build experiments, and it requires individuals who don't just do it as a side project that gets deprioritized all the time. Here it really helps to attach experimentation to large company initiatives.
Look to remove the development and ownership of the platform from a single person; it should be a whole team working on this. On top of that, it should not be something only one team is doing, especially if you are a large corporation.
To increase the number of experiments started per week, you can use features in Optimizely that help reusability (extensions, configuring consistent metrics) and enable quicker decision-making (Stats Accelerator), and you should ensure your organization is adopting experimentation by testing designs, features, etc. prior to development.
We all know that data has become important and that we have a lot of data. However, we have not yet become very good at making business decisions based on data. Thus, we see it as a sign of maturity when a customer's experiments reach stats sig more often.
Furthermore, experiments are the silent voice of customers. If we can get experiments to reach stats sig, we can hear and listen to that silent voice of our customers better. We can become a more customer centric business.
Lastly, our experimentation program has to be efficient. If we can get tests to reach stats sig, we can make decisions more frequently and learn more as well as implement changes. Overall the efficiency of the program will be much better.
This is the percentage of experiments hitting a statistically significant result. We can see a trend for this rate to improve as maturity increases, which means that organizations learn how to set up better experiments and understand what it takes to reach statistical significance.
Something that we are very particular about is metrics. In every experiment you need to determine a primary metric; this is the most important metric when it comes to reaching stats sig. Thus, it is also important to understand why a metric such as Total Revenue or Total Conversion Rate would not be a good metric to use. It is crucial to understand and learn why metrics have not reached statistical significance, and to iterate on those tests.
Estimating your sample size helps you understand when a test is bound to reach stats sig, this is a great sense check. Does it even make sense to run this test? Can I wait this long? And, it puts expectations of speed into perspective.
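As a rough illustration of that sense check, here is a minimal sketch of a classical fixed-horizon sample-size estimate for a conversion-rate test. Note this is generic two-proportion z-test math, not Optimizely's sequential Stats Engine, and the function name and defaults are my own assumptions:

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size_per_variation(baseline_rate, min_detectable_lift,
                              alpha=0.05, power=0.80):
    """Visitors needed per variation to detect a relative lift over
    `baseline_rate` with a classical two-proportion z-test.
    (Illustrative only -- Optimizely's Stats Engine uses sequential
    statistics and different math.)"""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + min_detectable_lift)  # expected variation rate
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided significance
    z_beta = NormalDist().inv_cdf(power)            # desired power
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)

# A 5% baseline conversion rate with a 10% relative lift needs roughly
# 31,000 visitors per variation -- on low traffic, a subtle change may
# never reach significance in an acceptable time frame.
visitors = sample_size_per_variation(0.05, 0.10)
```

Doubling the detectable lift roughly quarters the required sample, which is why bolder changes reach significance so much faster.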
Develop experiments with a higher degree of change in the variation. What do we mean by that? It could be compared with “how obvious the change is”. If you are changing a small grey line somewhere on the website, even in the footer, it won’t be noticed and most likely will not reach stats sig.
Goal setting is crucial for success and also for impact. Customers at higher maturity levels understand the importance of primary metrics better than less mature customers and no longer use metrics such as Revenue or Total Conversion Rate as primary metrics for tests. But they also understand how changing metrics lower on the goal tree will impact large business KPIs.
Understanding what you want to test and why you want to test it in a structured way will help you achieve higher uplifts. There should be a business strategy also for testing to avoid waste of resources.
Using previously learnt data and test results to determine new tests to iterate on will help increase program momentum and this leads to higher uplifts.
Just to define win rate: it is the share of experiments that reached stats sig and had an uplift / positive improvement. We can see here a large difference between low-maturity and high-maturity customers. Once again, this emphasizes the "learning effect" that is so fundamentally important in experimentation.
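That definition is simple arithmetic, but it is worth computing it the same way every time so the benchmark stays comparable across quarters. A minimal sketch (the record format is hypothetical):

```python
def win_rate(results):
    """Share of ALL experiments (not just the conclusive ones) that
    reached statistical significance with a positive uplift."""
    wins = sum(1 for r in results if r["stats_sig"] and r["lift"] > 0)
    return wins / len(results)

quarter = [
    {"stats_sig": True,  "lift":  0.04},  # significant winner
    {"stats_sig": True,  "lift": -0.02},  # significant loser
    {"stats_sig": False, "lift":  0.01},  # inconclusive
    {"stats_sig": False, "lift":  0.00},  # inconclusive
]
# 1 winner out of 4 experiments -> win rate of 0.25
```

Keeping inconclusive tests in the denominator is what ties win rate back to velocity: running more experiments remains the main lever for producing more winners.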
We mentioned degree of change already for stats sig. And it also applies here.
Not testing on 100% of your traffic, but going higher in complexity and targeting your tests towards the right audiences, can also be a good lever for reaching better win rates.
Whatever the result of a test (positive, negative or even inconclusive), the question should be "how can we make the experience even better?". Also, take it one step further and measure how many tests were implemented on the site in the end.
So, we have had a look at 3 performance benchmarks and would now like to look further into the complexity of your program.
More variations are an indication of a change in culture. Companies that are only A/B testing are in one of these three phases:
HiPPO-driven: one person's opinion which needs to be tested / validated
Reactive testing teams: testing code just before it goes live to prevent breaking something
Not understanding the benefits of testing: fewer variations mean less work → A/B tests (so, 2 variations) are just validation tests, not actual experiments
The question should not be "does this change work?" but "which is the best change for our customer?". The aim should be to become a customer-centric organization and re-adopt the notion that "the customer is king" in the digital era as well.
A high number of variations also means you can take more controlled risks, and bigger ones. How? With two variations you need one to be the winner, and making big changes is risky. With more variations there is a good chance of success, and you can try more. It also allows you to test the HiPPO theory on top of others, and it allows for more creativity and innovation amongst teams.
And here we have a bit of a surprise: there seems to be no difference. To us that means this is something that even the higher maturity levels have not yet adopted as well as they could. So, this is your chance to leap ahead, become an early adopter and get it right from the start.
A strong hypothesis session draws out more than one solution. We usually say you should be able to find 10 solutions for every customer problem that you identify and want to test. This large bucket of solutions will then allow you to increase the variation count very easily by testing them against each other. It will also help you move away from validation testing or from using your experimentation tool as a CMS.
Multivariate testing is a good way to provide direction on the variables that best solve the problem. It allows you to try out different things and different combinations at once.
Don't focus just on what your direct customer data is telling you. What are your competitors doing? What do best-in-class experiences look like? Have you asked your customers?
By goal count we mean the number of metrics on the Results page.
The more goals, the more you are trying to understand what is going on, what is being impacted and, especially, why. It shows a stronger test-and-learn focus.
Goals can also be used as a control mechanism. By measuring monitoring goals that are important to business success, such as revenue, on top of the test goals, you can make sure that there is no negative impact from an experiment even when the test itself is positive and provides the desired uplift.
More goals also show that data is taken into account when making decisions. It is much more about the data from each test. Of course, this assumes that when you add goals you also look at them and analyse them.
There is a very linear upward trend here. Maturity and goal count are very much related. It once again shows the learning focus and iteration or analysis focus by the organizations further up on the curve.
Do you always track your business metrics or the KPIs that are important to achieving your business strategy? Do you always track your stepwise goals throughout the funnels which you are testing?
Think past your primary action and business metrics. What other behaviors may be influenced if your primary metric is a success? Those should be measured as well to aid ongoing ideation but also to understand customer behavior and actions even better.
How might different cohorts react in the experiment? Are you accounting for nuances and differences in actions across mobile v. desktop? Loyalty tiers? Give yourself all opportunities to learn.
Testing is about being able to act quickly, especially in product development processes. Here it can be helpful to look at sample size and understand when it makes sense to run a test and when it doesn't, and to filter out in advance those tests that will never reach stats sig because the change is too small or barely visible, or the required time frame is far too long.
By keeping an eye on your test duration you can also inform your roadmap. It can help you prioritize tests not only on impact and effort but also on time.
It will also give program managers an understanding of whether you are running actual tests, rather than using us as a CMS to change parts of the site without actually testing; running indefinite tests is a very immature use of experimentation.
Test duration here is measured in the number of days a test ran for. This graph might not be intuitive, as test duration goes up with maturity. However, it is actually quite logical: more mature organizations run more complex, fine-tuned tests that are usually targeted at specific audiences, which are smaller sub-sets of their traffic, and thus take longer to test and get results on.
Upfront in your test plans state how long you anticipate an experiment running. This gives you a dotted line on when to call something.
The confidence interval from Stats Engine gives you an understanding of where the "true" improvement lies. Are you willing to accept this?
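To make the idea of "where the true improvement lies" concrete, here is a sketch of a classical Wald confidence interval for the absolute difference in conversion rate. Stats Engine computes sequential, always-valid intervals rather than this fixed-horizon version, so treat it purely as an illustration of the concept; the function name is my own:

```python
from math import sqrt
from statistics import NormalDist

def lift_confidence_interval(conv_a, n_a, conv_b, n_b, confidence=0.95):
    """Wald interval for the absolute lift (variation B minus control A).
    Classical fixed-horizon statistics, shown only to illustrate the
    concept -- not Optimizely's sequential Stats Engine."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # e.g. 1.96 for 95%
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    lift = p_b - p_a
    return lift - z * se, lift + z * se

# 50/1000 vs 70/1000: the observed lift is +2 points, but the 95%
# interval still straddles zero -- a reason not to call the test yet.
low, high = lift_confidence_interval(50, 1000, 70, 1000)
```

An interval that excludes zero but whose lower bound is a tiny uplift may still not be worth the cost of implementing the change, which is exactly the "are you willing to accept this?" question.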
Create a consensus approach to when to call tests. What do we mean by that? It is about creating a decision matrix to determine for example when a test is forever inconclusive. This reduces conversations about “what we do in this scenario.”
So, we have now looked at our maturity model, but also 6 benchmark metrics on how you can assess your program against the market. But, what can you do tomorrow?
You could start with evaluating yourself through our maturity model. For that, follow the link or Google Optimizely Maturity Model
Establish your program benchmarks: what metrics do you care about as an organization and what are your numbers for these? How do they compare?
You can also leverage Optimizely program management tool to help you with this kind of analysis
On top of that, it is not enough to set up metrics; it is also crucial to review your progress. And here we do not mean on an annual basis. In order to act fast, it should be something you can monitor easily every month and compare to the month before, to see if you are heading in the right direction or need to change course.
If you want to know more about how to move from one level to the next, join our second webinar of the series on Tuesday January 22nd
But not to forget, we also have a third webinar in the Optimism series, which is going to be super exciting, as we will look at actual real-life examples of how customers have taken their programs to new heights. This will be on Friday 25th January.
But for today, we have reached the Q&A section of the webinar. I will now have a quick look at the questions that have come in so far and try to answer them.
This brings us to the end of the webinar. Thank you very much for listening. I hope you enjoyed it. Feel free to contact me on LinkedIn if you have any questions, and just as a quick reminder: do not forget to sign up for the other two webinars as well.
Goodbye