Experimenting on Humans 
Aviran Mordo 
Head of Back-end Engineering 
@aviranm 
www.linkedin.com/in/aviran 
www.aviransplace.com 
Sagy Rozman 
Back-end Guild master 
@sagyrozman 
www.linkedin.com/in/sagyrozman
Wix In Numbers 
• Over 55M users + 1M new users/month 
• Static storage is > 1.5 PB of data 
• 3 data centers + 3 clouds (Google, Amazon, Azure) 
• 1.5B HTTP requests/day 
• 900 people work at Wix, of which ~300 are in R&D
1542 
(A/B Tests in 3 months)
Agenda 
• Basic A/B testing 
• Experiment driven development 
• PETRI – Wix’s 3rd generation open source experiment 
system 
• Challenges and best practices 
• How to (code samples)
A/B Test
A 
B 
To B or NOT to B?
Home page results 
(How many registered)
Experiment Driven 
Development
This is the Wix editor
Our gallery manager 
What can we improve?
Is this better?
Don’t be a loser
Product Experiments 
Toggles & Reporting 
Infrastructure
How do you know what is running?
Why so many? 
If I “know” it is better, do I really 
need to test it?
The theory 
Sign-up → Choose Template → Edit site → Publish → Premium
Result = Fail
Intent matters
Conclusion 
• EVERY new feature is A/B tested 
• We open the new feature to a % of users 
○ Measure success 
○ If it is better, we keep it 
○ If worse, we check why and improve 
• If flawed, the impact is limited to a % of our users
Start with 50% / 50% ?
Sh*t happens (Test could fail) 
• New code can have bugs 
• Conversion can drop 
• Usage can drop 
• Unexpected cross test dependencies
Minimize affected users 
(in case of failure) 
Gradual exposure (percentage of…) 
• Language 
• GEO 
• Browser 
• User-agent 
• OS 
• Company employees 
• User roles 
• Any other criteria you have 
(extendable) 
• All users
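One way to picture gradual exposure is to derive membership from a stable hash of the user ID, so the exposed population only grows as the percentage is raised. A minimal sketch (the class and method names are hypothetical, not Petri's actual API):

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical sketch: percentage-based gradual exposure.
// A user is exposed if a stable hash of their ID falls below the threshold,
// so raising the percentage only adds users and never removes exposed ones.
public class GradualExposure {

    // Maps a user ID deterministically to a bucket in [0, 100).
    static int bucket(String userId) {
        CRC32 crc = new CRC32();
        crc.update(userId.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % 100);
    }

    // True if the user falls inside the exposed percentage.
    static boolean isExposed(String userId, int percentage) {
        return bucket(userId) < percentage;
    }
}
```

Because the bucket is deterministic, expanding from 25% to 50% keeps every already-exposed user exposed.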
Not all users are equal 
• First time visitors = Never visited wix.com 
• New registered users = Untainted users
We need that 
feature 
…and failure 
is not an 
option
Defensive Testing
Adding a mobile view
First trial failed 
Performance had to be improved
Halting the test results in loss of data. 
What can we do about it?
Solution – Pause the experiment! 
• Maintain NEW experience for already exposed users 
• No additional users will be exposed to the NEW feature
PETRI’s pause implementation 
• Use cookies to persist assignment 
○ If user changes browser assignment is unknown 
• Server side persistence solves this 
○ You pay in performance & scalability
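The pause semantics above can be sketched as follows (hypothetical names; server-side persistence is simplified to an in-memory map): already-assigned users keep the NEW experience, everyone else gets the old one.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of pausing an experiment: already-exposed users keep
// their assigned group; no new users are exposed while the experiment is paused.
public class PausableExperiment {
    // Stands in for server-side persistence (or a per-user cookie).
    private final Map<String, String> assignments = new HashMap<>();
    private boolean paused = false;

    public void pause() { paused = true; }

    public String conduct(String userId) {
        String existing = assignments.get(userId);
        if (existing != null) {
            return existing;            // keep the experience consistent
        }
        if (paused) {
            return "old";               // no additional exposures while paused
        }
        assignments.put(userId, "new"); // simplified: every new user gets "new"
        return "new";
    }
}
```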
Decision 
• Keep feature 
○ Improve code & resume experiment 
• Drop feature 
○ Keep backwards compatibility for exposed users forever? 
○ Migrate users to another equivalent feature 
○ Drop it altogether (users lose data/work)
The road to 
success
Reaching statistical significance 
• Numbers look good but sample size is small 
• We need more data! 
• Expand gradually: 
Control Group (A): 75% → 50% → 25% → 0% 
Test Group (B): 25% → 50% → 75% → 100%
Keep user experience consistent
Keeping persistent UX 
• Signed-in user (Editor) 
○ Test group assignment is determined by the user ID 
○ Guarantees a persistent toss across browsers 
• Anonymous user (Home page) 
○ Test group assignment is randomly determined 
○ Cannot guarantee a persistent experience if the user changes browsers 
• 11% of Wix users use more than one desktop browser
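A deterministic toss for signed-in users can be sketched by hashing the user ID together with the experiment key, so the same user lands in the same group on any browser (a hypothetical helper, not Petri's API):

```java
// Hypothetical sketch: deterministic test-group assignment for signed-in users.
// Hashing userId + experimentKey yields the same group on every browser;
// anonymous users would instead get a random toss persisted in a cookie.
public class DeterministicToss {

    static String assignGroup(String userId, String experimentKey) {
        // Math.floorMod avoids negative buckets from a negative hashCode.
        int bucket = Math.floorMod((userId + ":" + experimentKey).hashCode(), 2);
        return bucket == 0 ? "Group A" : "Group B";
    }
}
```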
There is MORE than one
Possible states >= 2^(# experiments) 

# of active experiments | Possible # of states 
10 | 1,024 
20 | 1,048,576 
30 | 1,073,741,824 

Wix has ~200 active experiments ≈ 1.6 × 10^60 possible states
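These counts are just 2^n for n independent two-group experiments; with BigInteger the figure for ~200 active experiments is easy to reproduce:

```java
import java.math.BigInteger;

// Lower bound on distinct combined user states for n active two-group experiments.
public class ExperimentStates {

    static BigInteger possibleStates(int activeExperiments) {
        return BigInteger.valueOf(2).pow(activeExperiments);
    }
}
```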
A/B testing introduces 
complexity
Support tools 
• Override options (URL parameters, cookies, headers…) 
• Near real time user BI tools 
• Integrated developer tools in the product
Define → Code → Experiment → Expand → Merge code → Close
Define spec 
• Spec = Experiment template (in the code) 
○ Defines test groups 
○ Mandatory limitations (filters, user types) 
○ Scope = group of related experiments (usually by product) 
• Why it is needed 
○ Type safety 
○ Prevents human errors (typos, user types) 
○ Controlled by the developer (who knows the context) 
○ Conducting experiments in batch
Spec code snippet 

public class ExampleSpecDefinition extends SpecDefinition { 
    @Override 
    protected ExperimentSpecBuilder customize(ExperimentSpecBuilder builder) { 
        return builder 
            .withOwner("OWNERS_EMAIL_ADDRESS") 
            .withScopes(aScopeDefinitionForAllUserTypes("SOME_SCOPE")) 
            .withTestGroups(asList("Group A", "Group B")); 
    } 
}
Conducting experiment 
• Experiment = "if" statement in the code 

final String result = laboratory.conductExperiment(key, fallback, new StringConverter()); 
if (result.equals("group a")) { 
    // execute group a's logic 
} else if (result.equals("group b")) { 
    // execute group b's logic 
} 
// If conducting the experiment fails, the fallback value is returned; 
// in that case you would usually execute the 'old' logic.
Upload spec 
• Upload the specs to the Petri server 
○ Enables defining an experiment instance 

{ 
  "creationDate" : "2014-01-09T13:11:26.846Z", 
  "updateDate" : "2014-01-09T13:11:26.846Z", 
  "scopes" : [ { 
    "name" : "html-editor", 
    "onlyForLoggedInUsers" : true 
  }, { 
    "name" : "html-viewer", 
    "onlyForLoggedInUsers" : false 
  } ], 
  "testGroups" : [ "old", "new" ], 
  "persistent" : true, 
  "key" : "clientExperimentFullFlow1", 
  "owner" : "" 
}
Start new experiment (limited population)
Manage experiment states
Ending successful experiment 
1. Convert A/B Test to Feature Toggle (100% ON) 
2. Merge the code 
3. Close the experiment 
4. Remove experiment instance
Experiment lifecycle 
• Define spec 
• Use Petri client to conduct experiment in 
the code (defaults to old) 
• Sync spec 
• Open experiment 
• Manage experiment state 
• End experiment
Petri is more than just an A/B test 
framework 
Feature toggle 
A/B Test 
Internal testing 
Personalization 
Continuous 
deployment 
Jira integration 
Experiments 
Dynamic 
configuration 
QA 
Automated 
testing
Other things we (will) do with Petri 
• Expose features internally to company employees 
• Enable continuous deployment with feature toggles 
• Select assignment by sites (not only by users) 
• Automatic selection of the winning group* 
• Exposing a feature to a set number of users* 
• Integration with Jira 
* Planned feature
Petri is now an open source project 
https://github.com/wix/petri
Q&A 
http://goo.gl/L7pHnd 
https://github.com/wix/petri 
Aviran Mordo 
Head of Back-end Engineering 
@aviranm 
www.linkedin.com/in/aviran 
www.aviransplace.com 
Sagy Rozman 
Back-end Guild master 
@sagyrozman 
www.linkedin.com/in/sagyrozman
Credits 
http://upload.wikimedia.org/wikipedia/commons/b/b2/Fiber_optics_testing.jpg 
http://goo.gl/nEiepT 
https://www.flickr.com/photos/ilo_oli/2421536836 
https://www.flickr.com/photos/dexxus/5791228117 
http://goo.gl/SdeJ0o 
https://www.flickr.com/photos/112923805@N05/15005456062 
https://www.flickr.com/photos/wiertz/8537791164 
https://www.flickr.com/photos/laenulfean/5943132296 
https://www.flickr.com/photos/torek/3470257377 
https://www.flickr.com/photos/i5design/5393934753 
https://www.flickr.com/photos/argonavigo/5320119828
Why Petri 
• Modeled experiment lifecycle 
• Open source (developed using TDD from day 1) 
• Running at scale in production 
• No deployment necessary 
• Both back-end and front-end experiments 
• Flexible architecture
PETRI Server 
Your app 
Laboratory 
DB 
Logs

Advanced A/B Testing at Wix - Aviran Mordo and Sagy Rozman, Wix.com
