Data Natives Berlin v 20.0 | "Serving A/B experimentation platform end-to-end" - Eugene Klyuchnikov

Building the future of experiential travel
Johannes Reck
Serving A/B experimentation platform end-to-end
Eugene Klyuchnikov

Europe’s largest marketplace for travel experiences
● 50k+ products in 150+ countries
● 25M+ tickets sold
● $650M+ in VC funding
● 600+ strong global team
● 150+ traveler nationalities

We make it simple to book and enjoy incredible experiences

Why run A/B tests?
● To validate UX changes
● To estimate the effect
● To understand what our customers like
● To be more objective
● Because we can!
● Because correlation is not causation

Correlation is not causation
- seasonality?
- marketing effect?
- random fluctuations?
In an A/B test, random factors and 3rd-party effects are eliminated, because they affect all variations equally

Architecture
Applications (Application A, Application B, ..., Application Z) -> “Raw” events -> enrichment job -> Enriched events -> experiment summary job -> A/B experiments summary
enrichment job
- filters out office IPs,
- filters out bots and crawlers,
- detects suspicious behavior,
- etc.
experiment summary job
- calculates all relevant metrics for all active experiments
- performs cumulative summarization
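A minimal sketch of the two jobs above, not the platform's actual implementation. The event fields, the office network range, and the bot markers are all hypothetical; suspicious-behavior detection is left out.

```python
from collections import defaultdict
from ipaddress import ip_address, ip_network

# Hypothetical configuration, for illustration only.
OFFICE_NETWORKS = [ip_network("203.0.113.0/24")]
BOT_MARKERS = ("bot", "crawler", "spider")


def is_office_ip(ip):
    """True if the address belongs to one of the (assumed) office networks."""
    addr = ip_address(ip)
    return any(addr in net for net in OFFICE_NETWORKS)


def is_bot(user_agent):
    """Naive user-agent check; real bot detection is more involved."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def enrich(raw_events):
    """Enrichment job: drop office traffic and bots from the raw events."""
    return [
        e for e in raw_events
        if not is_office_ip(e["ip"]) and not is_bot(e["user_agent"])
    ]


def summarize(enriched_events, summary=None):
    """Summary job: add one batch to a running per-experiment, per-variation summary."""
    if summary is None:
        summary = defaultdict(lambda: {"visitors": set(), "bookings": 0})
    for e in enriched_events:
        cell = summary[(e["experiment"], e["variation"])]
        cell["visitors"].add(e["visitor_id"])
        if e["type"] == "booking":
            cell["bookings"] += 1
    return summary
```

Passing the previous `summary` back into `summarize` with the next batch is what makes the summarization cumulative in this sketch.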

Challenge #1
[Diagram: Applications A ... Z -> A/B experiments summary; highlighted: Early monitoring]
● Events are not being sent
● Wrong events are sent
● Events miss some critical information
● Completely imbalanced assignment due to technical issues

Challenge #1
● Kibana
● Don’t care about slight imbalance
● Near real-time monitoring
● All environments
● Immediate feedback for developers

Challenge #2
[Diagram: Applications A ... Z -> A/B experiments summary; highlighted: Experiment planning, Early analysis]
● Imbalanced behavior (too many bots, redirects, etc. on one variation or user group)
● Unreasonably low / high number of visitors
● Suspicious behavior
● Bizarre funnels

Challenge #2
● Looker + common sense
● Number of visitors should match the plan
● Share of total visitors should be stable
● Sometimes cohort analysis

Challenge #3
[Diagram: Applications A ... Z -> A/B experiments summary; highlighted: Daily monitoring, Automatic alerts]
● Statistically imbalanced assignments (sometimes small)
● Non-converging / suspicious uplifts
● Significant changes in the funnel
● Money burn

Challenge #3
● Historical uplift (convergence)
● Assignment balance (chi-sq. test)
● Switchers below the threshold
● Money impact is acceptable
● Guardrail metrics feel good
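The assignment balance check can be sketched as a chi-squared goodness-of-fit test on visitor counts. This is an illustrative version for two variations with an assumed planned split; the function name and the alert threshold in the usage note are hypothetical, not taken from the talk.

```python
import math


def assignment_balance_test(visitors_a, visitors_b, expected_share_a=0.5):
    """Chi-squared goodness-of-fit test of the observed split against the planned one.

    Returns (chi2, p_value). With two variations there is 1 degree of freedom,
    and the chi-squared survival function reduces to erfc(sqrt(chi2 / 2)).
    """
    total = visitors_a + visitors_b
    expected_a = total * expected_share_a
    expected_b = total - expected_a
    chi2 = ((visitors_a - expected_a) ** 2 / expected_a
            + (visitors_b - expected_b) ** 2 / expected_b)
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

For example, `assignment_balance_test(5200, 4800)` gives a chi-squared statistic of 16 and a very small p-value, so a daily alert with a hypothetical threshold such as `p_value < 0.001` would fire even though the split looks only slightly off.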

Grey area #1
● event naming conventions
● event firing conventions
● timing conventions
● event containers
● on- / off-boarding events
● etc. etc.
● Defining the standards
● Regular syncs
● Training
● Documentation

Grey area #2
● rules for stopping experiment
● interpreting the results
● understanding funnel impact
● multidirectional metrics
● multiple comparisons problem
(the dead salmon syndrome)
● etc. etc.
● Defining the standards
● Regular syncs
● Training
● Documentation
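To illustrate the multiple comparisons problem: the chance of at least one false positive grows quickly with the number of metrics or variations tested. The sketch below assumes independent tests and uses the Holm-Bonferroni procedure as one common correction; the talk does not say which correction, if any, is used.

```python
def family_wise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests at level alpha."""
    return 1 - (1 - alpha) ** num_tests


def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected.

    P-values are visited from smallest to largest; the k-th smallest (0-based)
    is compared with alpha / (m - k), and testing stops at the first failure.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            rejected[i] = True
        else:
            break
    return rejected
```

With 20 metrics checked at alpha = 0.05, `family_wise_error_rate(20)` is about 0.64. For p-values `[0.01, 0.04, 0.03, 0.005]`, all four are below 0.05 individually, but `holm_bonferroni` keeps only the first and the last.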

Experiment tooling from end to end
Plan experiment: Sample size tool
● Estimate the duration of an experiment
● Understand the impact of limiting to certain segments on run time
Monitor assignment: Kibana dashboard
● See the number of events in near real time
● Check the assignment balance between A and B
Analyze results: Experiment dashboard
● See the impact of an experiment on success and support metrics
● See the remaining run time till uplift detection
Dig deeper: Experiment funnel analysis
● Configure a funnel and see if an experiment had a significant impact on any of the steps
● Explore from here to add more filters on funnel steps
Get a team overview: Team experiment overview
● See all currently active trials per team and their impact
● Estimate the overall test-over-test contribution of experiments to CR
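A rough sketch of what a sample size tool might compute, assuming a two-sided test on conversion rates with the standard normal-approximation formula. The function names, default alpha and power, and the traffic figures in the usage note are assumptions for illustration, not the tool described in the talk.

```python
import math
from statistics import NormalDist


def sample_size_per_variation(baseline_cr, relative_uplift, alpha=0.05, power=0.8):
    """Visitors needed per variation to detect a relative uplift in conversion rate."""
    p1 = baseline_cr
    p2 = baseline_cr * (1 + relative_uplift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def estimated_duration_days(n_per_variation, daily_visitors, variations=2, segment_share=1.0):
    """Days needed when only `segment_share` of daily visitors is eligible."""
    eligible_per_day = daily_visitors * segment_share
    return math.ceil(n_per_variation * variations / eligible_per_day)
```

With a 5% baseline conversion rate and a 10% relative uplift, this needs roughly 31,000 visitors per variation. At a hypothetical 10,000 visitors per day that is about a week, and limiting the experiment to a segment with half the traffic roughly doubles the run time.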

Experiment analysis
● experiment metadata
● observed vs. expected assignment
● assignment balance
● switchers
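The slides do not define switchers. Assuming the term means visitors who were seen in more than one variation of the same experiment, their share could be computed along these lines; the input format is hypothetical.

```python
from collections import defaultdict


def switcher_share(assignment_events):
    """Share of visitors observed in more than one variation.

    `assignment_events` is an iterable of (visitor_id, variation) pairs
    for a single experiment.
    """
    variations_by_visitor = defaultdict(set)
    for visitor_id, variation in assignment_events:
        variations_by_visitor[visitor_id].add(variation)
    if not variations_by_visitor:
        return 0.0
    switchers = sum(1 for seen in variations_by_visitor.values() if len(seen) > 1)
    return switchers / len(variations_by_visitor)
```

A monitoring check can then compare this share with an agreed threshold.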

Experiment analysis
● success and support metrics
● confidence level
● variation-by-variation comparison
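One way a pairwise comparison of conversion rates could be computed is a pooled two-proportion z-test. This is a sketch under that assumption; the talk does not state which test the dashboard uses, and reporting 1 - p as "confidence" is a common dashboard convention assumed here, not a confirmed detail.

```python
import math
from statistics import NormalDist


def compare_variations(visitors_a, conversions_a, visitors_b, conversions_b):
    """Two-sided pooled z-test for the difference in conversion rate, B versus A."""
    cr_a = conversions_a / visitors_a
    cr_b = conversions_b / visitors_b
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    # With no conversions (or only conversions) the standard error is zero.
    z = 0.0 if se == 0 else (cr_b - cr_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    uplift = cr_b / cr_a - 1 if cr_a > 0 else None
    return {"uplift": uplift, "z": z, "p_value": p_value, "confidence": 1 - p_value}
```

With more than two variations, each pair is compared separately, which is where the multiple comparisons problem comes back in.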

Experiment analysis
● 5 guardrail metrics
● statistical significance is not evaluated
● just to make sure nothing is broken

Experiment analysis
● historical daily uplift
● daily visitors (absolute and %)
● easy to follow the trend
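A historical uplift series of this kind could be built from cumulative counts, so that the curve settles as data accumulates. This is a minimal sketch; the input layout is an assumption for illustration.

```python
def cumulative_uplift_by_day(daily_counts):
    """Relative uplift of B over A, computed on cumulative counts up to each day.

    `daily_counts` is a list of per-day tuples:
    (visitors_a, conversions_a, visitors_b, conversions_b).
    Days on which the uplift is undefined yield None.
    """
    totals = [0, 0, 0, 0]
    series = []
    for day in daily_counts:
        totals = [t + d for t, d in zip(totals, day)]
        visitors_a, conversions_a, visitors_b, conversions_b = totals
        if visitors_a == 0 or visitors_b == 0 or conversions_a == 0:
            series.append(None)
        else:
            cr_a = conversions_a / visitors_a
            cr_b = conversions_b / visitors_b
            series.append(cr_b / cr_a - 1)
    return series
```

A series that keeps swinging between positive and negative values late in the run is the kind of non-converging uplift that the daily monitoring is meant to catch.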

Experiment analysis
● team / company overview
● cumulative impact on CR
● experiments timeline

Q&A
Eugene Klyuchnikov
https://www.linkedin.com/in/eugene-klyuchnikov/
